First Impressions

Sean Kelly sean at f4.ca
Sun Oct 1 10:49:26 PDT 2006


Walter Bright wrote:
> Sean Kelly wrote:
>> Walter Bright wrote:
>>>
>>> Contrast that with C++, which has no usable or portable support for 
>>> UTF-8, UTF-16, or any Unicode. All your carefully coded use of 
>>> std::string needs to be totally scrapped and redone with your own 
>>> custom classes, should you decide your app needs to support unicode.
>>
>> As long as you're aware that you are working in UTF-8 I think 
>> std::string could still be used.  It just may be strange to use 
>> substring searches to find multibyte characters with no built-in 
>> support for dchar-type searching.
> 
> It's so broken that there are proposals to reengineer core C++ to add 
> support for UTF types.
> 
> 1) implementation-defined whether a char is signed or unsigned, so 
> you've got to cast the result of any string[i]

Oops, forgot about this.
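
To illustrate the signedness point, here is a minimal sketch of the cast in question. The helper names byte_at and is_non_ascii are hypothetical, chosen for illustration only. Where plain char is signed, s[i] for a UTF-8 lead or continuation byte is negative, so comparisons against values such as 0x80 quietly fail unless the byte first goes through unsigned char.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the byte at index i as a value in 0..255,
// whether or not plain char is signed on this implementation.
inline unsigned byte_at(const std::string& s, std::string::size_type i) {
    assert(i < s.size());
    return static_cast<unsigned char>(s[i]);
}

// Hypothetical helper: true if the byte at index i belongs to a multibyte
// UTF-8 sequence (lead or continuation byte), i.e. it is not plain ASCII.
inline bool is_non_ascii(const std::string& s, std::string::size_type i) {
    return byte_at(s, i) >= 0x80;
}
```

Without the cast, an expression like s[i] >= 0x80 is always false on a signed-char implementation, which is exactly the kind of portability trap described above.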

> 2) none of the iteration, insertion, appending, etc., operations can 
> handle multibyte

True.  And I hinted at this above.
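
As a rough sketch of what "substring searches to find multibyte characters" ends up looking like, the code below hand-rolls the pieces std::string does not provide. All three function names are hypothetical, and the code assumes the string already holds valid UTF-8. It does no validation of the input string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode one Unicode code point as a UTF-8 byte sequence.
inline std::string encode_utf8(std::uint32_t cp) {
    // Reject values outside the Unicode range and UTF-16 surrogates.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: byte offset (not character index) of the first
// occurrence of cp in a valid UTF-8 string, or std::string::npos.
inline std::string::size_type find_code_point(const std::string& utf8,
                                              std::uint32_t cp) {
    return utf8.find(encode_utf8(cp));
}

// Hypothetical helper: number of code points in a valid UTF-8 string,
// counted as the bytes that are not continuation bytes (10xxxxxx).
inline std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char b : utf8) {
        if ((b & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

The substring search is safe only because UTF-8 is self-synchronizing: a valid encoded character cannot match in the middle of another one. Note that size() still reports bytes, and the offset returned is a byte offset, so every caller has to keep the two units straight by hand.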

> 3) no UTF conversion or transliteration
> 
> 4) C++ source text encoding is implementation-defined, so no using UTF 
> characters in source code (have to use \u or \U notation)

Personally, I see this as a language deficiency more than a deficiency 
in std::string.  std::string is really just a vector with some search 
capabilities thrown in.  It's not that great for a string class, but it 
works well enough as a general sequence container.  And it will work a 
tad better once they impose the same data contiguity guarantee that 
vector has (I believe that's one of the issues set to be resolved for 0x).
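
For what the contiguity guarantee buys in practice, here is a small sketch under the assumption that the guarantee holds (it does from C++11 onward). The function name copy_from_c_buffer is hypothetical. With contiguous storage, &s[0] can be handed to C-style APIs as a writable buffer, just as with vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: fill a std::string from a raw buffer by writing
// directly into its storage. This relies on the string's characters being
// stored contiguously, which is the guarantee discussed above.
inline std::string copy_from_c_buffer(const char* src, std::size_t n) {
    assert(src != nullptr || n == 0);
    std::string s(n, '\0');
    if (n != 0) {
        std::memcpy(&s[0], src, n);  // writes n bytes starting at &s[0]
    }
    return s;
}
```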

Overall, I do agree with you.  Though I suppose that's obvious as I'm a 
former C++ advocate who now uses D quite a bit :-)


Sean


