TDPL reaches Thermopylae level

Chris Nicholson-Sauls ibisbasenji at gmail.com
Thu Oct 29 13:29:51 PDT 2009


Justin Johansson wrote:
> Chris Nicholson-Sauls Wrote:
> 
>> Andrei Alexandrescu wrote:
>>> Bill Baxter wrote:
>>>> On Mon, Oct 26, 2009 at 11:51 AM, Andrei Alexandrescu
>>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>>> Bill Baxter wrote:
>>>>>> On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier <jeremiep at gmail.com>
>>>>>> wrote:
>>>>>>> Andrei Alexandrescu wrote:
>>>>>>>> 303 pages and counting!
>>>>>>>>
>>>>>>>> Andrei
>>>>>>> Soon the PI level, or at least 10 times PI!
>>>>>>>
>>>>>> A hundred even. ;-)
>>>>> Coming along. I'm writing about strings and Unicode right now. I was
>>>>> wondering what people think about allowing concatenation (with ~ and ~=)
>>>>> of strings of different character widths. The support library could do
>>>>> all of the transcoding.
>>>>>
>>>>> (I understand that concatenating an array of wchar or char with a dchar
>>>>> is already in bugzilla.)
>>>> So a common way to convert wchar to char might then become
>>>> ""~myWcharString?
>>>>
>>>> That seems kind of odd.
>>> Well, I guess. In particular, to me it's not clear what type we should
>>> assign to a concatenation between a string and a wstring. With ~=, it's
>>> much easier...
>>>
>> My intuition would be to expect the same as adding an int to a byte: you get an int. 
>> Concatenating a string and a wstring should yield a wstring; i.e., encode to the wider of 
>> the two types.
>>
>> -- Chris Nicholson-Sauls
> 
> Though I'm sure Shannon would say that the number of bits of intrinsic information
> contained in the same sequence of Unicode codepoints is exactly the same whether
> it be encoded as a string or a wstring.  Accordingly my intuition is that some rule
> based upon left-to-right associativity would be more apt.  You could then concatenate
> a wstring (on the rhs) to an empty string (on the lhs) to convert the wstring to a string
> or vice versa.
> 
> Cheers
> Justin Johansson
> 

Granted, LTR is common enough to be expected and accepted.  To be perfectly honest, I 
don't believe I have *ever* even used wchar/wstring.  char/string, gosh yes; dchar/dstring 
quite a bit as well, where I need the simplicity; but I've yet to feel much need for the 
"weirdo" middle child of UTF.

I would argue that string ~ wstring returning string is fine, but would suggest it trigger a 
warning for those like myself who might have first guessed it would "upscale to fit". 
Just so long as the foreach(dchar;string) trick is still around, char/string can cover an 
awful lot of ground.
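
For anyone unfamiliar with the foreach trick, here is a minimal sketch of what I mean. The helper name countCodePoints is made up for illustration; the only language feature relied on is that foreach decodes UTF-8 when the loop variable is declared as dchar.

```d
// Hypothetical helper: counts code points in a UTF-8 string.
size_t countCodePoints(string s)
{
    size_t n = 0;
    // Declaring the loop variable as dchar makes foreach decode
    // the UTF-8 code units into whole code points on the fly.
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string s = "na\u00EFve";            // U+00EF takes two UTF-8 code units
    assert(s.length == 6);              // .length counts code units
    assert(countCodePoints(s) == 5);    // the loop counts code points
}
```

So a plain string can still be walked one code point at a time without ever storing it as a dstring.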

All that said, though, I don't think I would ever use ""~wstring as a means of conversion. 
It just feels like "there wasn't any other way to do this, so here's a cheap hack" -- 
which just isn't the case.
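
As a rough sketch of the explicit alternative, assuming Phobos' std.utf is available: toUTF8 and toUTF16 transcode between the string types directly. The wrapper names narrow and widen below are invented for this example only.

```d
import std.utf : toUTF8, toUTF16;

// Hypothetical wrappers around the std.utf transcoding functions.
string  narrow(wstring w) { return toUTF8(w);  }
wstring widen(string s)   { return toUTF16(s); }

void main()
{
    wstring w = "caf\u00E9"w;       // 4 UTF-16 code units
    string  s = narrow(w);          // explicit wstring -> string

    assert(w.length == 4);
    assert(s.length == 5);          // U+00E9 is two UTF-8 code units
    assert(s == "caf\u00E9");
    assert(widen(s) == w);          // and back again
}
```

The conversion is spelled out at the call site, which is the part ""~wstring would hide.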

-- Chris Nicholson-Sauls
