TDPL reaches Thermopylae level
Chris Nicholson-Sauls
ibisbasenji at gmail.com
Thu Oct 29 13:29:51 PDT 2009
Justin Johansson wrote:
> Chris Nicholson-Sauls Wrote:
>
>> Andrei Alexandrescu wrote:
>>> Bill Baxter wrote:
>>>> On Mon, Oct 26, 2009 at 11:51 AM, Andrei Alexandrescu
>>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>>> Bill Baxter wrote:
>>>>>> On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier <jeremiep at gmail.com>
>>>>>> wrote:
>>>>>>> Andrei Alexandrescu wrote:
>>>>>>>> 303 pages and counting!
>>>>>>>>
>>>>>>>> Andrei
>>>>>>> Soon the PI level, or at least 10 times PI!
>>>>>>>
>>>>>> A hundred even. ;-)
>>>>> Coming along. I'm writing about strings and Unicode right now. I was
>>>>> wondering what people think about allowing concatenation (with ~ and
>>>>> ~=) of
>>>>> strings of different character widths. The support library could do
>>>>> all of
>>>>> the transcoding.
>>>>>
>>>>> (I understand that concatenating an array of wchar or char with a
>>>>> dchar is
>>>>> already in bugzilla.)
>>>> So a common way to convert wchar to char might then become
>>>> ""~myWcharString?
>>>>
>>>> That seems kind of odd.
>>> Well, I guess. In particular, to me it's not clear what type we should
>>> assign to a concatenation between a string and a wstring. With ~=, it's
>>> much easier...
>>>
>> My intuition would be to expect the same as adding an int to a byte: you get an int.
>> Concatenating a string and a wstring should yield a wstring; i.e., encode to the wider of
>> the two types.
>>
>> -- Chris Nicholson-Sauls
>
> Though I'm sure Shannon would say that the number of bits of intrinsic information
> contained in the same sequence of Unicode codepoints is exactly the same whether
> it be encoded as a string or a wstring. Accordingly my intuition is that some rule
> based upon left-to-right associativity would be more apt. You could then concatenate
> a wstring (on the rhs) to an empty string (on the lhs) to convert the wstring to a string
> or vice versa.
>
> Cheers
> Justin Johansson
>
Granted, LTR is common enough to be expected and accepted. To be perfectly honest, I
don't believe I have *ever* even used wchar/wstring. char/string, gosh yes; dchar/dstring
quite a bit as well, where I need the simplicity; but I've yet to feel much need for the
"weirdo" middle child of UTF.
I would argue that string ~ wstring returning string is fine, but I would suggest it trigger
a warning for those like myself who might have first guessed it would "upscale to fit".
Just so long as the foreach(dchar;string) trick is still around, char/string can cover an
awful lot of ground.
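
For anyone who has not seen the trick, here is a minimal sketch of it. The helper
name countCodePoints is made up for illustration; the only language feature relied on
is that foreach with a dchar loop variable decodes a char[] one code point at a time.

```d
import std.stdio : writeln;

// Hypothetical helper: count code points by letting foreach decode
// the UTF-8 code units of a string into dchar values.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)   // one iteration per decoded code point
        ++n;
    return n;
}

void main()
{
    string s = "h\u00E9llo";            // U+00E9 needs two UTF-8 code units
    assert(s.length == 6);              // .length counts code units
    assert(countCodePoints(s) == 5);    // foreach(dchar) counts code points
    writeln(countCodePoints(s));
}
```

So the storage stays compact UTF-8, and the dchar view is available whenever
per-character logic is needed.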
All that said, though, I don't think I would ever use ""~wstring as a means of conversion.
It just feels like "there wasn't any other way to do this, so here's a cheap hack" --
which just isn't the case.
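
As a sketch of the explicit routes I have in mind, assuming a Phobos that provides
std.conv.to and std.utf.toUTF8 / toUTF16 (the helper name narrow is made up for
illustration):

```d
import std.conv : to;
import std.utf : toUTF8, toUTF16;

// Hypothetical helper: explicit UTF-16 to UTF-8 conversion,
// with no concatenation involved.
string narrow(wstring w)
{
    return to!string(w);
}

void main()
{
    wstring w = "h\u00E9llo"w;
    string s = narrow(w);

    assert(s == "h\u00E9llo");
    assert(toUTF8(w) == s);     // same result via std.utf
    assert(toUTF16(s) == w);    // and back again
    assert(w.length == 5);      // five UTF-16 code units
    assert(s.length == 6);      // six UTF-8 code units
}
```

Either call states the intent outright, which ""~wstring would not.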
-- Chris Nicholson-Sauls