TDPL reaches Thermopylae level

Mon Oct 26 16:45:56 PDT 2009

On Mon, Oct 26, 2009 at 4:05 PM, Jeremie Pelletier <jeremiep at gmail.com> wrote:
> Andrei Alexandrescu wrote:
>>
>> Jeremie Pelletier wrote:
>>>
>>> Andrei Alexandrescu wrote:
>>>>
>>>> Bill Baxter wrote:
>>>>>
>>>>> On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier <jeremiep at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Andrei Alexandrescu wrote:
>>>>>>>
>>>>>>> 303 pages and counting!
>>>>>>>
>>>>>>> Andrei
>>>>>>
>>>>>> Soon the PI level, or at least 10 times PI!
>>>>>>
>>>>>
>>>>> A hundred even. ;-)
>>>>
>>>> Coming along. I'm writing about strings and Unicode right now. I was
>>>> wondering what people think about allowing concatenation (with ~ and ~=) of
>>>> strings of different character widths. The support library could do all of
>>>> the transcoding.
>>>>
>>>> (I understand that concatenating an array of wchar or char with a dchar
>>>> is already in bugzilla.)
>>>>
>>>>
>>>> Andrei
>>>
>>> I don't know if thats a good idea, its better when string encoding is
>>> explicit so you know where your reallocations are.
>>
>> The beauty of it is that reallocation with ~ occurs anyway, and with ~= is
>> anyway imminent, regardless of the character width you're reallocating.
>>
>> Allowing concatenation of strings of different widths is a nice way of
>> acknowledging at the language level that all character widths are encodings
>> of abstract characters.
>>
>>> ie if I know some routine will have to convert a utf16 parameter to utf8
>>> to append it to a string, then ill try and either make it output utf16 or
>>> input utf8. If its implicit its much harder to find and optimize these
>>> cases.
>>>
>>> to!string() is easy enough to use anyways.
>>>
>>> But it could be good to add a range type that does this with multiple
>>> opAppend/opAppendAssign overloads.
>>
>> One problem with
>>
>> s ~= to!string(someDstring);
>>
>> is that it does two allocations instead of one.
>>
>>
>> Andrei
>
> Good points, I didn't think of the separation between characters and
> encodings or the extra allocation from to.
>
> You have my vote for this feature then!
>
> Jeremie
>

Yeah, me too.  Saving an allocation is good.  And I agree that having
~= do a conversion is much more useful than just getting an error.
It's one of those things you might try just hoping it will work, and
it's always nice when something like that does just what you hope it
will.

I guess the only other thing I could worry about is that in generic
array code it might cause someone headaches that for some T[], T[]
~= S[] is legal and the length of the result is not the sum of the
lengths of the inputs.  But I can't think of any real situation where
that would cause trouble.
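
To make the length point concrete, here is a minimal sketch in D. The
mixed-width ~= is only being proposed in this thread, so the sketch
goes through to!string instead; appendTranscoded is a hypothetical
helper name used just for illustration, not anything in the library.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper: appends a UTF-32 string to a UTF-8 string by
// transcoding explicitly. to!string allocates a temporary string, and
// ~= then appends it, which is the two-step form quoted above.
string appendTranscoded(string s, dstring d)
{
    s ~= to!string(d);
    return s;
}

void main()
{
    dstring wide = "héllo"d;   // 5 UTF-32 code units
    string result = appendTranscoded("ab", wide);

    // 'é' needs two UTF-8 code units, so the result has 2 + 6 = 8
    // code units rather than 2 + 5 = 7.
    writeln(result.length);
    assert(result.length == 8);
    assert(result.length != 2 + wide.length);
}
```

So generic code that assumes (a ~ b).length == a.length + b.length
would only hold up when both sides have the same element type.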

--bb