Making all strings UTF ranges has some risk of WTF

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Thu Feb 4 10:29:12 PST 2010


Michel Fortin wrote:
> On 2010-02-04 12:19:42 -0500, Andrei Alexandrescu 
> <SeeWebsiteForEmail at erdani.org> said:
> 
>> bearophile wrote:
>>> Simen kjaeraas:
>>>> Of the above, I feel (b) is the correct solution, and I understand
>>>> it has already been implemented in svn.
>>>
>>> Yes, I presume he was mostly looking for a justification of ideas
>>> he has already accepted and even partially implemented :-)
>>
>> I am ready to throw away the implementation as soon as a better idea 
>> comes around. As on other occasions, I made the change to see how 
>> things feel with the new approach.
> 
> Has any thought been given to foreach? Currently all these work for 
> strings:
> 
>     foreach (c; "abc") { } // typeof(c) is 'char'
>     foreach (char c; "abc") { }
>     foreach (wchar c; "abc") { }
>     foreach (dchar c; "abc") { }
> 
> I'm concerned about the first case, where the element type is implicit. 
> The implicit element type is (currently) the code unit. If ranges 
> use code points ('dchar') as the element type, then I think foreach needs 
> to be changed so that the default element type is 'dchar' too (in the 
> first line of my example). Having ranges and foreach disagree on this 
> would be very inconsistent. Of course you should be allowed to iterate 
> using 'char' and 'wchar' too.
> 
> I think this would fit nicely. I was surprised when first learning D 
> to find that foreach didn't do this and that I had to explicitly ask 
> for it.

This is a good point. I'm in favor of changing the language to make the 
implicit type dchar.
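
To make the difference concrete, here is a minimal sketch of the behavior
Michel describes, assuming the current rule where the implicit foreach
element type over a string is the code unit. The helper names countUnits
and countPoints are made up for illustration and are not part of Phobos.

```d
import std.stdio;

// Hypothetical helper: iterate with the implicit element type.
// Under the current rule this visits UTF-8 code units (char).
size_t countUnits(string s)
{
    size_t n;
    foreach (c; s) { ++n; }
    return n;
}

// Hypothetical helper: ask for dchar explicitly.
// foreach decodes the string and visits code points.
size_t countPoints(string s)
{
    size_t n;
    foreach (dchar c; s) { ++n; }
    return n;
}

void main()
{
    // "\u00E9" (e with acute accent) takes two code units in UTF-8.
    string s = "h\u00E9llo";
    writeln(countUnits(s));  // prints 6
    writeln(countPoints(s)); // prints 5
}
```

With the proposed change, the first loop would presumably behave like the
second unless 'char' or 'wchar' is requested explicitly.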

Andrei