string is rarely useful as a function argument
Sean Kelly
sean at invisibleduck.org
Sat Dec 31 08:47:40 PST 2011
I don't know that Unicode expertise is really required here anyway. All one has to know is that UTF-8 is a multibyte encoding and that the built-in string attributes talk in bytes. Knowing when one wants bytes vs. characters isn't rocket science. That said, I'm on the fence about this change. It breaks consistency for a benefit I'm still weighing. With this change, the char type will still be a single byte, correct? What happens to foreach on strings?
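For illustration, here is a minimal sketch of the bytes-vs-characters distinction as D strings behave today, not under the proposed change. The helper names countCodeUnits and countCodePoints are made up for this example; the only library call is walkLength from std.range.

```d
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical helper: counts UTF-8 code units (bytes) by iterating as char.
size_t countCodeUnits(string s)
{
    size_t n;
    foreach (char c; s)   // char iteration visits each byte, no decoding
        ++n;
    return n;
}

// Hypothetical helper: counts code points by iterating as dchar.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)  // dchar iteration decodes multibyte sequences
        ++n;
    return n;
}

void main()
{
    // U+00E9 (e with acute accent) occupies two bytes in UTF-8.
    string s = "h\u00E9llo";

    writeln(s.length);            // built-in attribute: counts bytes
    writeln(s.walkLength);        // range view: counts code points
    writeln(countCodeUnits(s));   // same count as s.length
    writeln(countCodePoints(s));  // same count as s.walkLength
}
```

The element type named in the foreach decides whether the loop sees raw bytes or decoded code points, which is why the foreach question matters for any change to string.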
Sent from my iPhone
On Dec 31, 2011, at 8:20 AM, Timon Gehr <timon.gehr at gmx.ch> wrote:
> On 12/31/2011 03:17 PM, Michel Fortin wrote:
>>
>> As for Walter being the only one coding by looking at the code units
>> directly, that's not true. All my parser code looks at code units
>> directly and only decode to code points where necessary (just look at
>> the XML parsing code I posted a while ago to get an idea to how it can
>> apply to ranges). And I don't think it's because I've seen Walter code
>> before, I think it is because I know how Unicode works and I want to
>> make my parser efficient. I've done the same for a parser in C++ a while
>> ago. I can hardly imagine I'm the only one (with Walter and you). I
>> think this is how efficient algorithms dealing with Unicode should be
>> written.
>>
>
> +1.