Why the hell doesn't foreach decode strings

Alvaro alvaro_segura at gmail.com
Sat Oct 22 03:55:42 PDT 2011


On 20/10/2011 21:37, Martin Nowak wrote:
> It just took me over one hour to find out the unthinkable.
> foreach(c; str) will deduce c to immutable(char) and doesn't care about
> Unicode.
> Now there is so much Unicode transcoding happening in the language that
> it starts to get annoying,
> but the most basic string iteration doesn't support it by default?

Maybe I didn't fully get your point, but, you do know that you can do 
the following, right?

   string str = "Ñandú";
   foreach(dchar c; str)
   ...

and it decodes full Unicode code points just fine. Or maybe you are only
talking about the automatic type inference; I'm just making sure...
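
To make the difference concrete, here is a minimal sketch that counts
loop iterations both ways. The helper names countCodeUnits and
countCodePoints are made up for illustration, and the literal is written
with \u escapes so the result does not depend on how the source file is
saved.

```d
import std.stdio;

// Hypothetical helper: iterate with the inferred element type.
// For a string, c is immutable(char), i.e. one UTF-8 code unit per step.
size_t countCodeUnits(string s)
{
    size_t n = 0;
    foreach (c; s)
        ++n;
    return n;
}

// Hypothetical helper: declare the loop variable as dchar.
// foreach then decodes the UTF-8 and yields one code point per step.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string str = "\u00D1and\u00FA"; // "Ñandú" with precomposed Ñ and ú
    writeln(countCodeUnits(str));   // 7: Ñ and ú take two bytes each
    writeln(countCodePoints(str));  // 5
}
```

So the decoding is there; it just has to be asked for by naming dchar
explicitly.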

OTOH, as others say, it's not rare to iterate over 8-bit code units if
you're dealing with ASCII or if you are parsing and looking for operators
and separators (which are normally single-byte ASCII characters). Then you
can leave the rest untouched or extract the parts in between without
caring whether they take 1 byte per character or several (e.g. when
parsing XML, JSON, CSV, INI, conf files, D source, etc.).
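
A rough sketch of that kind of parsing, assuming a comma-separated line:
the function splitOnAscii is a made-up name, not something from Phobos.
It works on code units because in UTF-8 every byte of a multi-byte
sequence is 0x80 or above, so an ASCII separator can never appear in the
middle of an encoded character.

```d
import std.stdio;

// Hypothetical helper: split a line on an ASCII separator by scanning
// UTF-8 code units. The fields are slices of the input, never decoded.
string[] splitOnAscii(string line, char sep)
{
    assert(sep < 0x80, "separator must be ASCII");
    string[] fields;
    size_t start = 0;
    foreach (i, char c; line)   // i is the code unit index
    {
        if (c == sep)
        {
            fields ~= line[start .. i];
            start = i + 1;
        }
    }
    fields ~= line[start .. $]; // last field, possibly empty
    return fields;
}

void main()
{
    // Fields contain multi-byte characters; only the commas are inspected.
    auto fields = splitOnAscii("\u00D1and\u00FA,caf\u00E9,x", ',');
    writeln(fields.length);
    writeln(fields);
}
```

The slices come out intact whether a field is pure ASCII or not, which
is why decoding by default would only add cost here.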


