Why the hell doesn't foreach decode strings

Jonathan M Davis jmdavisProg at gmx.com
Thu Oct 20 12:58:20 PDT 2011


On Thursday, October 20, 2011 21:37:56 Martin Nowak wrote:
> It just took me over one hour to find out the unthinkable.
> foreach(c; str) will deduce c to immutable(char) and doesn't care about
> unicode.
> Now there is so much unicode transcoding happening in the language that it
> starts to get annoying,
> but the most basic string iteration doesn't support it by default?

Walter won't change it, because it would silently change too much code. Now,
I'm willing to bet that in 99.9999999% of cases, it would _fix_ the code rather
than break it, but still, he won't do it. However, the behavior _is_ completely
consistent with the rest of the language: it's the range-based primitives in
Phobos (front, popFront, and friends) that decode arrays of char or wchar into
dchar, not the language itself. And it _would_ be inconsistent with every other
use of foreach over arrays for char and wchar arrays alone to be iterated over
as ranges of dchar. But still, it's a bug waiting to happen which doesn't
really benefit anyone.
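To illustrate the difference concretely (a quick sketch; ElementType here is
the one from std.range, and the program just prints the numeric value of
whatever you get per iteration):

    import std.range;
    import std.stdio;

    void main()
    {
        string str = "résumé"; // contains multi-byte UTF-8 sequences

        // A bare foreach iterates the array's element type: immutable(char),
        // i.e. raw UTF-8 code units. Each 'é' shows up as two code units.
        foreach(c; str)
            writefln("%s: %s", typeof(c).stringof, cast(uint) c);

        // The range primitives decode instead: a string is a range of dchar.
        static assert(is(ElementType!string == dchar));

        // Asking foreach for dchar explicitly makes it decode as well.
        foreach(dchar c; str)
            writefln("dchar: %s", cast(uint) c);
    }

So std.algorithm and friends see a string as a range of dchar, while a bare
foreach sees the underlying code units.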

I've suggested that there should be a warning when code uses foreach over an
array of char or wchar without specifying the iteration type
(http://d.puremagic.com/issues/show_bug.cgi?id=4483). That way, you can specify
char or wchar if you really want code units, but anyone who forgets to
explicitly use dchar (or doesn't realize that they should) is warned. But that
hasn't been implemented yet, and I don't believe that Walter has voiced his
opinion on it.
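To be concrete, something along these lines is what the warning would catch
(f is just a placeholder):

    void f(string str)
    {
        foreach(c; str)       {} // iteration type not specified - this would warn
        foreach(char c; str)  {} // code units requested explicitly - no warning
        foreach(dchar c; str) {} // decoded code points requested explicitly - no warning
    }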

- Jonathan M Davis

