Why the hell doesn't foreach decode strings

Jonathan M Davis jmdavisProg at gmx.com
Thu Oct 20 19:37:34 PDT 2011


On Thursday, October 20, 2011 19:26:43 Walter Bright wrote:
> On 10/20/2011 2:49 PM, Peter Alexander wrote:
> > The whole mess is caused by conflating the idea of an array with a
> > variable length encoding that happens to use an array for storage. I
> > don't believe there is any clean and tidy way to fix the problem
> > without breaking compatibility.
> There is no 'fixing' it, even to break compatibility. Sometimes you want to
> look at an array of utf8 as 8 bit characters, and sometimes as 20 bit
> dchars. Someone will be dissatisfied no matter what.
> 
> There is no way to program strings in D without being aware of UTF-8
> encoding.

True, but if the default were dchar, then the common case would have fewer 
bugs (while still allowing you to explicitly use char or wchar when you want 
to). At minimum, I think that it would be a good idea to implement 
http://d.puremagic.com/issues/show_bug.cgi?id=6652 and make it a warning not 
to explicitly give the element type in a foreach over an array of char or 
wchar. It would catch bugs without changing the behavior of any existing 
code, and it would still allow you to iterate over either code units or code 
points.
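
A minimal sketch of the difference, for illustration only. The function names 
countCodeUnits and countCodePoints are made up for this example. With no 
explicit type, foreach over a string infers char and walks UTF-8 code units; 
with an explicit dchar, it decodes to code points. Under the warning proposed 
above, the first loop would presumably be flagged until it was written as 
foreach (char c; s) or foreach (dchar c; s).

```d
import std.stdio;

// Hypothetical helper: iterate with the inferred element type.
// For a string, c is a char, i.e. one UTF-8 code unit per iteration.
size_t countCodeUnits(string s)
{
    size_t n = 0;
    foreach (c; s)
        ++n;
    return n;
}

// Hypothetical helper: iterate with an explicit dchar.
// foreach decodes the UTF-8 and yields one code point per iteration.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    // U+00E9 (e with acute accent) takes two code units in UTF-8.
    string s = "h\u00E9llo";
    writeln(countCodeUnits(s));   // 6
    writeln(countCodePoints(s));  // 5
}
```

Code that assumes one iteration per character works for ASCII-only input with 
either loop, which is why the bug in the first form tends to go unnoticed.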

- Jonathan M Davis

