Why the hell doesn't foreach decode strings

Jonathan M Davis jmdavisProg at gmx.com
Fri Oct 21 11:38:39 PDT 2011


On Friday, October 21, 2011 11:11 Peter Alexander wrote:
> On 21/10/11 3:26 AM, Walter Bright wrote:
> > On 10/20/2011 2:49 PM, Peter Alexander wrote:
> >> The whole mess is caused by conflating the idea of an array with a
> >> variable
> >> length encoding that happens to use an array for storage. I don't
> >> believe there
> >> is any clean and tidy way to fix the problem without breaking
> >> compatibility.
> > 
> > There is no 'fixing' it, even to break compatibility. Sometimes you want
> > to look at an array of utf8 as 8 bit characters, and sometimes as 20 bit
> > dchars. Someone will be dissatisfied no matter what.
> 
> Then separate those ways of viewing strings.
> 
> Here's one solution that I believe would satisfy everyone:
> 
> 1. Remove the string, wstring and dstring aliases. An array of char
> should be an array of char, i.e. the same as array of byte. Same for
> arrays of wchar and dchar. This way, arrays of T have no subtle
> differences for certain kinds of T.
> 
> 2. Add string, wstring and dstring structs with the following interface:
> 
> a. foreach should iterate as dchar.
> b. @property front() would be dchar.
> c. @property length() would not exist.
> d. @property buffer() returns the underlying immutable array of char,
> wchar etc.
> e. Remove opIndex and co.
> 
> What this does:
> - Makes all array types consistent and intuitive.
> - Makes looping over strings do the expected thing.
> - Provides an interface to the underlying 8-bit chars for those that
> want it.
> 
> 
> Of course, people will still need to understand UTF-8. I don't think
> that's a problem. It's unreasonable to expect the language to do the
> thinking for you. The problem is that we have people that *do*
> understand UTF-8 (like the OP), but *don't* understand D's strings.

In another post in this thread, Walter said, in reference to a post on 
essentially this idea: "Making such a string type would be terribly 
inefficient. It would make D completely uncompetitive for processing strings." 
Now, whether that's true is debatable, but that's his stance on the idea.
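
For anyone trying to picture item 2 of Peter's list, a minimal sketch might
look something like the following. This is only an illustration under some
assumptions of mine: the struct name String, the use of std.utf.decode, and
the range-style empty/popFront members are not part of the proposal, which
only names foreach as dchar, front, buffer, and the absence of length and
opIndex.

```d
import std.utf : decode;

// Hypothetical wrapper type; the name and layout are assumptions.
struct String
{
    private immutable(char)[] data;

    this(immutable(char)[] s) { data = s; }

    // Range primitives, so that foreach iterates decoded code points.
    @property bool empty() { return data.length == 0; }

    // (b) front is a dchar, decoded from the leading code units.
    @property dchar front()
    {
        size_t i = 0;
        return decode(data, i);
    }

    void popFront()
    {
        size_t i = 0;
        decode(data, i);       // advances i past one whole code point
        data = data[i .. $];
    }

    // (d) access to the underlying UTF-8 code units.
    @property immutable(char)[] buffer() { return data; }

    // (c), (e): no length and no opIndex are defined on purpose.
}

void main()
{
    auto s = String("héllo");  // 'é' takes two code units in UTF-8

    size_t codePoints = 0;
    foreach (c; s)             // (a) c is inferred as dchar
    {
        static assert(is(typeof(c) == dchar));
        ++codePoints;
    }

    size_t codeUnits = 0;
    foreach (c; s.buffer)      // plain array: iterates char by char
        ++codeUnits;

    assert(codePoints == 5);
    assert(codeUnits == 6);
}
```

Note that every front and popFront here has to decode, which is presumably
the sort of cost Walter has in mind.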

- Jonathan M Davis

