[Issue 4483] foreach over string or wstring, where element type not specified, does not support unicode

Tue Jan 21 23:43:01 PST 2014

https://d.puremagic.com/issues/show_bug.cgi?id=4483

--- Comment #7 from monarchdodra at gmail.com 2014-01-21 23:42:52 PST ---
(In reply to comment #5)
> (In reply to comment #4)
> Sounds pretty hacky.
> 
> Andrei posted a good alternative solution.
> http://forum.dlang.org/post/j7soe4$2rvt$1@digitalmars.com
> 
> We should also try to use range foreach (.front, .popFront, .empty) because
> decoding is much faster than with the runtime implementation which uses a
> delegate.

Arguably, that's an implementation problem? In particular, front/popFront is
*known* to be a slow iterative process, due to the double decode+stride.

Whenever I need *real* speed for string decoding, *NOTHING* beats
std.utf.decode. And technically, I see no reason the built-in foreach couldn't
*hope* to one day achieve the same speeds.
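
To make the comparison concrete, here is a minimal sketch of the two decoding
styles. The function names viaDecode and viaRange are made up for
illustration; only std.utf.decode and the range primitives come from Phobos.
It shows the shape of the two loops, not a benchmark.

```d
import std.utf : decode;
import std.range.primitives : empty, front, popFront;

// Hypothetical helper: decode every code point with std.utf.decode.
// Each call decodes the sequence starting at s[i] and advances i past it,
// so every code point is examined exactly once.
dchar[] viaDecode(string s)
{
    dchar[] result;
    size_t i = 0;
    while (i < s.length)
        result ~= decode(s, i);
    return result;
}

// Hypothetical helper: the same traversal through the range primitives.
// front decodes the leading code point, then popFront has to work out the
// stride of that same sequence again before it can skip it.
dchar[] viaRange(string s)
{
    dchar[] result;
    for (; !s.empty; s.popFront())
        result ~= s.front;
    return result;
}
```

Both helpers should return the same code points for valid UTF-8 input; the
difference is only in how often each sequence is inspected.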

> After a proper deprecation cycle this could work like so.
> 
>     // iterate over any unicode string
>     foreach (c; "foobar") {}  //dchar
>     foreach (c; "foobar"c) {} //dchar
>     foreach (c; "foobar"w) {} //dchar
>     foreach (c; "foobar"d) {} //dchar
> 
>     // iterate over representation
>     foreach (c; "foobar".rep) {}  //ubyte
>     foreach (c; "foobar"c.rep) {} //ubyte
>     foreach (c; "foobar"w.rep) {} //ushort
>     foreach (c; "foobar"d.rep) {} //uint

Sounds good to me.
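
For reference, a small sketch of how the three kinds of iteration can be
written in D as it stands today. The helper names are hypothetical. The
proposed .rep does not exist; std.string.representation is the existing
Phobos function used here as a stand-in for it.

```d
import std.string : representation;

// Element type left to inference: for a string, the loop runs once per
// UTF-8 code unit. This is the behaviour the issue is about.
size_t countInferred(string s)
{
    size_t n = 0;
    foreach (c; s)
        ++n;
    return n;
}

// Asking for dchar explicitly makes foreach decode whole code points.
size_t countDecoded(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

// Iterating over the raw representation: immutable(ubyte)[] for a string.
size_t countRepresentation(string s)
{
    size_t n = 0;
    foreach (b; s.representation)
        ++n;
    return n;
}

static assert(is(typeof(representation(string.init)) == immutable(ubyte)[]));
```

With "h\u00E9" as input (one ASCII letter plus U+00E9, which takes two UTF-8
code units), the inferred and representation loops each run three times and
the dchar loop runs twice.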

>     // conversion becomes an error
>     foreach (char c; "foobar"c) {}   //error can't convert dchar to char
>     foreach (char c; "foobar"w) {}   //error can't convert dchar to char
>     foreach (char c; "foobar"d) {}   //error can't convert dchar to char
>     foreach (wchar wc; "foobar"c) {} //error can't convert dchar to wchar
>     foreach (wchar wc; "foobar"w) {} //error can't convert dchar to wchar
>     foreach (wchar wc; "foobar"d) {} //error can't convert dchar to wchar
> 
>     // std.utf.transcode for transcoding (lazy iteration)
>     foreach (c; "foobar".transcode!char()) {}   //char
>     foreach (wc; "foobar".transcode!wchar()) {} //wchar
>     // and so on and so forth
> 
>     // further changes
>     "foobar".length // use "foobar".rep.length
>     "foobar"[0]     // use "foobar".rep[0]

Now that just seems excessive to me. Especially the last one. We can't
completely act like a string isn't an array. Plus:

auto c = "foobar"[0];
static assert(is(typeof(c) == char)); //Fails
