Why does enumerate over range return dchar, when ranging without returns char?
Jonathan M Davis
newsgroup.d at jmdavisprog.com
Thu May 3 10:09:32 UTC 2018
On Thursday, May 03, 2018 22:00:04 rikki cattermole via Digitalmars-d-learn
wrote:
> On 03/05/2018 9:50 PM, ag0aep6g wrote:
> > On 05/03/2018 07:56 AM, rikki cattermole wrote:
> >>> ```
> >>> import std.stdio;
> >>> import std.range : enumerate;
> >>>
> >>> void main()
> >>> {
> >>> char[] s = ['a','b','c'];
> >>>
> >>> char[3] x;
> >>> auto i = 0;
> >>> foreach(c; s) {
> >>> x[i] = c;
> >>> i++;
> >>> }
> >>>
> >>> writeln(x);
> >>> }
> >>> ```
> >>> Above works without cast.
> >>>
> >>> ```
> >>> import std.stdio;
> >>> import std.range : enumerate;
> >>>
> >>> void main()
> >>> {
> >>> char[] s = ['a','b','c'];
> >>>
> >>> char[3] x;
> >>> foreach(i, c; enumerate(s)) {
> >>> x[i] = c;
> >>> i++;
> >>> }
> >>>
> >>> writeln(x);
> >>> }
> >>> ```
> >
> > [...]
> >
> >> The first example uses auto-decoding (UTF-8 codepoints into a single
> >> UTF-32 one). This is considered a bad thing. But the compiler can
> >> disable it and leave it as UTF-8 code point upon request.
> >
> > The first example (foreach over a char[]) doesn't do any decoding. UTF-8
> > stays UTF-8.
> >
> > Also, a `char` is a UTF-8 code *unit*, not a code *point*.
> >
> >> The second example returns a Voldemort type (means no-name) which
> >> happens to be an input range. Where it can't disable anything and has
> >> been told that it is returning a dchar. See[0] as to where this gets
> >> decoded.
> >
> > This is auto decoding.
> >
> >> Writing two small functions to replace it (and popFront), will
> >> override this behavior.
> >
> > This sounds like you can disable auto decoding by providing your own
> > range primitives in your own module. That doesn't work, because Phobos
> > would still use the ones from std.range.primitives.
>
> Hmm, I swear this used to work.
>
> Oh well, easy fix:
>
> import std.algorithm;
>
> struct Wrapper {
> char[] input;
> alias input this;
>
> @property char front() { return input[0]; }
> @property bool empty() {return input.length == 0;}
> void popFront() { input = input[1 .. $]; }
> }
>
> void main() {
> char[] text = ['1', '2', '3'];
>
> foreach(c; Wrapper(text).filter!(a => a != '\0')) {
> pragma(msg, typeof(c));
> }
> }
The standard way to get around auto-decoding is std.utf.byCodeUnit.
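
As a minimal sketch of what that looks like for the enumerate case from the original question (copyViaEnumerate is just a hypothetical helper name for illustration, not anything in Phobos): wrapping the array in byCodeUnit gives a range whose element type is char, so enumerate hands back char rather than dchar and the assignment compiles without a cast.

```d
import std.range : enumerate;
import std.range.primitives : ElementType;
import std.stdio : writeln;
import std.utf : byCodeUnit;

// Auto-decoding: treated as a range, char[] has element type dchar.
static assert(is(ElementType!(char[]) == dchar));
// Wrapped in byCodeUnit, the element type is char again.
static assert(is(ElementType!(typeof((char[]).init.byCodeUnit)) == char));

// Hypothetical helper for this sketch: copies s code unit by code unit.
char[] copyViaEnumerate(char[] s)
{
    auto x = new char[](s.length);
    foreach (i, c; s.byCodeUnit.enumerate)
    {
        // c is a UTF-8 code unit here, not a decoded code point.
        static assert(is(typeof(c) == char));
        x[i] = c; // no cast needed
    }
    return x;
}

void main()
{
    char[] s = ['a', 'b', 'c'];
    writeln(copyViaEnumerate(s));
}
```

Note that this iterates code units, so a non-ASCII character shows up as several elements. If you actually want code points, leave the decoding in place and store into dchar instead.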
- Jonathan M Davis