Range functions expand char to dchar

anonymous via Digitalmars-d digitalmars-d at puremagic.com
Tue Sep 8 11:21:33 PDT 2015


On Tuesday 08 September 2015 19:52, Matt Kline wrote:

> An example:
> 
> import std.algorithm;
> import std.range;
> import std.stdio;
> import std.regex;
> 
> void main()
> {
>      // One would expect this to be a range of chars
>      auto test = chain("foo", "bar", "baz");
>      // prints "dchar"
>      writeln(typeid(typeof(test.front)));
> 
>      auto arr = ["foo", "bar", "baz"];
>      auto joined = joiner(arr, ", ");
>      // Also "dchar"
>      writeln(typeid(typeof(joined.front)));
> 
>      // Problems ensue if one assumes the result of joined is a char string.
>      auto r = regex(joined);
>      matchFirst("won't compile", r); // Compiler error
> }
> 
> Whether by design or by oversight,

By design, with regrets:
http://forum.dlang.org/post/m01r3d$1frl$1@digitalmars.com

> this is quite undesirable. It 
> violates the principle of least astonishment (one wouldn't expect 
> joining a bunch of strings would result in a dstring),

Strictly speaking, the result is a range of dchars, not a dstring.
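
To make that distinction concrete, here is a minimal sketch (the checks are mine, not from the original post) showing that joiner yields a lazy range whose elements are dchar, rather than an array:

----
import std.algorithm : equal, joiner;
import std.range : ElementType, isInputRange;

void main()
{
    auto joined = joiner(["foo", "bar", "baz"], ", ");

    /* joiner returns a lazy range type, not a dstring (an array of dchar). */
    static assert(isInputRange!(typeof(joined)));
    static assert(!is(typeof(joined) == dstring));

    /* Its elements are decoded code points. */
    static assert(is(ElementType!(typeof(joined)) == dchar));

    /* Element-wise it compares equal to the corresponding dstring. */
    assert(joined.equal("foo, bar, baz"d));
}
----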

> causing 
> issues such as the one shown above. And, if I aim to use UTF-8 
> consistently throughout my applications (see 
> http://utf8everywhere.org/), what am I to do?

You can use std.utf.byCodeUnit to get ranges of chars:

----
import std.algorithm;
import std.array: array;
import std.range;
import std.stdio;
import std.regex;
import std.utf: byCodeUnit;

void main()
{
    auto test = chain("foo".byCodeUnit, "bar".byCodeUnit, "baz".byCodeUnit);
    pragma(msg, typeof(test.front)); /* "immutable(char)" */

    auto arr = ["foo".byCodeUnit, "bar".byCodeUnit, "baz".byCodeUnit];
    auto joined = joiner(arr, ", ".byCodeUnit);
    pragma(msg, typeof(joined.front)); /* "immutable(char)" */

    /* Having char elements isn't enough: regex needs a string, so turn the
    range into an array via std.array.array. */
    auto r = regex(joined.array);
    matchFirst("won't compile", r); /* compiles */
}
----

Alternatively, since you have to materialize `joined` into an array anyway, 
you can use the dchar range and make a string from it when passing to 
`regex`:

----
import std.algorithm;
import std.conv: to;
import std.stdio;
import std.regex;

void main()
{
    auto arr = ["foo", "bar", "baz"];
    auto joined = joiner(arr, ", ");
    pragma(msg, typeof(joined.front)); /* "dchar" */

    /* Convert the dchar range to a string with std.conv.to: */
    auto r = regex(joined.to!string);
    matchFirst("won't compile", r); /* compiles */
}
----
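
If laziness isn't needed at all, another option may be std.array.join, which builds the joined array eagerly. This is only a sketch under the assumption that allocating the result up front is acceptable; the input passed to matchFirst is my own example:

----
import std.array : join;
import std.regex : matchFirst, regex;

void main()
{
    auto arr = ["foo", "bar", "baz"];

    /* join allocates and returns the result, so no dchar range is involved. */
    string joined = join(arr, ", ");
    assert(joined == "foo, bar, baz");

    auto r = regex(joined);
    assert(!matchFirst("foo, bar, baz", r).empty);
}
----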

