ElementType!string

Jakob Ovrum jakobovrum at gmail.com
Sun Aug 25 12:56:33 PDT 2013


On Sunday, 25 August 2013 at 19:25:08 UTC, qznc wrote:
> Apparently, ElementType!string evaluates to dchar. I would have 
> expected char. Why is that?

This is mentioned in the documentation of `ElementType`. Use 
`std.range.ElementEncodingType` or `std.traits.ForeachType` to 
get the code unit type (`char` or `wchar`, with any qualifiers 
preserved) when given arrays of those two types.
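
A minimal sketch of the difference between the three templates, 
checked at compile time. The empty `main` is only there so the 
snippet builds on its own:

```d
import std.range : ElementType, ElementEncodingType;
import std.traits : ForeachType;

// Range view: the element type is the decoded code point.
static assert(is(ElementType!string == dchar));
static assert(is(ElementType!wstring == dchar));

// Array view: the element type is the code unit, qualifiers included.
static assert(is(ElementEncodingType!string == immutable(char)));
static assert(is(ElementEncodingType!wstring == immutable(wchar)));
static assert(is(ForeachType!string == immutable(char)));
static assert(is(ForeachType!(char[]) == char));

void main() {}
```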

As for the rationale:

`string`, being an alias for `immutable(char)[]`, is an array of 
UTF-8 code units - an array of `char`s. As a range, however, it 
is a forward range of code points (each represented as a UTF-32 
code unit - `dchar`). It's a (slightly controversial) choice that 
was made to make Unicode-correct code the easiest and most 
intuitive to write, as code points are much more useful than code 
units.

Note that it is not a random-access range. UTF-8 and UTF-16 are 
variable-length encodings, so several code units can be required 
to encode a single code point. Hence, a linear scan from the 
start is required to get the nth code point in a UTF-8 or UTF-16 
string.
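
To make the array/range split concrete, here is a small sketch. 
The sample string is my own choice; U+00E9 takes two code units in 
UTF-8, so the two views disagree on the length:

```d
import std.range : front, popFront, walkLength,
    hasLength, isForwardRange, isRandomAccessRange;

void main()
{
    string s = "h\u00E9llo"; // U+00E9 encodes as 0xC3 0xA9 in UTF-8

    // As a range, string is forward but neither random-access nor sized.
    static assert(isForwardRange!string);
    static assert(!isRandomAccessRange!string);
    static assert(!hasLength!string);

    assert(s.length == 6);     // array property: counts code units
    assert(s.walkLength == 5); // range algorithm: counts code points, O(n)

    assert(s[1] == 0xC3);      // array indexing yields a code unit

    s.popFront();              // drops 'h', a single code unit
    static assert(is(typeof(s.front) == dchar));
    assert(s.front == '\u00E9'); // front decodes both code units
    assert(s.length == 5);       // five code units remain
}
```

`s.length` and `s[i]` still compile because they are array 
operations; it is only the range traits that refuse to treat them 
as range primitives.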

Another name for a code point is "character" (technically, a 
character is what the code point translates to in the UCS). 
However, it can be a deceptive name - the units we see on screen 
when rendered are "graphemes", as Unicode characters can be 
combining, zero-width, etc.
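
As a rough illustration of code points versus graphemes, assuming 
a Phobos version that ships `std.uni.byGrapheme`. The sample is an 
'e' followed by a combining acute accent, which renders as one 
unit on screen:

```d
import std.range : walkLength;
import std.uni : byGrapheme;

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT
    string s = "e\u0301";

    assert(s.length == 3);                // code units: 1 + 2
    assert(s.walkLength == 2);            // code points
    assert(s.byGrapheme.walkLength == 1); // graphemes
}
```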

To get a range of UTF-8 or UTF-16 code units, the code units have 
to be represented as something other than `char` and `wchar`. For 
example, you can cast your string to `immutable(ubyte)[]`, 
operate on that, then cast it back at a later point.
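
A sketch of that round trip. The cast is the approach described 
above; `std.string.representation` is an alternative I am adding 
here, which should give the same slice without a cast if your 
Phobos has it:

```d
import std.range : ElementType, isRandomAccessRange;
import std.string : representation;

void main()
{
    string s = "h\u00E9llo";

    // View the same memory as raw bytes; no copy is made.
    auto bytes = cast(immutable(ubyte)[]) s;
    static assert(is(ElementType!(typeof(bytes)) == immutable(ubyte)));
    static assert(isRandomAccessRange!(typeof(bytes)));

    assert(bytes.length == 6);
    assert(bytes[1] == 0xC3 && bytes[2] == 0xA9); // the two units of U+00E9

    // Same slice, obtained without an explicit cast.
    immutable(ubyte)[] same = s.representation;
    assert(same is bytes);

    // Cast back once the byte-level work is done.
    string back = cast(string) bytes;
    assert(back == s);
}
```

Casting back is only safe if the bytes are still valid UTF-8, so 
slicing in the middle of a multi-unit sequence before the cast is 
something to watch out for.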


More information about the Digitalmars-d-learn mailing list