Proposal for fixing dchar ranges
monarch_dodra
monarchdodra at gmail.com
Wed Mar 12 01:47:56 PDT 2014
On Tuesday, 11 March 2014 at 18:02:26 UTC, Steven Schveighoffer
wrote:
> No, where we are today is that in some cases, the language
> treats a char[] as an array of char, in other cases, it treats
> a char[] as a bi-directional dchar range.
>
> -Steve
I want to mention something I've had trouble with recently, that
I haven't seen mentioned yet, but is related:
The ambiguity of the "lone char".
By that I mean: when a function accepts a 'char' as an argument,
it is (IMO) very hard to know what it is actually accepting:
1. An ASCII char in the 0 .. 128 range?
2. A code unit?
3. (heaven forbid) a code point in the 0 .. 256 range packed into
a char?
Currently (fortunately? unfortunately?) the choice taken in
our algorithms is 3, which is actually the 'safest' solution.
So if you write:
find("cassé", cast(char)'é');
It *will* correctly find the 'é', because the char is compared
against decoded code points, but it *won't* search for it among
the individual code units.
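
To make the difference between interpretations 2 and 3 concrete,
here is a minimal sketch that does both comparisons by hand with
foreach rather than going through find. The helper names
(hasCodeUnit, hasCodePoint) are hypothetical and only exist for
this illustration; they are not Phobos functions.

```d
// Hypothetical helpers, named here only for illustration.

// Interpretation 2: treat `needle` as a UTF-8 code unit.
bool hasCodeUnit(string s, char needle)
{
    foreach (char c; s)   // iterates raw code units, no decoding
        if (c == needle) return true;
    return false;
}

// Interpretation 3: treat `needle` as a code point below 256.
bool hasCodePoint(string s, char needle)
{
    foreach (dchar c; s)  // decodes UTF-8 into code points
        if (c == needle) return true;
    return false;
}

unittest
{
    string s = "cass\u00E9";     // 'é' is U+00E9, stored as 0xC3 0xA9
    char e = cast(char) 0xE9;    // the "lone char"

    assert(s.length == 6);       // 6 code units for 5 characters
    assert(!hasCodeUnit(s, e));  // no single code unit equals 0xE9
    assert(hasCodePoint(s, e));  // the decoded 'é' does
    // For ASCII, both interpretations agree.
    assert(hasCodeUnit(s, 's') && hasCodePoint(s, 's'));
}
```

For anything in the ASCII range the two answers coincide, which is
why the ambiguity tends to go unnoticed until a non-ASCII char
shows up.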
--------
Another, more pernicious case is that of output ranges. "put" is
supposed to know how to convert any string/char width into any
other string/char width.
Again, things become funky if you tell "put" to place a string
into a sink that accepts a char.
Is the sink actually telling you to feed it code units? Or ASCII?
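
As a rough sketch of the "code units" reading, here is what a sink
could look like if it spelled out that contract, together with a
helper that transcodes a string of any width into UTF-8 before
feeding it. CodeUnitSink and putCodeUnits are hypothetical names
made up for this example; only std.utf.encode is an actual Phobos
function. This is one possible reading, not a claim about what
"put" does.

```d
import std.utf : encode;

// Hypothetical sink whose contract is "feed me UTF-8 code units".
struct CodeUnitSink
{
    char[] data;
    void put(char c) { data ~= c; }
}

// Hypothetical helper: decode a string of any width to code points,
// then re-encode each one as UTF-8 code units for the sink.
void putCodeUnits(S)(ref CodeUnitSink sink, S s)
{
    foreach (dchar c; s)
    {
        char[4] buf;
        immutable len = encode(buf, c);
        foreach (u; buf[0 .. len])
            sink.put(u);
    }
}

unittest
{
    CodeUnitSink sink;
    putCodeUnits(sink, "cass\u00E9"w);   // UTF-16 input
    assert(sink.data.length == 6);       // 'é' arrives as two code units
    assert(sink.data == "cass\u00E9");   // same text, now UTF-8
}
```

A sink that really meant "ASCII only" would instead have to reject
or replace the 'é', and nothing in a bare put(char) signature says
which of the two it expects.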