typeof(string.front) should be char
Jonathan M Davis
jmdavisProg at gmx.com
Sat Mar 3 12:42:53 PST 2012
On Saturday, March 03, 2012 21:05:40 Timon Gehr wrote:
> On 03/03/2012 08:46 PM, Jonathan M Davis wrote:
> > On Saturday, March 03, 2012 18:38:44 Timon Gehr wrote:
> >> On 03/03/2012 09:40 AM, Jonathan M Davis wrote:
> >>> ... but operating on
> >>> code points is _far_ more correct than operating on code units. It's
> >>> also more efficient.
> >>> [snip.]
> >>
> >> No, it is less efficient.
> >
> > Operating on code points is more efficient than operating on graphemes is
> > what I meant. I can see that I wasn't clear enough on that.
>
> Makes sense.
>
> > It's more correct than operating on code units and less correct than
> > operating on graphemes, while it's less efficient than operating on code
> > units and more efficient than operating on graphemes.
> >
> > - Jonathan M Davis
>
> When the code actually only cares about some characters that have 7-bit
> ASCII values, most of the time there are no correctness issues when
> operating on code units directly.
True, but writing code without caring about Unicode frequently leads to bugs
when you actually _do_ have to deal with Unicode (the fact that an American
programmer runs into Unicode less often just makes it worse, because they're
less likely to catch their bugs), and char is a UTF-8 code unit by definition.
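As a minimal sketch of how the two views of a string diverge once non-ASCII
text shows up (the helper names countCodeUnits and countCodePoints are
hypothetical, chosen only for this illustration):

```d
import std.range : walkLength;

// Hypothetical helper: number of UTF-8 code units (char elements).
size_t countCodeUnits(string s)
{
    return s.length;
}

// Hypothetical helper: number of code points, obtained by walking the
// string through the range API, which decodes UTF-8 as it goes.
size_t countCodePoints(string s)
{
    return s.walkLength;
}

void main()
{
    // Pure ASCII: both counts agree, so a length-based bug stays hidden.
    assert(countCodeUnits("hello") == countCodePoints("hello"));

    // U+00E9 takes two code units in UTF-8, so the counts differ.
    string s = "h\u00E9llo";
    assert(countCodeUnits(s) == 6);
    assert(countCodePoints(s) == 5);
}
```

Code that treats s.length as "number of characters" passes every test on
ASCII input and only goes wrong on input like the second string.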
So, operating on ASCII alone is an optimization and should be coded for
explicitly rather than being generally encouraged. And having ranges over
strings be code units rather than code points would encourage incorrect usage.
The current solution encourages correct usage (or at least usage which is
closer to correct, since it still isn't at the grapheme level) without
disallowing more optimized code.
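A rough sketch of what that looks like in practice, assuming a Phobos version
that provides std.string.representation (the function countAsciiByte is a
hypothetical name used only here):

```d
import std.range : front;
import std.string : representation;

// Hypothetical helper: counts occurrences of an ASCII byte by looking at
// raw code units. This is safe for ASCII needles only, because every byte
// of a multi-byte UTF-8 sequence has its high bit set.
size_t countAsciiByte(string s, char needle)
{
    assert(needle < 0x80, "needle must be ASCII");
    size_t n = 0;
    foreach (ubyte u; s.representation) // explicit opt-in to code units
        if (u == needle)
            ++n;
    return n;
}

void main()
{
    string s = "a,b,\u00E9";

    // The default range interface yields decoded code points...
    static assert(is(typeof(s.front) == dchar));
    // ...while indexing still gives direct access to code units.
    static assert(is(typeof(s[0]) == immutable(char)));

    assert(countAsciiByte(s, ',') == 2);
}
```

The default stays at the code point level, and the code unit version has to
be asked for by name, which makes the ASCII assumption visible in the source.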
- Jonathan M Davis
More information about the Digitalmars-d-learn
mailing list