The Case Against Autodecode
tsbockman via Digitalmars-d
digitalmars-d at puremagic.com
Thu Jun 2 14:51:51 PDT 2016
On Thursday, 2 June 2016 at 21:38:02 UTC, default0 wrote:
> On Thursday, 2 June 2016 at 21:30:51 UTC, tsbockman wrote:
>> 1) It does not say that level 2 should be opt-in; it says that
>> level 2 should be toggle-able. Nowhere does it say which of
>> level 1 and 2 should be the default.
>>
>> 2) It says that working with graphemes is slower than UTF-16
>> code UNITS (level 1), but says nothing about streaming
>> decoding of code POINTS (what we have).
>>
>> 3) That document is from 2000, and its claims about
>> performance are surely extremely out-dated, anyway. Computers
>> and the Unicode standard have both changed much since then.
>
> 1) Right because a special toggleable syntax is definitely not
> "opt-in".
It is not "opt-in" unless it is toggled off by default. The only
reason the level 1 section doesn't talk about toggling is that
it is written with the assumption that many programs will *only*
support level 1.
> 2) Several people in this thread noted that working on
> graphemes is way slower (which makes sense, because it's yet
> another processing you need to do after you decoded - therefore
> more work - therefore slower) than working on code points.
And working on code points is way slower than working on code
units (the actual level 1).
> 3) Not an argument - doing more work makes code slower.
What do you think I'm arguing for? It's not graphemes-by-default.
What I actually want to see: permanently deprecate the
auto-decoding range primitives. Force the user to explicitly
specify whichever of `by!dchar`, `byCodePoint`, or `byGrapheme`
their specific algorithm actually needs. Removing the implicit
conversions between `char`, `wchar`, and `dchar` would also be
nice, but I don't think it is strictly necessary.
That would be a standards-compliant solution (one of several
possible ones). What we have now is non-standard, at least going
by the old version of the document that Walter linked.
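
To make the proposal concrete, here is a rough sketch of what choosing
the level explicitly looks like with ranges that already exist in
Phobos. The helper names (`countCodeUnits` and so on) are made up for
illustration, and I am assuming `std.utf.byDchar` as the explicit
code point range; it stands in for whichever spelling of "decode to
`dchar`" ends up being preferred.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit, byDchar;

// Hypothetical helpers: each one names the level it iterates at,
// instead of relying on whatever the default range primitives do.

/// Number of UTF-8 code units (no decoding at all).
size_t countCodeUnits(string s)
{
    return s.byCodeUnit.walkLength;
}

/// Number of code points (explicit streaming decode to dchar).
size_t countCodePoints(string s)
{
    return s.byDchar.walkLength;
}

/// Number of graphemes (user-perceived characters).
size_t countGraphemes(string s)
{
    return s.byGrapheme.walkLength;
}

unittest
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    string s = "e\u0301";
    assert(countCodeUnits(s) == 3);  // 1 byte for 'e', 2 bytes for U+0301
    assert(countCodePoints(s) == 2); // 'e' and the combining accent
    assert(countGraphemes(s) == 1);  // rendered as a single character
}
```

This can be checked with `dmd -unittest -main -run` on the file. The
three counts differ for the same string, which is why I think the
algorithm's author, not the range primitives, should pick the level.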