The Case Against Autodecode

default0 via Digitalmars-d digitalmars-d at puremagic.com
Thu Jun 2 14:38:02 PDT 2016


On Thursday, 2 June 2016 at 21:30:51 UTC, tsbockman wrote:
> On Thursday, 2 June 2016 at 21:07:19 UTC, default0 wrote:
>> The level 2 support description noted that it should be opt-in 
>> because its slow.
>
> 1) It does not say that level 2 should be opt-in; it says that 
> level 2 should be toggle-able. Nowhere does it say which of 
> level 1 and 2 should be the default.
>
> 2) It says that working with graphemes is slower than UTF-16 
> code UNITS (level 1), but says nothing about streaming decoding 
> of code POINTS (what we have).
>
> 3) That document is from 2000, and its claims about performance 
> are surely extremely out-dated, anyway. Computers and the 
> Unicode standard have both changed much since then.

1) Right, because a special toggleable syntax is definitely not 
"opt-in".
2) Several people in this thread have noted that working on 
graphemes is much slower than working on code points - which makes 
sense, because it's yet another processing step you have to do 
after decoding: more work, therefore slower.
3) Not an argument - doing more work makes code slower. The only 
thing that has changed is which specific operations cost what (for 
instance, memory access is much more expensive now than it was 
then). Considering how the process works, and judging from what 
others in this thread have said about it, I will stick with 
"always decoding to graphemes for all operations is very slow" and 
indulge in being too lazy to write benchmarks showing just how bad 
it is.
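The three layers being argued about can be shown concretely. This 
is a minimal sketch in Python rather than D (the layering itself is 
language-independent): "e" followed by U+0301 COMBINING ACUTE 
ACCENT is one user-perceived character (grapheme), but counts 
differently at the code-unit and code-point levels - and grouping 
code points into graphemes is an extra segmentation pass on top of 
decoding.

```python
# Unicode layers: code units < code points < graphemes.
# U+0065 'e' + U+0301 COMBINING ACUTE ACCENT renders as one grapheme,
# but each layer sees a different count.
s = "e\u0301"

code_points = len(s)                           # Python indexes code points
utf8_units = len(s.encode("utf-8"))            # UTF-8 code units (bytes)
utf16_units = len(s.encode("utf-16-le")) // 2  # UTF-16 code units

print(code_points)  # 2 code points
print(utf8_units)   # 3 UTF-8 code units ('e' is 1 byte, U+0301 is 2)
print(utf16_units)  # 2 UTF-16 code units

# Collapsing those 2 code points into 1 grapheme requires a further
# segmentation pass (UAX #29 grapheme cluster rules) after decoding -
# the extra work the post argues makes grapheme-level iteration slower.
```

The same string iterated at each level in D would yield 3 elements 
as `char` (UTF-8), 2 as `dchar` (code points, the autodecoded view), 
and 1 when walked by grapheme.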
