The Case Against Autodecode

default0 via Digitalmars-d digitalmars-d at puremagic.com
Tue May 31 00:17:03 PDT 2016


On Tuesday, 31 May 2016 at 06:45:56 UTC, H. S. Teoh wrote:
> On Tue, May 31, 2016 at 12:13:57AM -0400, Andrei Alexandrescu 
> via Digitalmars-d wrote:
>> On 5/30/16 6:00 PM, Walter Bright wrote:
>> > On 5/30/2016 11:25 AM, Adam D. Ruppe wrote:
>> > > I don't agree on changing those. Indexing and slicing a 
>> > > char[] is really useful and actually not hard to do 
>> > > correctly (at least with regard to handling code units).
>> > 
>> > Yup. It isn't hard at all to use arrays of codeunits 
>> > correctly.
>> 
>> Trouble is, it isn't hard at all to use arrays of codeunits 
>> incorrectly, too. -- Andrei
>
> Neither does autodecoding make code any more correct. It just 
> better hides the fact that the code is wrong.
>
>
> T

Thinking about this a bit more - which algorithms are actually 
correct when implemented at the level of code units?
Off the top of my head I can only really think of copying and 
hashing, since you want to do those at the byte level anyway.
I would also think that if you know two strings are in the same 
normalization form (for example, because they come from the same 
normalized source), you can check them for equality at the code 
unit level, but my understanding of Unicode is still quite 
lacking, so I'm not sure about that.
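For what it's worth, the normalization point above seems to hold: two 
canonically equivalent strings can have different code unit sequences, but 
once both are in the same normalization form, comparing code units gives the 
right answer. A minimal sketch (in Python rather than D, since the point is 
language-independent; the strings here are illustrative examples, not from 
the thread):

```python
import unicodedata

s1 = "caf\u00e9"     # "cafe" with precomposed e-acute (NFC form)
s2 = "cafe\u0301"    # "cafe" with combining acute accent (NFD form)

# Canonically equivalent, yet neither their code points nor their
# UTF-8 code units compare equal:
assert s1 != s2
assert s1.encode("utf-8") != s2.encode("utf-8")

# After normalizing both to the same form (NFC here), byte-level
# comparison of the code units is sufficient for equality:
n1 = unicodedata.normalize("NFC", s1).encode("utf-8")
n2 = unicodedata.normalize("NFC", s2).encode("utf-8")
assert n1 == n2
```

This also suggests why copying and hashing are safe at the code unit level: 
they only need to preserve or fingerprint the exact byte sequence, not 
interpret it.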
