The Case Against Autodecode

Marco Leise via Digitalmars-d digitalmars-d at puremagic.com
Mon May 30 15:11:00 PDT 2016


On Fri, 27 May 2016 15:47:32 +0200,
ag0aep6g <anonymous at example.com> wrote:

> On 05/27/2016 03:32 PM, Andrei Alexandrescu wrote:
> >>> However the following do require autodecoding:
> >>>
> >>> s.walkLength
> >>> s.count!(c => !"!()-;:,.?".canFind(c)) // non-punctuation
> >>> s.count!(c => c >= 32) // non-control characters
> >>>
> >>> Currently the standard library operates at code point level even
> >>> though inside it may choose to use code units when admissible. Leaving
> >>> such a decision to the library seems like a wise thing to do.  
> >>
> >> But how is the user supposed to know without being a core contributor to
> >> Phobos?  
> >
> > Misunderstanding. All examples work properly today because of
> > autodecoding. -- Andrei  
> 
> They only work "properly" if you define "properly" as "in terms of code 
> points". But working in terms of code points is usually wrong. If you 
> want to count "characters", you need to work with graphemes.
> 
> https://dpaste.dzfl.pl/817dec505fd2
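For anyone who cannot open the paste, here is a minimal sketch of the distinction being argued about. It is an illustration of my own, not the contents of the link above, and it assumes a Phobos of roughly this vintage where narrow strings are autodecoded as ranges. The sample string is a hypothetical example: 'e' followed by a combining acute accent.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

void main()
{
    // Hypothetical sample: 'e' (U+0065) followed by
    // COMBINING ACUTE ACCENT (U+0301); renders as one "character".
    string s = "e\u0301";

    // UTF-8 code units: 1 byte for 'e', 2 bytes for U+0301.
    assert(s.length == 3);
    assert(s.byCodeUnit.walkLength == 3);

    // Autodecoding makes the string a range of code points.
    assert(s.walkLength == 2);

    // Only grapheme segmentation sees a single user-perceived character.
    assert(s.byGrapheme.walkLength == 1);
}
```

None of the three numbers is "the" length; which one is right depends on what the caller is actually counting.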

1: Auto-decoding shall ALWAYS do the proper thing
2: Therefore humans shall read text in units of code points
3: OS X is an anomaly and must be purged from this planet
4: Indonesians shall be converted to a sane alphabet
5: He who useth combining diacritics shall burn in hell
6: We shall live in peace and harmony forevermore
Let's give this a rest.

-- 
Marco


