Today's programming challenge - How's your Range-Fu ?

Walter Bright via Digitalmars-d digitalmars-d at puremagic.com
Sat Apr 18 11:40:08 PDT 2015


On 4/18/2015 11:28 AM, H. S. Teoh via Digitalmars-d wrote:
> On Sat, Apr 18, 2015 at 10:50:18AM -0700, Walter Bright via Digitalmars-d wrote:
>> On 4/18/2015 4:35 AM, Jacob Carlborg wrote:
>>> \u0301 is the "combining acute accent" [1].
>>>
>>> [1] http://www.fileformat.info/info/unicode/char/0301/index.htm
>>
>> I won't deny what the spec says, but it doesn't make any sense to have
>> two different representations of eacute, and I don't know why anyone
>> would use the two code point version.
>
> Well, *somebody* has to convert it to the single code point eacute,
> whether it's the human (if the keyboard has a single key for it), or the
> code interpreting keystrokes (the user may have typed it as e +
> combining acute), or the program that generated the combination, or the
> program that receives the data.

Data entry should be handled by the driver program, not a universal interchange 
format.


> When we don't know provenance of
> incoming data, we have to assume the worst and run normalization to be
> sure that we got it right.

I'm not arguing against the existence of the Unicode standard; I'm saying I 
can't figure out any justification for standardizing different encodings of 
the same thing.
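
For concreteness, here is a minimal sketch of the two representations of eacute under discussion and of the normalization step mentioned in the quoted text. It is written in Python with the standard `unicodedata` module purely for illustration; the variable names are assumptions and nothing here comes from the thread itself.

```python
import unicodedata

# Two ways to write the same visible character:
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE (one code point)
combining = "e\u0301"    # U+0065 followed by U+0301 COMBINING ACUTE ACCENT (two code points)

# A plain code point comparison treats them as different strings.
print(precomposed == combining)            # False
print(len(precomposed), len(combining))    # 1 2

# NFC composes to the single code point form; NFD decomposes to base + combining mark.
nfc = unicodedata.normalize("NFC", combining)
nfd = unicodedata.normalize("NFD", precomposed)
print(nfc == precomposed)                  # True
print(nfd == combining)                    # True
```

Normalizing both inputs to the same form (NFC or NFD) before comparing is what lets a program treat the two spellings as equal when the provenance of the data is unknown.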


