Inconsistency
qznc
qznc at web.de
Wed Oct 16 05:33:00 PDT 2013
On Wednesday, 16 October 2013 at 12:18:40 UTC, Jacob Carlborg
wrote:
> On 2013-10-16 10:03, qznc wrote:
>
>> Most code might be buggy then.
>>
>> An issue that often comes up is file names. A file called "bär" will be
>> normalized differently depending on the operating system. In both cases
>> it is one grapheme. However, on Linux it is one code point, but on OS X
>> it is two code points.
>
> Why would it require two code points?
It is not required, but it is possible. The "ä" can be encoded
either as the single precomposed code point [U+00E4] or as the two
code points [a, U+0308], where U+0308 is the "combining diaeresis"
[0]. Both sequences render as the same grapheme. Combining
characters [1] can be stacked in a practically unlimited number of
combinations. You can go crazy with it:
http://stackoverflow.com/questions/6579844/how-does-zalgo-text-work
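
To make the two encodings of "bär" concrete, here is a minimal
sketch in Python (picked only because its standard unicodedata
module keeps the example short; the helper name same_text is made
up for illustration). It shows that the two strings differ code
point by code point, and that they compare equal once both are
brought into the same normalization form:

```python
import unicodedata

# "bär" with a precomposed 'ä' (U+00E4): three code points.
composed = "b\u00e4r"
# "bär" as 'a' followed by combining diaeresis (U+0308): four code points.
decomposed = "ba\u0308r"

print(len(composed), len(decomposed))  # 3 4
print(composed == decomposed)          # False: the code points differ

def same_text(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(same_text(composed, decomposed))  # True
```

Code that compares file names byte by byte or code point by code
point, without normalizing first, would treat these two names as
different files.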
[0] http://www.fileformat.info/info/unicode/char/0308/index.htm
[1] http://en.wikipedia.org/wiki/Combining_character