To Walter, about char[] initialization by FF
Unknown W. Brackets
unknown at simplemachines.org
Sun Jul 30 09:38:08 PDT 2006
It's true that in HTML, attribute names were limited to a subset of the
characters available for use in the document: namely, as mentioned,
ASCII letters, digits, periods and hyphens
(/[A-Za-z][A-Za-z0-9\.\-]*/). You couldn't even use accented chars.
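
As a rough illustration of that pattern, here is a minimal Python sketch.
The function name is_sgml_style_name is made up for this example, and the
pattern is the one quoted in this message, not one copied from the SGML or
HTML specifications.

```python
import re

# Pattern as quoted in the message (an assumption, not taken from the spec).
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9.\-]*")

def is_sgml_style_name(name: str) -> bool:
    """Return True if the whole string matches the ASCII-only name pattern."""
    return NAME_PATTERN.fullmatch(name) is not None

print(is_sgml_style_name("href"))         # True
print(is_sgml_style_name("data-x.1"))     # True
print(is_sgml_style_name("caf\u00e9"))    # False: accented character
print(is_sgml_style_name("1abc"))         # False: must start with a letter
```

fullmatch is used so that a valid prefix followed by other characters is
rejected as well.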
However (in the case of HTML), you were required to use specific
(English) attribute names anyway for HTML to validate; it's really not a
significant limitation. Few people used SGML for anything else.
XML allows Unicode in attribute and element names, PIs, CDATA,
PCDATA, etc. And, of course, it allows you to reference any Unicode code
point (e.g. &#1234;.)
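
A small Python sketch of both points, using the standard library's
xml.etree.ElementTree. The document string is invented for the example:
it has a non-ASCII element name, a non-ASCII attribute name, and a numeric
character reference (decimal 1234, i.e. U+04D2) as the attribute value.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document, made up for illustration.
doc = '<größe façon="&#1234;"/>'

root = ET.fromstring(doc)
value = root.get("façon")

print(root.tag)          # the non-ASCII element name, unchanged
print(hex(ord(value)))   # 0x4d2: the reference resolved to one code point
```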
We could also talk about the limitations of horse-drawn carriages, and
how they can only go a certain speed... nonetheless, we have cars now,
so I'm not terribly worried about HTML's technical limitations anymore.
-[Unknown]
>> Consider this: attribute names in HTML (SGML) are represented by
>> ASCII codes only, so you don't need UTF-8 processing to deal with
>> them at all.
>> You also cannot use UTF-8 for storing attribute values, generally
>> speaking.
>> Attribute values participate in CSS selector analysis, and some
>> selectors require char-by-char access (char as a code point, not a
>> D char).
>
> I'd be surprised at that, since UTF-8 is a documented, supported HTML
> page encoding method. But if UTF-8 doesn't work for you, you can use
> wchar (UTF-16) or dchar (UTF-32), or ubyte (for anything else).
>
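
The difference between a code unit (what D's char, wchar and dchar hold
for UTF-8, UTF-16 and UTF-32) and a code point can be sketched as below.
Python is used only for illustration; its str type happens to index by
code point, and the sample strings are arbitrary.

```python
def unit_counts(text: str) -> tuple[int, int, int]:
    """Return (UTF-8 code units, UTF-16 code units, code points)."""
    utf8_units = len(text.encode("utf-8"))
    utf16_units = len(text.encode("utf-16-le")) // 2  # 2 bytes per unit
    code_points = len(text)
    return utf8_units, utf16_units, code_points

# "i" with diaeresis (U+00EF) takes two bytes in UTF-8.
print(unit_counts("na\u00efve"))    # (6, 5, 5)

# U+1D11E lies outside the BMP: four UTF-8 bytes, a UTF-16 surrogate pair.
print(unit_counts("\U0001D11E"))    # (4, 2, 1)
```

Only the last number in each tuple matches what a selector working
"char by char" on code points would see.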