unicode combining mark / std.uni question
ikod
geller.garry at gmail.com
Tue Dec 5 20:04:29 UTC 2017
Hello,
I have to create a very basic IDNA (Internationalized Domain Names
in Applications) library. There are two parts to IDNA: user
input checks and punycode encoding/decoding.
The punycode part is already complete, and now I have to add some
checks, but I'm weak in Unicode and can't find the proper way to
express these tests using std.uni.
Here is the list of prohibited domain labels
(https://tools.ietf.org/html/rfc5891):
o Labels whose first character is a combining mark (see The Unicode
  Standard, Section 2.11 [Unicode]).
o Labels containing prohibited code points, i.e., those that are
  assigned to the "DISALLOWED" category of the Tables document
  [RFC5892].
o Labels containing code points that are identified in the Tables
  document as "CONTEXTJ", i.e., requiring exceptional contextual
  rule processing on lookup, but that do not conform to those rules.
  Note that this implies that a rule must be defined, not null: a
  character that requires a contextual rule but for which the rule
  is null is treated in this step as having failed to conform to the
  rule.
o Labels containing code points that are identified in the Tables
  document as "CONTEXTO", but for which no such rule appears in the
  table of rules. Applications resolving DNS names or carrying out
  equivalent operations are not required to test contextual rules
  for "CONTEXTO" characters, only to verify that a rule is defined
  (although they MAY make such tests to provide better protection or
  give better information to the user).
o Labels containing code points that are unassigned in the version
  of Unicode being used by the application, i.e., in the UNASSIGNED
  category of the Tables document.
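As a starting point, the first check in the list could perhaps be
sketched as below. This assumes that "combining mark" corresponds to
the general categories Mn, Mc and Me, which is what std.uni.isMark
tests; the helper name startsWithCombiningMark is hypothetical. The
remaining checks depend on the RFC 5892 tables, which this sketch does
not attempt to cover.

```d
import std.range.primitives : empty, front;
import std.uni : isMark;

/// Hypothetical helper: true if the label's first code point is a
/// mark (general category Mn, Mc or Me), which is assumed here to
/// match the RFC's "combining mark".
bool startsWithCombiningMark(const(char)[] label)
{
    // front decodes the leading UTF-8 sequence into a single dchar
    return !label.empty && isMark(label.front);
}

unittest
{
    assert(!startsWithCombiningMark("example"));
    // U+0301 COMBINING ACUTE ACCENT as the first code point
    assert(startsWithCombiningMark("\u0301abc"));
    // an empty label has no first character to reject
    assert(!startsWithCombiningMark(""));
}
```

A label for which this returns true would be rejected before any of
the table-based checks are applied.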
Can anybody help with this task?
Thanks!