std.xml2 (collecting features) control character

Robert burner Schadek via Digitalmars-d digitalmars-d at puremagic.com
Thu Feb 18 07:56:58 PST 2016


While working on a new xml implementation I came across "control 
characters (CC)". [1]
When trying to validate/convert a utf string these lead to 
exceptions, because they are not valid utf characters.
Unfortunately, some of these characters are allowed to appear in 
valid xml 1.* documents.
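
For reference, here is a rough sketch of which code points xml 1.0 accepts, written in Python purely for illustration. It follows the `Char` production from the XML 1.0 specification. The function name `is_xml10_char` is made up for this example and is not part of std.xml2 or Phobos.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point `cp` matches the XML 1.0 `Char` production.

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
           | [#x10000-#x10FFFF]
    """
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# C0 controls (0x00-0x1F): only tab, line feed and carriage return pass.
c0_allowed = [cp for cp in range(0x00, 0x20) if is_xml10_char(cp)]

# C1 controls (0x80-0x9F) all fall inside the 0x20-0xD7FF range.
c1_allowed = [cp for cp in range(0x80, 0xA0) if is_xml10_char(cp)]
```

Under this production the C1 block is permitted as a whole, while most of the C0 block is not.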

I currently see two options for how to go about it:

1. Do not allow CCs that do not work with existing 
functionality.
1.Pros
   * easy
1.Cons
   * the resulting xml implementation will not be xml 1.* complete

2. Add special cases to the existing functionality to handle CCs 
that are allowed in xml 1.0.
2.Pros
   * the resulting xml implementation will be xml 1.* complete
2.Cons
   * will make utf de/encoding slower as I would need to add 
additional logic
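
To make the trade-off concrete, here is a minimal Python sketch of the two options. It assumes the existing validation rejects every C0/C1 control code, which is a simplification for illustration and not a description of what Phobos actually does. All names (`is_control`, `validate_strict`, `validate_xml10`, `XML10_ALLOWED_CONTROLS`) are hypothetical.

```python
# Control codes that the XML 1.0 `Char` production permits:
# tab, line feed, carriage return, plus DEL and the C1 block (0x7F-0x9F).
XML10_ALLOWED_CONTROLS = frozenset({0x9, 0xA, 0xD}) | frozenset(range(0x7F, 0xA0))


def is_control(cp: int) -> bool:
    """C0 controls, DEL, and C1 controls."""
    return cp < 0x20 or 0x7F <= cp <= 0x9F


def validate_strict(text: str) -> None:
    """Option 1: reject every control code (assumed existing behaviour)."""
    for i, ch in enumerate(text):
        if is_control(ord(ch)):
            raise ValueError(f"control character U+{ord(ch):04X} at index {i}")


def validate_xml10(text: str) -> None:
    """Option 2: same loop, plus a special case for the permitted controls."""
    for i, ch in enumerate(text):
        cp = ord(ch)
        # The extra set lookup is the additional per-character logic.
        if is_control(cp) and cp not in XML10_ALLOWED_CONTROLS:
            raise ValueError(f"control character U+{cp:04X} at index {i}")
```

In this sketch the special case only runs for characters that are already classified as control codes, so ordinary text pays for one comparison rather than the set lookup.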

Any other ideas or feedback?




[1] https://en.wikipedia.org/wiki/C0_and_C1_control_codes


