Replacing std.xml

Jonathan M Davis jmdavisProg at gmx.com
Thu Aug 29 02:23:06 PDT 2013


On Thursday, August 29, 2013 11:08:18 Jacob Carlborg wrote:
> On 2013-08-29 09:47, Jonathan M Davis wrote:
> > Personally, I would have just said use ranges of dchar and be done with it
> > without worrying about character encodings at all, but I don't remember
> > what all the XML standard does with encodings.
> 
> Won't that have the same problem as we talked about in one of the threads
> about a D lexer? That is, doing unnecessary en/decoding.

Possibly, but then all you have to do is make it so that it treats strings as 
ranges of code units (and possibly support ranges of char and wchar), and you 
can avoid the unnecessary decoding. But aside from possibly supporting ranges of 
char or wchar, that would be completely internal to the parser, and the caller 
wouldn't care. An alternative would be to specifically support ranges of ubyte 
instead of strings, though given that XML is usually treated as a string, that 
would arguably be a bit odd. Regardless, as far as strings go, it's easy 
enough to avoid decoding in the implementation. IIRC, all of the markup in XML 
is ASCII anyway, with character references (e.g. &#233;) to indicate Unicode 
characters. And if that's the case, avoiding unnecessary decoding is trivial 
when operating on strings.
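
A minimal sketch of the code-unit approach, assuming UTF-8 input. The helper 
name textBeforeTag is hypothetical and is not part of std.xml or any proposed 
replacement; it only illustrates scanning for ASCII markup without decoding:

```d
import std.string : representation;

// Hypothetical helper for illustration only: returns the text that precedes
// the first '<', scanning UTF-8 code units directly instead of decoding.
string textBeforeTag(string input) pure nothrow @nogc @safe
{
    // representation() views the string as immutable(ubyte)[], so the loop
    // below sees raw code units and no UTF-8 decoding takes place.
    auto units = input.representation;
    foreach (i, unit; units)
    {
        // Every byte of a multi-byte UTF-8 sequence is >= 0x80, so a byte
        // equal to '<' (0x3C) can only be an actual '<' character.
        if (unit == '<')
            return input[0 .. i];
    }
    return input; // no markup found, the whole input is text
}
```

Calling textBeforeTag("héllo<b>") should give back the slice "héllo" of the 
original string. The caller passes and receives an ordinary string, so the 
fact that the scan works on code units stays internal to the parser.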

- Jonathan M Davis

