The XML module in Phobos

Michel Fortin michel.fortin at michelf.com
Tue Aug 4 16:43:33 PDT 2009


On 2009-08-04 10:01:51 -0400, Michael Rynn <michaelrynn at optushome.com.au> said:

> It would be nice  to have well defined interfaces for  DOM, SAX and
> PULL parsers which share some of the base parsing code. The DOM can be
> partial,  as node sets returned from XPath query. Nice how the phobos
> parser can make a full DOM or just the bits required.

Exactly what I've been working on:

Tokenizer part: http://michelf.com/docs/d/mfr/xmltok.html
DOM part:       http://michelf.com/docs/d/mfr/xml.html

Note that it's still a work in progress. Here are some things I'd like to do:

Tokenizer: add specialized exception classes to better report various
problems, add better (optional) checks for valid characters, and improve
support for ranges (currently only strings work, because I rely on
"a.before(b)" to avoid dynamic allocation). Also add support for the
internal subset in the doctype, but that's low priority.

Writer: replace it with a simple template function and a toString
function defined for each token type? Or a writeTo function (to avoid
creating an intermediate string)?
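
To make the toString/writeTo trade-off concrete, here is a minimal
sketch. It is written in Python rather than D, and every name in it
(OpenTag, CloseTag, Text, write_to, write_tokens) is hypothetical and
not taken from the modules linked above. It assumes each token type
knows how to write itself to a sink, so that a string conversion can be
layered on top without the writer ever building per-token strings.

```python
import io
from dataclasses import dataclass
from xml.sax.saxutils import escape


class Token:
    """Hypothetical base class: subclasses implement write_to(sink)."""

    def __str__(self):
        # The toString-style conversion is built on write_to, not the reverse.
        buf = io.StringIO()
        self.write_to(buf)
        return buf.getvalue()


@dataclass
class OpenTag(Token):
    name: str

    def write_to(self, sink):
        sink.write("<")
        sink.write(self.name)
        sink.write(">")


@dataclass
class CloseTag(Token):
    name: str

    def write_to(self, sink):
        sink.write("</")
        sink.write(self.name)
        sink.write(">")


@dataclass
class Text(Token):
    value: str

    def write_to(self, sink):
        # escape() replaces &, < and > with their entity references.
        sink.write(escape(self.value))


def write_tokens(tokens, sink):
    """Write every token straight to the sink, with no per-token string."""
    for tok in tokens:
        tok.write_to(sink)
```

With this shape, write_tokens(tokens, some_file) streams directly to the
output, while str(token) remains available for callers that do want a
string.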

XMLForwardRange: allow a template parameter specifying the token types
you want to see, skipping all others. This could be done by passing a
custom Algebraic type instead of the provided one, which can contain all
tokens.
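
The filtering idea can be sketched as follows. This is Python, not D,
and it is only an illustration: the token classes and the filtered
function are hypothetical, and the set of wanted types is passed at run
time here, whereas the proposal above would fix it at compile time
through the Algebraic type parameter.

```python
from dataclasses import dataclass


@dataclass
class OpenTag:
    name: str


@dataclass
class CloseTag:
    name: str


@dataclass
class Text:
    value: str


@dataclass
class Comment:
    value: str


def filtered(tokens, *wanted):
    """Lazily yield only the tokens whose type is one of `wanted`."""
    for tok in tokens:
        if isinstance(tok, wanted):
            yield tok


# Usage: a consumer interested only in element structure never sees
# the text or comment tokens.
stream = [OpenTag("p"), Comment("note"), Text("hi"), CloseTag("p")]
structure = list(filtered(stream, OpenTag, CloseTag))
```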

DOM classes: it's mostly experimental for now.

There's no SAX yet, although it should be trivial to add on top of the
existing callback tokenizer.
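
As a rough illustration of why that layering is straightforward, the
sketch below adapts a token callback to SAX-style handler methods. It
is Python, not D. The handler base class is the one from Python's
standard xml.sax.handler module; everything else (the token kinds,
sax_callback, and the fake_tokenize stand-in that emits a fixed token
sequence) is an assumption made for the example and does not reflect
the actual tokenizer interface.

```python
from xml.sax.handler import ContentHandler


def sax_callback(handler):
    """Return a token callback that forwards to SAX-style handler methods."""
    def on_token(kind, value, attrs=None):
        if kind == "open":
            handler.startElement(value, attrs or {})
        elif kind == "close":
            handler.endElement(value)
        elif kind == "text":
            handler.characters(value)
        # Other token kinds (comments, processing instructions, ...) are
        # ignored in this sketch.
    return on_token


def fake_tokenize(on_token):
    """Stand-in for a callback tokenizer: emits a fixed token sequence."""
    on_token("open", "greeting", {"lang": "en"})
    on_token("text", "hello")
    on_token("close", "greeting")


class Recorder(ContentHandler):
    """Example handler that records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs)))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


recorder = Recorder()
fake_tokenize(sax_callback(recorder))
```

The adapter is just a dispatch on token kind, which is all a SAX layer
needs once the tokenizer already reports tokens through a callback.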


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



