The XML module in Phobos

Daniel Keep daniel.keep.lists at gmail.com
Thu Jul 30 21:16:43 PDT 2009



> Andrei Alexandrescu wrote:
>> It would be great if you could contribute to Phobos. Two things I hope
>> from any replacement (a) works with ranges and ideally outputs ranges,
>> (b) uses alias functions instead of delegates if necessary.

There's really only one sane way to map XML parsing to ranges: pull
parsing, which is more or less already a range.  For those unfamiliar
with it, this is how you use Tango's pull parser right now:

auto pp = new PullParser!(char)(xmlSource);

for( auto tt = pp.next; tt != XmlTokenType.Done; tt = pp.next )
{
    switch( tt )
    {
        case XmlTokenType.Attribute: ... break;
        case XmlTokenType.CData: ... break;
        case XmlTokenType.Comment: ... break;
        case XmlTokenType.Data: ... break;
        ...
        case XmlTokenType.StartElement: ... break;
        default: assert(false, "wtf?");
    }
}

This would fairly naturally map to a range of parsing events and look
something like:

foreach( event ; new PullParser!(char)(xmlSource) )
{
    switch( event.type )
    {
        /* again with the cases */
    }
}
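
To make the mapping concrete, here is a minimal sketch in D2 of an input-range adapter over a `next()`-style pull parser. Everything in it (`ToyParser`, `TokenType`, `Token`, `EventRange`) is a hypothetical stand-in invented for illustration, not Tango's or Phobos's API; the "parser" just hands out canned tokens so the example stays self-contained.

```d
import std.stdio : writeln;

// Hypothetical token kinds; a real parser would have more (attributes, CDATA, ...).
enum TokenType { startElement, data, endElement, done }

// Hypothetical event record: what kind of token, plus its text.
struct Token
{
    TokenType type;
    string value;
}

// Stand-in for a real pull parser: returns one canned token per call to next(),
// then TokenType.done forever.
struct ToyParser
{
    Token[] canned;

    Token next()
    {
        if (canned.length == 0)
            return Token(TokenType.done, null);
        auto t = canned[0];
        canned = canned[1 .. $];
        return t;
    }
}

// Input-range adapter: caches the current token so that front can be
// read repeatedly without advancing the underlying parser.
struct EventRange
{
    private ToyParser parser;
    private Token current;

    this(ToyParser p)
    {
        parser = p;
        current = parser.next(); // prime the first event
    }

    @property bool empty() { return current.type == TokenType.done; }
    @property Token front() { return current; }
    void popFront() { current = parser.next(); }
}

void main()
{
    auto parser = ToyParser([
        Token(TokenType.startElement, "a"),
        Token(TokenType.data, "hello"),
        Token(TokenType.endElement, "a"),
    ]);

    foreach (event; EventRange(parser))
    {
        switch (event.type)
        {
            case TokenType.startElement: writeln("<", event.value, ">"); break;
            case TokenType.data:         writeln(event.value); break;
            case TokenType.endElement:   writeln("</", event.value, ">"); break;
            default: assert(false, "unexpected token");
        }
    }
}
```

The adapter is only three members (`empty`, `front`, `popFront`) on top of the existing `next()` loop, which is why pull parsing is "more or less already a range".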

Of course, most people HATE this method because it requires you to write
mountains of boilerplate code.  Pity, then, it's also the fastest and
most flexible.  :P  (It's a pity D doesn't have extension methods since
then you could probably do something along the lines of LINQ to make the
whole thing utterly painless... but then, I've given up on waiting for
that.)

This is basically the only way to map XML parsing to ranges.  As for
CONSUMING ranges, I think that'd be a bad idea for the same reason
basing IO entirely on ranges is a bad idea.

The only other use for ranges I can think of is one already mentioned by
Benji: traversal of a DOM.  Ranges don't apply to SAX because the
range-friendly version of SAX *is* pull parsing. :D

To Andrei: I sometimes worry that your... enthusiasm for ranges is going
to leave us with range-based APIs that don't make any sense or are
horribly slow (IO in particular has me worried).  But then, I suppose
that also makes you the perfect person to figure out where they CAN be used.

Plus, that way it's your fault if it doesn't work out.  :P


