The XML module in Phobos
Michel Fortin
michel.fortin at michelf.com
Fri Jul 31 05:48:48 PDT 2009
On 2009-07-30 22:42:29 -0400, Benji Smith <dlanguage at benjismith.net> said:
>> Michael Rynn wrote:
>>> I did look at the code for the xml module, and posted a suggested bug
>>> fix to the empty elements problem. I do not have access rights to
>>> updating the source repository, and at the time was too busy for this.
>
> Andrei Alexandrescu wrote:
>> It would be great if you could contribute to Phobos. Two things I hope
>> from any replacement (a) works with ranges and ideally outputs ranges,
>> (b) uses alias functions instead of delegates if necessary.
>
> Interesting. Most XML parsers either produce a "Document" object, or
> they just execute SAX callbacks. If an XML parser returned a range
> object, how would you use it?
>
> Usually, I use something like XPath to extract information from an XML
> doc. Something like this:
>
> auto doc = parser.parse(xml);
> auto nodes = doc.select("/root//whatever[0][@id]");
>
> I can see how you might do depth-first or breadth-first traversal of
> the DOM tree, or inorder traversal of the SAX events, with a range. But
> that's not how most people use XML. Are there other range tricks up
> your sleeve that would support a DOM or XPath kind of model?
A range is mostly a list of things. In the example above, doc.select
could return a range to lazily evaluate the query instead of computing
the whole query and returning all the elements. This way, if you only
care about the first result you just take the first and don't have to
compute them all.
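To make the lazy-query idea concrete, here is a minimal sketch in D. Everything in it is an assumption made for illustration: `Node`, `Matches` and the `examined` counter are hypothetical names, and a flat array of nodes stands in for a real document and XPath engine. It only shows that a range can defer the search until a result is actually requested.

```d
import std.stdio : writeln;

// Hypothetical node type; a real DOM node would carry much more.
struct Node { string name; string id; }

// Hypothetical lazy query result: an input range over the matching nodes.
struct Matches
{
    Node[] rest;      // nodes not yet consumed
    string wanted;    // element name being searched for
    size_t examined;  // how many nodes have been tested so far

    this(Node[] nodes, string wanted)
    {
        this.rest = nodes;
        this.wanted = wanted;
        skip(); // advance only as far as the first match
    }

    bool empty() const { return rest.length == 0; }
    Node front() { return rest[0]; }
    void popFront() { rest = rest[1 .. $]; skip(); }

    // Move forward until rest[0] is a match or the input runs out.
    private void skip()
    {
        while (rest.length)
        {
            ++examined;
            if (rest[0].name == wanted) break;
            rest = rest[1 .. $];
        }
    }
}

void main()
{
    auto nodes = [Node("root", "1"), Node("whatever", "2"),
                  Node("other", "3"), Node("whatever", "4")];
    auto m = Matches(nodes, "whatever");
    writeln(m.front.id);   // first match only
    writeln(m.examined);   // nodes tested to find it
}
```

Taking only `front` here tests two of the four nodes; the remaining ones are looked at only if the caller keeps calling `popFront`.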
Ranges can be used everywhere there are lists, and are especially
useful for lazy lists that compute things as you go. I made an XML
tokenizer (similar to Tango's pull parser) with a range API. Basically,
you iterate over various kinds of tokens made available through an
Algebraic, and as you advance, it parses the document to get you the
next token. (It'd be more useful if you could switch on the various
kinds of tokens with an Algebraic -- right now you need to use "if
(token.peek!OpenElementToken)" -- but that's a problem with Algebraic
that should get fixed I believe, or else I'll have to use something
else.)
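As a rough sketch of what the peek-based dispatch looks like, assuming token types that are not shown in this message: `OpenElementToken` is the only name taken from the text above, while `CloseElementToken`, `TextToken`, `Token` and `describe` are hypothetical, and a pre-built array stands in for the tokenizer range that would parse lazily.

```d
import std.stdio : writeln;
import std.variant : Algebraic;

// Hypothetical token types for illustration only.
struct OpenElementToken  { string name; }
struct CloseElementToken { string name; }
struct TextToken         { string text; }

alias Token = Algebraic!(OpenElementToken, CloseElementToken, TextToken);

// Test each possible type in turn; peek returns null when the
// Algebraic currently holds a different type.
string describe(Token token)
{
    if (auto open = token.peek!OpenElementToken)
        return "open " ~ open.name;
    if (auto close = token.peek!CloseElementToken)
        return "close " ~ close.name;
    if (auto text = token.peek!TextToken)
        return "text " ~ text.text;
    return "unknown";
}

void main()
{
    // A fixed list stands in for the lazily parsing tokenizer range.
    Token[] tokens = [
        Token(OpenElementToken("root")),
        Token(TextToken("hello")),
        Token(CloseElementToken("root")),
    ];
    foreach (token; tokens)
        writeln(describe(token));
}
```

The chain of `if` tests is the awkward part: each token kind needs its own `peek`, where a `switch` over the kinds would read more naturally.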
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/