Phobos Proposal: replace std.xml with kxml.
Michel Fortin
michel.fortin at michelf.com
Tue May 4 14:56:33 PDT 2010
On 2010-05-04 12:09:29 -0400, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> said:
> Graham Fawcett wrote:
>> By "adapt" do you mean writing a wrapper for an existing library, or
>> translating the source code of the library into D?
>> What constitutes a "generous license" in this context? (For what it's
>> worth, libxml2 is under the MIT License.)
>>
>> Graham
>
> We'd need to modify the code. I haven't looked into available xml
> libraries so I don't know which would be eligible.
I think if you wanted to port an XML library to make use of ranges, the
only viable option is probably to find one based on C++ iterators.
Otherwise it'll look more like a rewrite than a port, and at this point
why not write one from scratch?
Anyway, just in case, would you be interested in an XML tokenizer and
simple DOM following this model?
http://michelf.com/docs/d/mfr/xmltok.html
http://michelf.com/docs/d/mfr/xml.html
At the base is a pull parser and an event parser mixed in the same
function template: "tokenize", allowing you to alternate between
event-based and pull parsing at will. I'm using it, but its development
is on hold at this time; I'm just maintaining it so it compiles on the
newest versions of DMD.
The only thing it doesn't parse at this time is inline DTDs inside the doctype.
Also, it currently works only with strings, for simplicity and
performance. There is one issue with non-string parsing: when parsing
a string, it's easy to just slice the string and pass the slice around,
but if you're parsing from a generic input range, you basically have to
copy characters one by one, which is much less efficient. So ideally the
algorithm should use slices whenever it can (when the input is a
string).
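To make the cost difference concrete, here is a minimal sketch. It is not
code from mfr.xmltok: "takeUntil" is a hypothetical helper name, and the
imports assume a current Phobos. It reads a token up to a stop character,
slicing when the input is a string and copying otherwise.

```d
import std.array : appender;
import std.range : isInputRange, empty, front, popFront;

/// Hypothetical helper: consume characters up to (not including) `stop`.
string takeUntil(R)(ref R input, char stop)
    if (isInputRange!R)
{
    static if (is(R == string))
    {
        // String input: scan code units, then return a slice of the
        // original buffer. Nothing is allocated or copied.
        // (Assumes `stop` is ASCII, so a byte-wise scan is safe in UTF-8.)
        size_t i = 0;
        while (i < input.length && input[i] != stop)
            ++i;
        string token = input[0 .. i];
        input = input[i .. $];
        return token;
    }
    else
    {
        // Generic input range: there is no buffer to slice, so every
        // character is copied into a freshly allocated string.
        auto buf = appender!string();
        while (!input.empty && input.front != stop)
        {
            buf.put(input.front);
            input.popFront();
        }
        return buf.data;
    }
}
```

Both branches give the same result to the caller; only the string branch
avoids the per-character copy.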
I'm not sure yet how to attack this problem, but I'm thinking that
perhaps parsing primitives should be "part of" the range interface. I
say this in the sense that a range should provide specialized
implementations of primitives when it can implement them more efficiently
(for instance by slicing). You wrote a while ago about designing parsing
primitives; is this part of Phobos now?
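One way this could look, purely as a sketch: a generic primitive checks
at compile time whether the range supplies its own faster version and
falls back to element-by-element iteration otherwise. The names
"skipSpaces", "fastSkipSpaces", "isXmlSpace" and "SliceSource" are all
invented for illustration; nothing like them is claimed to exist in
Phobos.

```d
import std.range : isInputRange, empty, front, popFront;

/// Whitespace characters as defined by XML 1.0.
bool isXmlSpace(dchar c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/// Hypothetical parsing primitive: skip leading whitespace.
void skipSpaces(R)(ref R input)
    if (isInputRange!R)
{
    static if (__traits(hasMember, R, "fastSkipSpaces"))
    {
        // The range provides a specialized implementation; use it.
        input.fastSkipSpaces();
    }
    else
    {
        // Generic fallback: advance one element at a time.
        while (!input.empty && isXmlSpace(input.front))
            input.popFront();
    }
}

/// Hypothetical sliceable source that specializes the primitive.
struct SliceSource
{
    string data;

    @property bool empty() const { return data.length == 0; }
    @property char front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }

    /// Specialized primitive: find the end of the whitespace run,
    /// then drop it with a single slice.
    void fastSkipSpaces()
    {
        size_t i = 0;
        while (i < data.length && isXmlSpace(data[i]))
            ++i;
        data = data[i .. $];
    }
}
```

A parser written against skipSpaces would then work with any input
range, and ranges that can slice get the faster path without the parser
knowing about it.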
Anyway, the problem above is probably the one reason we might want to
write the parser from scratch: it needs to bind to specializable
higher-level parsing functions to take advantage of the performance
characteristics of certain ranges, such as those you can slice.
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/