Replacing std.xml
Jonathan M Davis
jmdavisProg at gmx.com
Thu Aug 29 00:47:17 PDT 2013
On Thursday, August 29, 2013 09:25:35 w0rp wrote:
> Hello everybody. I've been wondering, what are the current plans
> to replace std.xml? I'd like to help with the effort to get a
> final XML library in phobos. So, I have a few questions.
Someone needs to step forward, write it, and get it through the review
process. A while back, someone was working on a possible new version of
std.xml, but they disappeared. No one has stepped up since. I'd love to do it
if I had time, but I don't. There are probably several others around here in
the same boat, but until someone who has the time and skill does do it, we
won't have a new std.xml.
> First, and most importantly, what do we expect out of a D XML
> library? I'd really like to have a discussion of the form, "Here
> is exactly the interface the structs/classes need to implement,
> go forth and implement."
Except that that's really the task of the person creating the new std.xml.
Generally what happens is that the person writing the module comes up with an
API and then presents it rather than asking others to come up with ideas to
design it for them. Obviously, ideas can be discussed, but design-by-committee
is arguably a bad idea. And it just works better to have a concrete design to
discuss.
> The general idea in my mind is
> "something SAX-like, with something a little DOM-like."
What I personally think would be best is to have multiple parsers. First you
have something StAX-like (or maybe even lower level - I don't recall exactly
what StAX gives you at the moment) that basically tokenizes the XML and
returns a range of that. Then SAX and DOM parsers can be built on top of that.
That way, you get the fastest parser possible as well as higher level, more
functional parsers.
But two of the biggest points of the design are that it's going to have to be
range-based, and it's going to need to be able to take full advantage of
slices (when used with any strings or random-access ranges) in order to avoid
copying any of the data. That's the key design point which will allow a D
parser to be extremely fast in comparison to parsers in most other languages.
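
To make the layering concrete, here is a minimal sketch of the idea, not a proposed API. The names Token, TokenKind, Tokenizer and parseSax are all hypothetical, and the tokenizer ignores attributes, comments, CDATA, entities and error recovery. It only shows a range of tokens whose data are slices of the input, with a SAX-style layer built as a thin loop on top.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical token kinds; a real tokenizer would need many more.
enum TokenKind { startTag, endTag, text }

struct Token
{
    TokenKind kind;
    string data; // a slice of the original input, never a copy
}

// Hypothetical pull tokenizer exposed as an input range of Token.
struct Tokenizer
{
    private string input;
    private Token current;
    private bool done;

    this(string xml)
    {
        input = xml;
        popFront(); // prime the first token
    }

    @property bool empty() { return done; }
    @property Token front() { return current; }

    void popFront()
    {
        if (input.length == 0)
        {
            done = true;
            return;
        }
        if (input[0] == '<')
        {
            immutable close = input.indexOf('>');
            assert(close > 0, "unterminated tag");
            if (input[1] == '/')
                current = Token(TokenKind.endTag, input[2 .. close]);
            else
                current = Token(TokenKind.startTag, input[1 .. close]);
            input = input[close + 1 .. $];
        }
        else
        {
            // Text runs until the next '<' or the end of the input.
            immutable next = input.indexOf('<');
            immutable size_t stop = next < 0 ? input.length : cast(size_t) next;
            current = Token(TokenKind.text, input[0 .. stop]);
            input = input[stop .. $];
        }
    }
}

// A SAX-style layer is then just a loop over any range of tokens.
void parseSax(R)(R tokens,
                 void delegate(string) onStart,
                 void delegate(string) onText,
                 void delegate(string) onEnd)
{
    foreach (tok; tokens)
    {
        final switch (tok.kind)
        {
            case TokenKind.startTag: onStart(tok.data); break;
            case TokenKind.text:     onText(tok.data);  break;
            case TokenKind.endTag:   onEnd(tok.data);   break;
        }
    }
}

void main()
{
    string xml = "<a>hi</a>";

    auto t = Tokenizer(xml);
    assert(t.front.kind == TokenKind.startTag);
    assert(t.front.data == "a");
    // The tag name points into xml itself, so nothing was copied.
    assert(t.front.data.ptr == xml.ptr + 1);

    size_t events;
    parseSax(Tokenizer(xml),
             (string name) { ++events; writeln("start: ", name); },
             (string text) { ++events; writeln("text: ", text); },
             (string name) { ++events; writeln("end: ", name); });
    assert(events == 3);
}
```

A DOM builder could consume the same token range in the same way, which is the point of putting the tokenizer at the bottom.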
> I'm aware
> that std.xml has some issues supporting different encodings, so
> obviously that's included.
Personally, I would have just said use ranges of dchar and be done with it
without worrying about character encodings at all, but I don't remember what
all the XML standard does with encodings.
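
As a small illustration of what "ranges of dchar" means here: when a D string is treated as a range by Phobos, its element type is dchar (decoded code points), even though the underlying storage is UTF-8 code units. The sample string is just an assumption for the demo.

```d
import std.range : ElementType, walkLength;

void main()
{
    string s = "<\u00E9/>"; // '<', e-acute, '/', '>'

    // As a range, a string yields decoded code points.
    static assert(is(ElementType!string == dchar));

    assert(s.walkLength == 4); // code points seen through the range API
    assert(s.length == 5);     // UTF-8 code units in the underlying array
}
```

Slicing still operates on the code units, so a parser can iterate by dchar and hand out slices of the original string.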
> Second, is there an existing library that has gotten close to
> meeting whatever we need for the first point? If so, how far away
> is it from being able to meet all of the requirements and become
> the standard library version?
There are several D XML libraries floating around, but no one has taken the
time to get any of them prepared for the Phobos review queue, and I suspect
that very few of them are range-based like the Phobos XML solution needs to
be, but I don't know.
- Jonathan M Davis