[due diligence] std.xml
sybrandy
sybrandy at gmail.com
Tue Oct 19 13:43:04 PDT 2010
> Well one obvious problem is you have to read the document into memory
> first, which clearly isn't good enough for large documents.
I think that depends on the type of XML library we create. A SAX
library doesn't require the whole document in memory, whereas a DOM
library typically does because, from what I can tell, it creates a
tree-like in-memory representation. If you don't read the document into
memory, I'm not really sure how you could, for example, run XPath
queries against scattered nodes that are not grouped together in a
relatively efficient manner. I say relatively because, yes, the memory
layout can be very scattered, but that's still better than having to
perform random access from disk.
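
To make the streaming side of that trade-off concrete, here is a minimal
sketch using Python's standard xml.sax module (Python rather than D, and
not std.xml). The handler name TitleCollector, the helper stream_titles
and the sample document are made up for illustration. The parser pushes
events to the handler as it reads, so only the text currently being
collected is held in memory, never the whole tree.

```python
import io
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Hypothetical handler: collects the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buf = []

    def characters(self, content):
        # characters() may be called several times for one text run,
        # so the pieces are buffered and joined at the end tag.
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buf))
            self._in_title = False


def stream_titles(source):
    """Parse a filename or file-like object and return the titles seen."""
    handler = TitleCollector()
    xml.sax.parse(source, handler)
    return handler.titles


SAMPLE = (b"<library><book><title>A</title></book>"
          b"<book><title>B</title></book></library>")
titles = stream_titles(io.BytesIO(SAMPLE))
```

The cost is the one described above: the handler sees each node once, in
document order, so anything resembling an XPath query over scattered
nodes has to be hand-coded as state inside the handler.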
I guess one question we need to ask is what do we expect from this
library? Do we want a full DOM implementation or is a SAX parser good
enough? Or do we need something in between? In PHP or Perl (perhaps
both), I saw a library that essentially transformed an XML document
into nested associative arrays. It made it very easy to read data from
the XML, but I don't know how much of the official standards it
complied with.
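
As a rough illustration of the nested-associative-array idea, here is a
sketch in Python using the standard xml.etree.ElementTree module. It is
not the PHP or Perl library mentioned above; the function name
element_to_dict and its rules are assumptions chosen for brevity: leaf
elements become strings, repeated tags become lists, and attributes and
mixed content are dropped. That last point shows how easily this kind of
convenience loses parts of what the XML standards allow.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Hypothetical helper: turn an element into nested dicts/lists/strings."""
    children = list(elem)
    if not children:
        # Leaf element: keep only its text (attributes are ignored).
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated tag: promote the existing entry to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


doc = ET.fromstring(
    "<config><db><host>localhost</host><port>5432</port></db>"
    "<user>a</user><user>b</user></config>"
)
data = element_to_dict(doc)
host = data["db"]["host"]
```

Reading a value is then plain key lookup, with no separate API to learn.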
The current std.xml looks like it tries to be both a DOM library and a
SAX library. Personally, I'd rather break them up into two libraries,
though it may make sense for the DOM library to leverage the SAX library
to build up its objects.
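
Layering a tree on top of an event parser might look roughly like the
sketch below. It again uses Python's standard xml.sax rather than D; the
Node and TreeBuilder classes and the build_tree helper are hypothetical
names, and the result is a bare-bones tree, not a conforming DOM. A
stack tracks the currently open elements, and each start event attaches
a new node to whatever is on top.

```python
import io
import xml.sax


class Node:
    """Hypothetical minimal tree node (far short of a real DOM node)."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs.items())
        self.children = []
        self.text = ""


class TreeBuilder(xml.sax.ContentHandler):
    """Builds a Node tree from SAX events using a stack of open elements."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []

    def startElement(self, name, attrs):
        node = Node(name, attrs)
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, content):
        # Text is attached to the innermost open element.
        if self._stack:
            self._stack[-1].text += content

    def endElement(self, name):
        self._stack.pop()


def build_tree(source):
    """Parse a filename or file-like object and return the root Node."""
    builder = TreeBuilder()
    xml.sax.parse(source, builder)
    return builder.root


root = build_tree(io.BytesIO(b'<a id="1"><b>x</b><c/></a>'))
```

With this split, callers who only need streaming never pay for the tree,
and the tree code contains no parsing logic of its own.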
IMHO, I love a good SAX parser. I've used them in the past and they
work great, so I think having one in D would be ideal, especially in
those situations where the XML file is essentially read-only.
Do we need a DOM parser? I honestly don't know. Personally, I'd be
happy with the associative array approach as it's simple. I don't need
to learn a new API just to navigate through XML. Yes, I know there are
advantages to using the DOM and XPath, which I also like, but for the
most part, I don't need either.
Of course, I personally would love to just let XML die and use better
data formats, but that's an unrealistic dream :)
Casey