dxml 0.2.0 released

Mon Feb 12 15:50:12 UTC 2018

On Monday, February 12, 2018 15:26:24 rikki cattermole via Digitalmars-d-
announce wrote:
> All J.M.D. has to do to change this, is make the API match the spec (as
> close as possible, without writing another parser) and separate out the
> implementation into a different and very clear module (probably a sub
> package) which states clearly that it is a subset with the full grammar
> listed that it supports.

That literally cannot be done. dxml returns slices (or takeExactly's) of the
original input. For it to do otherwise would harm performance and usability,
but in order to implement full DTD support, it's impossible to return slices
of the original input in the general case, because you have to be able to
mutate the data whenever entity references get involved. If the API were
entirely string-based, then whether the implementation returned slices or
newly allocated strings could be an implementation detail, but as soon as
you're dealing with arbitrary ranges of characters, that doesn't work. At
that point, you're forced to either return strings for everything (which
means allocating for any ranges that aren't strings) or to return a lazy
range of characters and thus can't return the original type. And that means
that if you pass it a string, you're stuck with a lazy range out the other
end instead of a string, and to get a string again, you have to allocate,
whereas with what I have now, the parser does almost no allocations, and as
long as the input type supports slicing, you get exactly the same type out
the other end, which is a huge usability improvement IMHO.
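
To illustrate the slicing point, here is a minimal sketch in plain D. It does
not use dxml's actual API: sliceText, expandEntity, and the &prod; entity
(imagined to be declared in a DTD as "dxml") are hypothetical names made up
for this example. It only shows that text without entity references can be
handed back as a slice of the input, whereas expanded text has to live in
newly allocated memory.

```d
import std.array : replace;

// Hypothetical helper (not part of dxml): returns the text between two
// offsets as a slice of the input, so nothing is allocated or copied.
string sliceText(string input, size_t start, size_t end)
{
    return input[start .. end];
}

// Hypothetical helper (not part of dxml): expands one entity reference.
// The result differs from the input text, so it cannot be a slice of it.
string expandEntity(string text, string name, string value)
{
    return text.replace("&" ~ name ~ ";", value);
}

void main()
{
    string input = "<p>plain</p><p>a &prod; b</p>";

    // No entity reference: the result points into the original input.
    string plain = sliceText(input, 3, 8);
    assert(plain == "plain");
    assert(plain.ptr == input.ptr + 3);

    // Entity reference: the unexpanded text is still a slice...
    string raw = sliceText(input, 15, 25);
    assert(raw == "a &prod; b");

    // ...but the expanded text is a new string in separate memory.
    string expanded = expandEntity(raw, "prod", "dxml");
    assert(expanded == "a dxml b");
    assert(expanded.ptr != raw.ptr);
}
```

The same tradeoff applies to ranges that aren't strings, except that there the
expanded result can't even have the same type as the input unless the API
always returns something else, such as a string or a lazy wrapper range.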

So, you can't have DTD support with the kind of API that dxml has, and
changing the API to something that could work with DTD support would harm
the parser for all of the cases where DTD support is unnecessary.

Even if I were going to implement full DTD support, I would do it with
another parser, not change the parser that dxml already has. And if dxml
ends up in Phobos with the parser that it has, that doesn't prevent another
parser from being added for the DTD case later if someone actually decides
to put in the time and effort to do it. Either way, for any XML document
that doesn't need DTD support, the way that dxml does things is more
efficient and user-friendly than a parser with DTD support would be, though
that obviously doesn't help with documents that do need DTD support.

In any case, I'm going to finish implementing dxml without any kind of DTD
support and then see how the Phobos review process goes.
If dxml gets rejected, because the majority of folks think that we're better
off with std.xml (or no XML parser at all in Phobos) than one that doesn't
have DTD support, then oh well. That sucks, but anyone who wants dxml can
then use it as a third-party library. I think that the D community would be
worse off because of that, but it's not ultimately my decision to make, and
either way, I have the parser that I need.

- Jonathan M Davis