dxml 0.2.0 released
Jonathan M Davis
newsgroup.d at jmdavisprog.com
Tue Feb 13 23:44:51 UTC 2018
On Tuesday, February 13, 2018 14:13:36 H. S. Teoh via Digitalmars-d-announce
wrote:
> Great, just
> great. Now I know why I've always had this gut feeling that
> *something* is off about the whole XML mania.)
Well, there are plenty of folks who talk like XML is a pile of steaming muck
that should never be used (and then usually talk about how great JSON is). I
think that basic XML is actually pretty okay - basically the subset that
dxml supports, though if I were designing XML, I'd take it a bit further.
Personally, I'd make XML documents completely recursive - meaning that the
top level is the same as any deeper level, so you could have as many element
tags at the top level as you want and as much text as you want, whereas XML
requires a root element and only allows stuff like processing instructions,
comments, and the DOCTYPE stuff outside of the root element.
I'd get rid of the <?xml...?> and <!DOCTYPE...> declarations as well as
processing instructions, and I'd probably get rid of the CDATA section in
favor of escaping characters with backslashes like you typically do in
strings (or in JSON), and related to that, I'd get rid of the predefined
entity references, making stuff like & legal. I also might get rid of empty
element tags because they're annoying to deal with when parsing, but they do
reduce the verbosity of the document such that they might be worth keeping.
It's also tempting to get rid of the tag name on end tags, which would
actually make parsing much easier, but having them helps the legibility of
XML documents, and it's a bit like semicolons in D in the sense that they
can help ensure that error messages refer to the right thing rather than
something later in the document, so I don't know. I'd also allow all Unicode
characters instead of disallowing a number of them, since it won't really
matter for most documents, and then the parser doesn't need to care about
them when validating.
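
For reference, here is a small sketch of how a conforming XML parser treats the things listed above. It uses Python's standard xml.etree.ElementTree rather than dxml, purely because it is widely available. The helper name is_well_formed and the sample strings are made up for illustration. Only the single-root document and the one using &amp; should be accepted; multiple top-level elements, text outside the root, a bare &, and a reference to a disallowed control character should all be rejected.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the standard-library parser accepts doc as XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per restriction mentioned above.
samples = {
    "single root": "<root><a/><b/></root>",
    "two top-level elements": "<a/><b/>",
    "text outside the root": "hello <a/>",
    "bare ampersand": "<a>fish & chips</a>",
    "escaped ampersand": "<a>fish &amp; chips</a>",
    "control character reference": "<a>&#1;</a>",
}

for label, doc in samples.items():
    print(f"{label}: {is_well_formed(doc)}")
```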
So, basically, you end up with start tags, end tags, and comments, with
start tags optionally having attributes. Backslashes would then be used for
escaping stuff, and you end up with something pretty dead simple.
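
To give a rough idea of how little such a format would need, here is a minimal tokenizer sketch in Python. This is not dxml and not any real format. The exact syntax is an assumption made for illustration: <name key="value"> start tags, </name> end tags, <!-- ... --> comments, and a backslash that escapes whatever character follows it in text or attribute values. The names Token, tokenize and check_nesting are hypothetical. End tags keep their names here, so mismatches are reported where they occur.

```python
from typing import Iterator, NamedTuple, Tuple


class Token(NamedTuple):
    kind: str    # "start", "end", "comment", or "text"
    value: str   # tag name, comment body, or unescaped text
    attrs: dict  # attributes for start tags, otherwise empty


def _read_escaped(src: str, i: int, stops: str) -> Tuple[str, int]:
    """Read until an unescaped char in stops; a backslash escapes the next char."""
    out = []
    while i < len(src) and src[i] not in stops:
        if src[i] == "\\":
            i += 1
            if i == len(src):
                raise ValueError("dangling backslash")
        out.append(src[i])
        i += 1
    return "".join(out), i


def _start_tag(src: str, i: int) -> Tuple[Token, int]:
    """Parse the name and attributes after '<'; return (token, index past '>')."""
    j = i
    while j < len(src) and not src[j].isspace() and src[j] != ">":
        j += 1
    name, attrs = src[i:j], {}
    if not name:
        raise ValueError("empty tag name")
    while True:
        while j < len(src) and src[j].isspace():
            j += 1
        if j >= len(src):
            raise ValueError("unterminated start tag")
        if src[j] == ">":
            return Token("start", name, attrs), j + 1
        k = j
        while k < len(src) and src[k] not in "=>" and not src[k].isspace():
            k += 1
        if k == j or not src.startswith('="', k):
            raise ValueError("malformed attribute")
        key = src[j:k]
        value, k = _read_escaped(src, k + 2, '"')
        if k >= len(src):
            raise ValueError("unterminated attribute value")
        attrs[key] = value
        j = k + 1  # skip the closing quote


def tokenize(src: str) -> Iterator[Token]:
    """Split src into start tags, end tags, comments, and text."""
    i = 0
    while i < len(src):
        if src.startswith("<!--", i):
            end = src.find("-->", i + 4)
            if end < 0:
                raise ValueError("unterminated comment")
            yield Token("comment", src[i + 4:end], {})
            i = end + 3
        elif src.startswith("</", i):
            end = src.find(">", i)
            if end < 0:
                raise ValueError("unterminated end tag")
            yield Token("end", src[i + 2:end].strip(), {})
            i = end + 1
        elif src[i] == "<":
            tok, i = _start_tag(src, i + 1)
            yield tok
        else:
            text, i = _read_escaped(src, i, "<")
            yield Token("text", text, {})


def check_nesting(tokens) -> None:
    """Raise ValueError if end tags do not match the open start tags."""
    stack = []
    for tok in tokens:
        if tok.kind == "start":
            stack.append(tok.value)
        elif tok.kind == "end":
            if not stack or stack.pop() != tok.value:
                raise ValueError(f"unexpected end tag </{tok.value}>")
    if stack:
        raise ValueError(f"unclosed tag <{stack[-1]}>")


# Several top-level items, a bare &, and backslash escapes are all accepted.
doc = r'<a href="x\"y">1 \< 2 & 3</a> loose text <b></b><!-- note -->'
tokens = list(tokenize(doc))
check_nesting(tokens)
for tok in tokens:
    print(tok)
```

Since the top level is handled exactly like any nested level, there is no special case for a root element, and the only characters the tokenizer has to treat specially are <, > in tags, the double quote in attribute values, and the backslash.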
However, as you're finding out when reading through the XML spec, the folks
who created XML didn't think like that at all, and were clearly coming from
a _very_ different point of view as to what an XML document was for and
should contain. But as you might imagine, given my take on what XML should
have been, finding out in detail what XML actually _is_ was pretty
horrifying.
I started dxml with the intention of fully implementing all aspects of the
spec but ultimately decided that it simply wasn't worth it.
- Jonathan M Davis