dxml 0.2.0 released

Jonathan M Davis newsgroup.d at jmdavisprog.com
Tue Feb 13 23:53:03 UTC 2018


On Tuesday, February 13, 2018 14:29:27 H. S. Teoh via Digitalmars-d-announce 
wrote:
> Given the insane complexities of DTD that I'm only slowly beginning to
> grasp from actually reading the spec, I'm quickly adopting the opinion
> that dxml should remain as-is, and any DTD implementation should be
> layered on top.  The only potential changes that might be needed is:
>
> - provide a way to parse XML snippets that don't have a <?xml ...>
>   declaration, so that a DTD implementation could, for example, hand an
>   entity body over to dxml to extract any tags that may be nested in
>   there (and if my reading of section 4.3.2 is correct, all such tags
>   must always be closed inside the entity body, so there should be no
>   errors produced).

XML 1.0 does not require the <?xml...?> section - which is the main reason
why dxml implements XML 1.0 and not 1.1. When working on one of my projects
with std_experimental_xml, I had to keep adding the <?xml...?> declaration
to the start of XML snippets in all of my tests which had to deal with
sections of an XML document, and it was _really_ annoying.

dxml does require that what it's given be a well-formed XML 1.0 document,
which means that you have to have exactly one root element in what it's
passed. That does limit which kinds of XML snippets you can pass it, but it
will work for a lot of XML snippets as-is.
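
As a rough illustration of the single-root constraint, here is a sketch in
Python. It uses the standard library's xml.etree.ElementTree purely as a
stand-in for any parser that insists on exactly one root element; it is not
dxml's API. The helper name parse_fragment and the wrapper tag "fragment" are
made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_fragment(snippet, wrapper="fragment"):
    """Parse an XML snippet that may have several top-level elements.

    Hypothetical helper: wraps the snippet in a synthetic root element so
    that a parser requiring exactly one root accepts it, then returns the
    snippet's top-level elements as a list.
    """
    root = ET.fromstring("<{0}>{1}</{0}>".format(wrapper, snippet))
    return list(root)


# A document with one root parses fine without any <?xml ...?> declaration.
single = ET.fromstring("<a><b/></a>")

# Two top-level elements are not a document, so parsing them directly fails.
try:
    ET.fromstring("<a/><b/>")
    two_roots_ok = True
except ET.ParseError:
    two_roots_ok = False

# Wrapping the same snippet in a throwaway root makes it parseable.
children = parse_fragment("<a/><b/>")
```

The wrapping trick only works if the snippet does not itself start with an
<?xml...?> declaration, and any text sitting directly at the top level of the
snippet is not part of the returned list.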

> - provide some way of hooking into non-default entities so that
>   DTD-defined entities can be expanded by the DTD implementation.  This
>   could be as simple as leaving such entities untouched in the returned
>   range, or invent a special EntityType representing such entities (with
>   a slice of the input containing the entity name) so that the DTD
>   implementation can insert the replacement text.

After having actually implemented full parsing for the entire DTD section,
only to figure out that references can be inserted in it just about
anywhere and that the grammar in the spec is only the grammar _after_ all of
the replacements have been made (figuring that out was when I gave up on
DTD support), I would strongly argue in favor of simply passing along entity
references as-is and leaving any and all such processing to a DTD-enabled
parser. Originally, the Config had options like SkipDTD and SkipProlog, and
I even provided a way to get at the information in the <?xml...?>
declaration if you wanted it, but all that just wasn't worth the extra
complexity.
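
To make the layering idea concrete, here is a minimal sketch in Python of what
a DTD-aware layer might do with text that a base parser has passed along with
its entity references untouched. None of this is dxml's API: the function
name expand_custom_entities and the dtd_entities table are assumptions made
up for the example, the name pattern is simplified to ASCII, and the
expansion is a single pass (real DTD replacement text can itself contain
references and markup, which this ignores).

```python
import re

# The five entities predefined by XML itself; these are left alone here on
# the assumption that the base parser or its helpers already handle them.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Simplified, ASCII-only pattern for a named entity reference such as &foo;
# Character references like &#38; do not match because '#' cannot start a name.
_ENTITY_REF = re.compile(r"&([A-Za-z_:][A-Za-z0-9_.:-]*);")


def expand_custom_entities(text, dtd_entities):
    """Replace references to DTD-declared entities; leave everything else as-is."""
    def replace(match):
        name = match.group(1)
        if name in PREDEFINED or name not in dtd_entities:
            return match.group(0)  # pass the reference along unchanged
        return dtd_entities[name]

    return _ENTITY_REF.sub(replace, text)


# Hypothetical table that a DTD layer would have built from <!ENTITY ...> declarations.
expanded = expand_custom_entities("&copy; 2018 &amp; &unknown;", {"copy": "(c)"})
```

Here only &copy; is replaced; &amp; and the undeclared &unknown; come back
unchanged, which is the same "pass it along and let the next layer decide"
behavior argued for above.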

- Jonathan M Davis


