dxml 0.2.0 released

H. S. Teoh hsteoh at quickfur.ath.cx
Mon Feb 12 15:59:24 UTC 2018


On Mon, Feb 12, 2018 at 07:04:38AM -0700, Jonathan M Davis via Digitalmars-d-announce wrote:
[...]
> However, if folks as a whole think that Phobos' xml parser needs to
> support the DTD section to be acceptable, then dxml won't replace
> std.xml, because dxml is not going to implement DTD support. DTD
> support fundamentally does not fit in with dxml's design.

Actually, thinking about this, I'm wondering if a combination of
preprocessing and/or postprocessing might make it possible to implement
DTD support without needing to rewrite the guts of dxml. AIUI, dxml does
parse the DTD section correctly, i.e., as an XML directive; it just
doesn't look into its internal details. So one way to implement DTD
support might be:

- Write an auxiliary parser that's basically a wrapper around dxml,
  forwarding XML events to the caller, except:
- If a DTD event is encountered, eagerly parse it and store its
  declarations internally for future reference.
- If a DTD has been seen, perform on-the-fly validation as XML events
  are forwarded.
- In PCDATA sections, if there are references to entities declared in
  the DTD, expand them, possibly inserting more XML events into the
  stream based on what's defined in the DTD. (This may need to reuse
  some dxml internals to parse XML snippets that might be contained in
  an entity definition, for example.)
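
To make the wrapper idea above concrete, here is a rough sketch in Python
rather than D, so it does not use dxml's actual API. The event tuples
("doctype", "start", "text", "end") and the names parse_entity_decls and
expand_entities are all made up for illustration. It only handles internal
general entities with plain-text values; validation, parameter entities,
external entities, and entity values that contain markup are left out.

```python
import re

# Matches internal general entity declarations such as <!ENTITY name "value">.
# Parameter entities (<!ENTITY % ...>) and external entities are not matched.
_ENTITY_DECL = re.compile(
    r"""<!ENTITY\s+([A-Za-z_][\w.\-]*)\s+(?:"([^"]*)"|'([^']*)')\s*>"""
)
_ENTITY_REF = re.compile(r"&([A-Za-z_][\w.\-]*);")


def parse_entity_decls(doctype_text):
    """Collect name -> replacement text from the raw text of a DOCTYPE."""
    decls = {}
    for m in _ENTITY_DECL.finditer(doctype_text):
        value = m.group(2) if m.group(2) is not None else m.group(3)
        # In XML, the first declaration of an entity is the binding one.
        decls.setdefault(m.group(1), value)
    return decls


def expand_entities(events):
    """Forward (kind, data) events, expanding declared entities in text.

    The event shape is an assumption of this sketch, not what any real
    parser emits.
    """
    entities = {}
    for kind, data in events:
        if kind == "doctype":
            # Eagerly parse the DTD and remember its declarations.
            entities = parse_entity_decls(data)
        elif kind == "text":
            # Undeclared references (including &amp; etc.) are left as-is.
            data = _ENTITY_REF.sub(
                lambda m: entities.get(m.group(1), m.group(0)), data
            )
        yield kind, data


if __name__ == "__main__":
    stream = [
        ("doctype", '<!DOCTYPE note [ <!ENTITY org "Example Corp"> ]>'),
        ("start", "note"),
        ("text", "From &org; &amp; friends"),
        ("end", "note"),
    ]
    for event in expand_entities(stream):
        print(event)
```

Because the wrapper is just a generator over the underlying event stream,
the inner parser stays untouched, which is the point of the proposal. An
entity whose value contains markup would need the replacement text fed back
through the parser instead of being spliced into a text event.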


[...]
> However, std.xml does not support the DTD section, and glancing over
> it, it doesn't look like it even handles skipping the DTD section
> properly (it doesn't handle the fact that '>' can appear within quoted
> sections within the DTD). So, dxml is not worse than std.xml in that
> regard, and we wouldn't lose any functionality by having dxml replace
> std.xml. It just wouldn't necessarily do as much as some folks might
> like.
[...]
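
For anyone wondering what the quoted-'>' problem looks like in practice,
here is a small illustrative sketch in Python (not std.xml's or dxml's
actual code; find_doctype_end is a made-up name). It skips a DOCTYPE by
tracking quote state and '<'/'>' nesting instead of stopping at the first
'>'. Comments and processing instructions inside the DTD are not handled
here, and a real parser would need to handle them too.

```python
def find_doctype_end(text, start=0):
    """Return the index just past the '>' closing the DOCTYPE at `start`."""
    if not text.startswith("<!DOCTYPE", start):
        raise ValueError("no DOCTYPE declaration at this position")
    quote = None  # the quote character we are currently inside, if any
    depth = 0     # nesting of '<' ... '>' (the DOCTYPE itself counts as 1)
    for i in range(start, len(text)):
        c = text[i]
        if quote is not None:
            # Inside a quoted literal, only the matching quote matters.
            if c == quote:
                quote = None
        elif c in "\"'":
            quote = c
        elif c == "<":
            depth += 1
        elif c == ">":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unterminated DOCTYPE declaration")


if __name__ == "__main__":
    doc = '<!DOCTYPE r [ <!ENTITY e "a > b"> ]><r/>'
    naive_end = doc.index(">") + 1  # stops inside the quoted entity value
    print(repr(doc[naive_end:]))
    print(repr(doc[find_doctype_end(doc):]))
```

The naive search stops at the '>' inside the entity value and leaves the
rest of the DTD to be misread as document content, while the quote-aware
scan resumes at the root element.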

If std.xml currently does not support DTDs, then I say dxml is
definitely a Phobos candidate.  At the very least, it does not make the
current situation worse.  Rejecting dxml because it doesn't support DTDs
is basically letting the perfect be the enemy of the good, which is
something this community has been plagued with for far too long.  What's
worse: a std.dxml that doesn't support DTDs, or a std.xml with
fundamental problems that continue to plague us for the next decade
while nobody else steps up to implement a suitable replacement?


T

-- 
Ph.D. = Permanent head Damage

