My Kingdom For ...

Janice Caron caron800 at googlemail.com
Thu Feb 21 23:27:22 PST 2008


On 22/02/2008, Derek Parnell <derek at nomail.afraid.org> wrote:
>  > Have you seen std.xml?
>
> Which by the way is quite nice code and works well. (I was wondering how it
>  can be used to generate a DTD file from an XML document though.)

Right now, it can't. That's a "future direction" for it. Basically,
right now, it can't parse inside so-called "XML instructions" - those
weird things that you see in the prolog that start with "<!", like
<!DOCTYPE...> and <!ATTLIST...> and <!ENTITY...>. Currently, it treats
those things like opaque blobs and doesn't bother to look inside them.
DTDs (which are specified inside the <!DOCTYPE...> instruction) are a
whole level of complexity above and beyond XML itself - not to
mention now pretty much obsolete, having largely been replaced by XML
Schema.

There are quite a few things I want to add to that module in time,
including (in no particular order) full support for XML namespaces,
DTDs and XML schemas (both full validation and creation thereof), XSL
style sheets, and so on. But I think the next thing I want to tackle
with regard to XML is the encoding. If a document starts with <?xml
version="1.0" encoding="ISO-8859-1"?> then we're screwed, because D
only handles Unicode. Even reading in the document is hard! Tackling
that is next on the list, to be (shortly) followed by namespaces, and
probably the more tricky stuff some time after that.
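
For ISO-8859-1 in particular, the conversion itself is the easy part,
because each byte value is the Unicode code point of the same number.
A minimal sketch, assuming the raw bytes are already in memory;
latin1ToUtf8 is a hypothetical helper, not something std.xml provides:

```d
// Hypothetical helper (not part of std.xml): transcode ISO-8859-1 bytes
// into a UTF-8 string.
string latin1ToUtf8(const(ubyte)[] input)
{
    char[] result;
    foreach (ubyte b; input)
    {
        // Each Latin-1 byte is the Unicode code point with the same value.
        // Appending a dchar to a char[] encodes it as UTF-8.
        result ~= cast(dchar) b;
    }
    return result.idup;
}
```

Other single-byte encodings would need a lookup table rather than a
plain cast, so this only covers the one case mentioned above.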

But even without those things, it's a neat module to use. I was using
it before it went public, and the really nice thing about the parser
is that it makes use of closures. The handlers you give it can all use
anonymous delegates, declared inline, and they can refer to variables
in the enclosing scope, and that is something that I just find
/amazing/, and cheers to Walter for that! It's really that that makes
the parser such a joy to use, compared with, say, Expat. (Expat is
written in C, so obviously it can't have closures).
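
To show the general shape of that, here is a rough sketch. TinyParser
and Event are hypothetical stand-ins invented for illustration, not the
std.xml API, and the "parser" is just fed ready-made events instead of
real XML. The point is only the inline delegate capturing a local:

```d
// Hypothetical stand-ins for illustration only; this is NOT std.xml.
struct Event
{
    string tag;   // element name
    string text;  // text content of the element
}

struct TinyParser
{
    // Handlers keyed by tag name, called with the element's text.
    void delegate(string text)[string] onEndTag;

    // Dispatch ready-made events to whichever handlers are registered.
    void parse(Event[] events)
    {
        foreach (ev; events)
        {
            if (auto handler = ev.tag in onEndTag)
                (*handler)(ev.text);
        }
    }
}

string[] collectTitles(Event[] events)
{
    string[] titles;  // local variable in the enclosing scope
    TinyParser p;
    // Anonymous delegate declared inline; it captures `titles`.
    p.onEndTag["title"] = (string text) { titles ~= text; };
    p.parse(events);
    return titles;
}
```

With a C-style callback API you would have to pass the equivalent of
titles around by hand through a user-data pointer.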

The module could really do with a proper tutorial, and maybe I'll
write one soon, but I'm hampered because all the auto-generated docs
have to come from ddoc comments and it's a bit limiting what you can
do with that (and sometimes it does the most bizarrely unexpected
things). I'd love to be able to put HTML into a ddoc ... Wait -
there's an idea for a feature request! Hmm... another thread coming
up!

Anyway, while we're venturing far into the realms of topic drift,
might I also mention that you should all look at where std.algorithm
is at now. That has really, really matured with D2.011 and it is now
absolutely fantastic. Thanks, Andrei.


