How to simply parse and print the XML with dxml?

tastyminerals tastyminerals at gmail.com
Fri Sep 10 07:50:29 UTC 2021


On Thursday, 9 September 2021 at 18:40:53 UTC, jfondren wrote:
> On Thursday, 9 September 2021 at 17:17:23 UTC, tastyminerals 
> wrote:
>> [...]
>
> dxml.parser is a streaming XML parser. The documentation at 
> http://jmdavisprog.com/docs/dxml/0.4.0/dxml_parser.html has a 
> link to more information about this at the top, behind 'StAX'. 
> Thus, when you're mapping over `xml`, you're not getting 
> `<a>some text</a>` at a time, but `<a>`, `some text`, and 
> `</a>` separately, as they're parsed. The `<a>` there is an 
> `elementStart` which lacks a `text`, hence the error.
>
> Here's a script:
>
> ```d
> #! /usr/bin/env dub
> /++ dub.sdl:
>     dependency "dxml" version="0.4.0"
>     stringImportPaths "."
> +/
> import dxml.parser;
> import std;
>
> enum text = import(__FILE__)
>     .splitLines
>     .find("__EOF__")
>     .drop(1)
>     .join("\n");
>
> void main() {
>     foreach (entity; parseXML!simpleXML(text)) {
>         if (entity.type == EntityType.text)
>             writeln(entity.text.strip);
>     }
> }
> __EOF__
> <!-- comment -->
> <root>
>     <foo>some text<whatever/></foo>
>     <bar/>
>     <baz></baz>
>     more text
> </root>
> ```
>
> that runs with this output:
>
> ```
> some text
> more text
> ```

Ok, that makes sense now. Thank you.

As for dxml, I believe adding a small quick-start example to its 
documentation would be very beneficial for newcomers, especially 
people like me who are not aware of the different XML parser types 
and just need to extract text from an XML file.
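
For illustration, such a quick start might look something like the 
sketch below. It is adapted from jfondren's script above and assumes 
dxml 0.4.0; `extractText` is a hypothetical helper name, not part of 
dxml. It walks the entity stream and keeps only the text entities, 
since those are the only ones for which `.text` is valid.

```d
/++ dub.sdl:
    dependency "dxml" version="0.4.0"
+/
import dxml.parser : parseXML, simpleXML, EntityType;
import std.stdio : writeln;
import std.string : strip;

// Hypothetical helper (not part of dxml): returns the stripped,
// non-empty text nodes of an XML document in document order.
string[] extractText(string xml)
{
    string[] result;
    foreach (entity; parseXML!simpleXML(xml))
    {
        // The parser yields `<a>`, `some text` and `</a>` as separate
        // entities; only EntityType.text entities carry a `.text`.
        if (entity.type == EntityType.text)
        {
            auto t = entity.text.strip;
            if (t.length)
                result ~= t;
        }
    }
    return result;
}

void main()
{
    writeln(extractText("<root><foo>some text</foo><bar/>more text</root>"));
}
```

Saved as a single file, this should be runnable with something like 
`dub run --single quickstart.d` (the file name is arbitrary).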

