How to simply parse and print the XML with dxml?

jfondren julian.fondren at gmail.com
Thu Sep 9 18:40:53 UTC 2021


On Thursday, 9 September 2021 at 17:17:23 UTC, tastyminerals 
wrote:
> Maybe I missed something obvious in the docs but how can I just 
> parse the XML and print its content?
>
> ```
> import dxml.parser;
>
> auto xml = parseXML!simpleXML(layout);
> xml.map!(e => e.text).join.writeln;
> ```
>
> throws 
> `core.exception.AssertError at ../../../.dub/packages/dxml-0.4.3/dxml/source/dxml/parser.d(1457): text cannot be called with elementStart`.

dxml.parser is a streaming XML parser. The documentation at 
http://jmdavisprog.com/docs/dxml/0.4.0/dxml_parser.html has a 
link to more information about this at the top, behind 'StAX'. 
So when you map over `xml`, you don't get `<a>some text</a>` as 
one unit. You get `<a>`, `some text`, and `</a>` as separate 
entities, in the order they're parsed. The `<a>` there is an 
`elementStart` entity, which has no `text`, so calling `text` on 
it trips the assertion you're seeing.
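
To make the entity-by-entity view concrete, here is a minimal sketch that records what the parser yields for a one-element document. The helper name `describeEntities` and the sample input are made up for illustration. It only uses `name` on start/end entities and `text` on text entities, which is my understanding of when those properties are valid.

```d
#! /usr/bin/env dub
/++ dub.sdl:
    dependency "dxml" version="0.4.0"
+/
import dxml.parser;
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper (not part of dxml): one description per parsed entity.
string[] describeEntities(string xml) {
    string[] lines;
    foreach (entity; parseXML!simpleXML(xml)) {
        switch (entity.type) {
            case EntityType.elementStart:
                // A start tag has a name but no text.
                lines ~= format("elementStart %s", entity.name);
                break;
            case EntityType.text:
                // Only here is it safe to ask for text.
                lines ~= format("text %s", entity.text);
                break;
            case EntityType.elementEnd:
                lines ~= format("elementEnd %s", entity.name);
                break;
            default:
                // Other entity types are ignored in this sketch.
                break;
        }
    }
    return lines;
}

void main() {
    // Sample input chosen for illustration only.
    foreach (line; describeEntities("<a>some text</a>"))
        writeln(line);
}
```

Checking `entity.type` before touching `text` or `name` is the key point; the rest is just bookkeeping.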

Here's a complete single-file dub script that prints only the text entities:

```d
#! /usr/bin/env dub
/++ dub.sdl:
     dependency "dxml" version="0.4.0"
     stringImportPaths "."
+/
import dxml.parser;
import std;

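// The D lexer stops at the __EOF__ token, so the XML after it is
// not compiled. Read this file back as a string import (hence
// stringImportPaths above) and keep only the lines after __EOF__.
enum text = import(__FILE__)
     .splitLines
     .find("__EOF__")
     .drop(1)
     .join("\n");

void main() {
     foreach (entity; parseXML!simpleXML(text)) {
         // Only text entities have a `text`; skip everything else.
         if (entity.type == EntityType.text)
             writeln(entity.text.strip);
     }
}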
__EOF__
<!-- comment -->
<root>
     <foo>some text<whatever/></foo>
     <bar/>
     <baz></baz>
     more text
</root>
```

that runs with this output:

```
some text
more text
```
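
If you'd rather keep the range-pipeline style of the one-liner in the question, the same type check can go into a `filter` ahead of the `map`. This is a sketch under the assumption that the parser's range composes with `std.algorithm` in the usual way; the helper name `textContent` and the sample input are hypothetical.

```d
#! /usr/bin/env dub
/++ dub.sdl:
    dependency "dxml" version="0.4.0"
+/
import dxml.parser;
import std.algorithm : filter, map;
import std.array : array, join;
import std.stdio : writeln;
import std.string : strip;

// Hypothetical helper (not part of dxml): collect the stripped text
// of every text entity, dropping start tags, end tags, and the rest.
string[] textContent(string xml) {
    return parseXML!simpleXML(xml)
        .filter!(e => e.type == EntityType.text) // keep text entities only
        .map!(e => e.text.strip)                 // now `text` is safe to call
        .array;
}

void main() {
    // Sample input chosen for illustration only.
    auto xml = "<root><foo>some text</foo>more text</root>";
    textContent(xml).join("\n").writeln;
}
```

The filter has to come before the map; mapping `e.text` over every entity is what triggered the assertion in the first place.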

