Recommendations on parsing XML via an InputRange

Steven Schveighoffer schveiguy at gmail.com
Tue Sep 14 12:21:53 UTC 2021


On 9/13/21 10:43 PM, Chris Piker wrote:
> Hi D
> 
> I just finished a ~1K line project using `dxml` as the XML reader for my 
> data streams.  It works well in my test examples using memory mapped 
> files, but like an impulse shopper I didn't notice that dxml requires 
> `ForwardRange` objects.  That's unfortunate, because my next enhancement 
> was to start parsing streams as they come in from stdin. (doh!)
> 
> So I've learned my lesson and will RTFM closer next time, but now I'm 
> casting about for a solution.  Two ideas, either:
> 
> 1. Find a different StAX-ish parser that works with `InputRange` (and 
> buffers internally a bit if needed), or
> 
> 2. Find a way to represent standard input as a ForwardRange without 
> saving the whole stream in memory. (iopipe?)

Iopipe is no better than an input range unless you plan to read the 
whole stream into a buffer.

dxml requires a forward range because it uses saved ranges (copies made 
with `save`) to refer back to data it has already parsed. For a stream, 
that means the whole thing has to be stored in memory.
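
To make the input/forward distinction concrete, here is a small 
compile-time check using Phobos range traits. It is only a sketch: the 
alias name `Chunks` and the 4096 chunk size are illustrative choices, and 
`byChunk` stands in for whatever stream range you would read stdin with.

```d
import std.range.primitives : isForwardRange, isInputRange;
import std.stdio : File;

// A string is a forward range: `save` just copies the slice, which is
// cheap because all of the data is already in memory.
static assert(isForwardRange!string);

// The chunk range over a File (stdin included) is only an input range.
// It has no `save`, so a consumer cannot keep a position to return to.
alias Chunks = typeof(File.init.byChunk(4096)); // illustrative alias
static assert(isInputRange!Chunks);
static assert(!isForwardRange!Chunks);
```

If this compiles (for example with `dmd -main`), the asserts hold.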

I've thought of building an xml parser on top of iopipe, and I probably 
will some day (maybe a port of dxml). The iopipejson library does not 
require the whole thing to be in memory, and has some facilities to pin 
parsed data so you can jump back to it. I imagine something like that is 
doable for XML, though it would probably only need to store the current 
element ancestry while parsing (off to the side in a separate stack-like 
structure).
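
A rough sketch of what "storing element ancestry off to the side" could 
look like. Everything here (`EventType`, `Event`, `Ancestry`, `feed`, 
`path`) is hypothetical and is not the API of dxml, iopipe, or 
iopipejson; it only shows that the extra state grows with nesting depth 
rather than with stream length.

```d
import std.array : join;

// Hypothetical event kinds a streaming parser might emit.
enum EventType { open, close }

struct Event
{
    EventType type;
    string name;
}

// Keeps only the names of the elements that are currently open.
struct Ancestry
{
    string[] stack;

    void feed(Event ev)
    {
        final switch (ev.type)
        {
        case EventType.open:
            stack ~= ev.name;
            break;
        case EventType.close:
            assert(stack.length > 0 && stack[$ - 1] == ev.name,
                   "mismatched end tag");
            stack = stack[0 .. $ - 1];
            break;
        }
    }

    // Path of the current element, e.g. "/root/item".
    string path()
    {
        return "/" ~ stack.join("/");
    }
}

unittest
{
    Ancestry a;
    a.feed(Event(EventType.open, "root"));
    a.feed(Event(EventType.open, "item"));
    assert(a.path() == "/root/item");
    a.feed(Event(EventType.close, "item"));
    assert(a.path() == "/root");
}
```

Building with `dmd -unittest -main` runs the check. A real parser would 
also have to copy each name out of the stream buffer before that buffer 
is released, which the sketch sidesteps by using `string`.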

-Steve

