Why Strings as Classes?

superdan super at dan.org
Tue Aug 26 11:50:29 PDT 2008


Benji Smith Wrote:

> BCS wrote:
> > Reply to Benji,
> >> Well, for something like a DOM parser, it's pretty much impossible to
> >> parse a file that won't fit into memory. But a SAX parser doesn't
> >> actually create any objects. It just calls events, while processing
> >> XML data from a stream. A good SAX parser can operate without ever
> >> allocating anything on the heap, leaving the consumer to create any
> >> necessary objects from the parse process.
> >>
> >> --benji
> >>
> > 
> > Interesting, I've worked with parsers* that function something like that 
> > but never thought of them in that way. OTOH I can think of only very 
> > limited domain where this would be useful. If I needed to process that 
> > much data I'd load it into a database and go from there.
> > 
> > *In fact my parser generator could be used that way.
> 
> In fact, that's one of the places where I've used this kind of parsing 
> technique before.
> 
> I wrote a streaming CSV parser (which takes discipline to do correctly, 
> since a double-quote enclosed field can legally contain arbitrary 
> newline characters, and quotes are escaped by doubling). It provides a 
> field callback and a record callback, so it's very handy for performing 
> ETL tasks.
> 
> If I had to load the whole CSV files into memory before parsing, it 
> wouldn't work, because sometimes they can be hundreds of megabytes. But 
> the streaming parser takes up almost no memory at all.
> 
> --benji

sure it takes very little memory. in fact i'll tell you how much memory you need: it's the finite state needed by the fsa (finite-state automaton). you could do that because csv only needs finite state for parsing. as soon as you need to backtrack, stream parsing becomes very difficult.
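
to make the finite-state point concrete, here is a minimal sketch in python of a streaming csv parser with a field callback and a record callback, roughly in the spirit of what benji describes. it is not his code. the names parse_csv_stream, on_field and on_record are made up for illustration, and it assumes comma separators, double-quote quoting with doubled quotes as escapes, and input arriving as an iterable of string chunks. the parser state is one of four enum values plus a flag; the only thing that grows is the buffer for the current field, and that is there only so the callback can receive a whole field at once.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()  # at the start of a field, nothing consumed yet
    UNQUOTED = auto()     # inside an unquoted field
    QUOTED = auto()       # inside a double-quoted field
    QUOTE_SEEN = auto()   # just saw a quote while inside a quoted field


def parse_csv_stream(chunks, on_field, on_record):
    """Feed text chunks through a small state machine (hypothetical API).

    on_field(text) is called once per field, on_record() once per record.
    Nothing is ever re-read, so no backtracking is needed.
    """
    state = State.FIELD_START
    buf = []           # characters of the current field only
    in_record = False  # True once the current record has at least one comma

    def end_field():
        on_field("".join(buf))
        buf.clear()

    for chunk in chunks:
        for ch in chunk:
            if state is State.QUOTED:
                # Commas and newlines are plain data inside quotes.
                if ch == '"':
                    state = State.QUOTE_SEEN
                else:
                    buf.append(ch)
            elif state is State.QUOTE_SEEN and ch == '"':
                # A doubled quote is an escaped literal quote.
                buf.append('"')
                state = State.QUOTED
            elif ch == "\r":
                continue  # simplification: treat CRLF like LF
            elif ch == ",":
                end_field()
                state = State.FIELD_START
                in_record = True
            elif ch == "\n":
                # Blank lines are skipped (a choice made for this sketch).
                if in_record or state is not State.FIELD_START:
                    end_field()
                    on_record()
                state = State.FIELD_START
                in_record = False
            elif ch == '"' and state is State.FIELD_START:
                state = State.QUOTED
            else:
                # Lenient: stray characters are kept as field data.
                buf.append(ch)
                state = State.UNQUOTED

    if state is State.QUOTED:
        raise ValueError("input ended inside a quoted field")
    if in_record or state is not State.FIELD_START:
        # Flush a final record that has no trailing newline.
        end_field()
        on_record()
```

a caller would pass something like a list's append method as on_field and a function that ships off the collected fields as on_record. chunk boundaries can fall anywhere, including in the middle of a quoted field, because everything the parser needs to remember lives in state, in_record and buf.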



More information about the Digitalmars-d mailing list