Why Strings as Classes?

BCS ao at pathlink.com
Tue Aug 26 11:15:51 PDT 2008


Reply to Benji,

> BCS wrote:
> 
>> Reply to Robert,
>> 
>>> BCS wrote:
>>> 
>>>> Reply to Benji,
>>>> 
>>>>> The new JSON parser in the Tango library operates on templated
>>>>> string arrays. If I want to read from a file or a socket, I have
>>>>> to first slurp the whole thing into a character array, even though
>>>>> the character-streaming would be more practical.
>>>>> 
>>>> Unless you are only going to parse the start of the file or are
>>>> going to be throwing away most of it *while you parse it, not
>>>> after*, the best way to parse a file is to load it all in one OS
>>>> system call and then run a slicing parser (like the Tango XML
>>>> parser) on that. One memory allocation and one load or a mmap, and
>>>> then only the meta structures get allocated later.
>>> There are cases where you might want to parse an XML file that won't
>>> fit easily in main memory. I think a stream processing SAX parser
>>> would be a good addition to (perhaps not a replacement for) the
>>> existing one.
>>> 
>> If you can't fit the data file in memory then I find it hard to
>> believe you will be able to hold the parsed file in memory. If you
>> can program the parser to dump unneeded data on the fly or process
>> and discard the data, that might make a difference.
>> 
> Well, for something like a DOM parser, it's pretty much impossible to
> parse a file that won't fit into memory. But a SAX parser doesn't
> actually create any objects. It just calls events, while processing
> XML data from a stream. A good SAX parser can operate without ever
> allocating anything on the heap, leaving the consumer to create any
> necessary objects from the parse process.
> 
> --benji
> 

Interesting, I've worked with parsers* that function something like that
but never thought of them in that way. OTOH I can think of only a very
limited domain where this would be useful. If I needed to process that
much data I'd load it into a database and go from there.

*In fact my parser generator could be used that way.
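
A minimal sketch of the event-style parsing described in the quoted
message, written in Python with the standard library's xml.sax module
rather than D or Tango. The names ItemCounter and count_items, the
"item" element, and the 64-byte chunk size are all hypothetical and
chosen only for illustration. The parser is fed a stream in small
chunks and calls back into the consumer, which keeps a single counter
and never builds a document tree. The no-heap-allocation point does not
carry over to Python; the sketch only shows the shape of the interface.

```python
import io
import xml.sax


class ItemCounter(xml.sax.ContentHandler):
    """Consumer that keeps only a running count, never a document tree."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        # Called by the parser each time an opening tag is seen.
        if name == "item":
            self.count += 1


def count_items(stream, chunk_size=64):
    """Feed the stream to an incremental SAX parser chunk by chunk."""
    handler = ItemCounter()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Only the current chunk is handed to the parser at a time.
        parser.feed(chunk)
    parser.close()
    return handler.count


# An in-memory stream stands in for a file or socket here.
doc = b"<list>" + b"<item>x</item>" * 1000 + b"</list>"
total = count_items(io.BytesIO(doc))
print(total)
```

Any object with a read() method that returns bytes, such as an open
file or a socket wrapped with makefile("rb"), could be passed in place
of the BytesIO object.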




