string-ish range/stream from curl ubyte[] chunks?
    Vlad via Digitalmars-d-learn 
    digitalmars-d-learn at puremagic.com
       
    Fri May 16 13:57:41 PDT 2014
    
    
  
Hello D programmers,
I am toying with writing my own HTML parser as a pet project, and 
I strive to have a range API for the tokenizer and the parser 
output itself.
However, it occurs to me that in real-life browsers the advantage 
of this kind of 'streaming' parsing comes from also treating the 
string that serves as input to the tokenizer as a 
'stream'/'range'.
While D's string types (string, wstring, dstring) already work as 
ranges, what I want to write is a 'ChunkDecoder' range that would 
take curl 'byChunk' output and make it consumable by the tokenizer.
Now, the problem: string itself has ElementType!string == dchar. 
Consuming a string one dchar at a time looks wasteful if, e.g., 
your string is UTF-8 or UTF-16.
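To make the concern concrete, here is a small sketch of what the range primitives report for a UTF-8 string. The sample text is arbitrary; std.string.representation is used only as one way to look at the code units without decoding.

```d
import std.range : ElementType, walkLength;
import std.string : representation;

// As a range, a narrow string yields decoded code points (dchar).
static assert(is(ElementType!string == dchar));

// The same memory viewed as raw bytes yields code units instead.
static assert(is(ElementType!(immutable(ubyte)[]) == immutable(ubyte)));

void main()
{
    string s = "héllo"; // 'é' occupies two UTF-8 code units

    // .length counts code units, not characters.
    assert(s.length == 6);

    // Walking the string as a range decodes each code point in turn.
    assert(s.walkLength == 5);

    // representation reinterprets the string as immutable(ubyte)[]
    // without copying, so no decoding happens when it is iterated.
    auto bytes = s.representation;
    assert(bytes.length == 6);
}
```

An HTML tokenizer mostly looks for ASCII delimiters such as '<' and '&', so working on the code units directly would avoid the per-character decoding step.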
So, naturally, I would like to use indexOf() - instead of 
countUntil() - and opSlice (without opDollar?) on my ChunkDecoder 
(forward) range.
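A rough sketch of the kind of thing I have in mind, just to fix ideas. The names ChunkDecoder and chunkDecoder are made up, an in-memory ubyte[][] stands in for the curl byChunk output, and the bytes are assumed to be UTF-8 and are handed out as raw char code units without validation. It is only an input range: no save, no indexOf, no slicing yet.

```d
import std.array : array;
import std.range : empty, front, popFront, isInputRange;

// Hypothetical adapter: flattens a range of ubyte[] chunks into a
// range of UTF-8 code units (char), one chunk at a time.
struct ChunkDecoder(R)
{
    private R chunks;   // source range of ubyte[] chunks
    private size_t pos; // read position inside chunks.front

    this(R chunks)
    {
        this.chunks = chunks;
        skipExhaustedChunks();
    }

    @property bool empty() { return chunks.empty; }

    // The current chunk is only popped once fully consumed, so a
    // source that reuses its buffer is not advanced too early.
    @property char front() { return cast(char) chunks.front[pos]; }

    void popFront()
    {
        ++pos;
        skipExhaustedChunks();
    }

    private void skipExhaustedChunks()
    {
        while (!chunks.empty && pos >= chunks.front.length)
        {
            chunks.popFront();
            pos = 0;
        }
    }
}

// Convenience constructor so the chunk range type is inferred.
auto chunkDecoder(R)(R chunks)
{
    return ChunkDecoder!R(chunks);
}

void main()
{
    // Stand-in for network chunks, including an empty one.
    ubyte[][] chunks = [
        cast(ubyte[]) "<p>he".dup,
        new ubyte[](0),
        cast(ubyte[]) "llo</p>".dup,
    ];

    auto r = chunkDecoder(chunks);
    static assert(isInputRange!(typeof(r)));
    assert(r.array == "<p>hello</p>");
}
```

What this does not give me is a way to search for a delimiter inside the current chunk and slice up to it, which is the part I am really asking about.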
Q: Is anything like this already in use somewhere in the standard 
library or a project you know?
Q2: Or do you have any pointers for what the smallest API would 
be for a string-like range class?
And bonus:
Q3: Can you think of other standard library functions that could 
use such a string-ish range and that one could contribute to? E.g. 
suppose this doesn't exist and I / we come up with a proposal for a 
minimal API to consume a string from left to right.
Thanks for your time and your suggestions!
    
    