streaming redux

Robert Jacques sandford at jhu.edu
Tue Dec 28 16:23:01 PST 2010


On Tue, 28 Dec 2010 00:02:29 -0700, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> I've put together over the past days an embryonic streaming interface.  
> It separates transport from formatting, input from output, and buffered  
> from unbuffered operation.
>
> http://erdani.com/d/phobos/std_stream2.html
>
> There are a number of questions interspersed. It would be great to start  
> a discussion using that design as a baseline. Please voice any related  
> thoughts - thanks!
>
>
> Andrei

Here are my initial thoughts and responses to the questions. Now to go  
read everyone else's.

Re: TransportBase
Q1: Internally, I think it is a good idea for a transport to support lazy
opening, but I'm not sure the benefit of exposing this to user code is
worth the hassle. If open is supported, I don't think it should take any
parameters.
Q2: If seek isn't considered universal, having an isSeekable and a rewind
might be beneficial. But while I know of transports where seeking might be
slow, I'm not sure which ones wouldn't support it at all, or would only
support rewind.
Q3: Yes to seek + tell, and to getting rid of seekFromXXX.
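
A minimal sketch of the seek + tell idea, assuming an in-memory transport.
Every name here (SeekableTransport, MemoryTransport, seekFromCurrent) is
hypothetical and is not taken from the linked draft; the point is only that
the seekFromXXX family can be written as thin helpers once seek and tell
exist.

```d
// Hypothetical interface: one absolute seek, one tell, one capability query.
interface SeekableTransport
{
    @property bool isSeekable();  // would be false for pipes, sockets, ...
    ulong tell();                 // current absolute position
    void seek(ulong position);    // absolute offset from the start
}

// In-memory transport used only to exercise the interface.
class MemoryTransport : SeekableTransport
{
    private const(ubyte)[] data;
    private ulong pos;

    this(const(ubyte)[] data) { this.data = data; }

    @property bool isSeekable() { return true; }
    ulong tell() { return pos; }
    void seek(ulong position)
    {
        assert(position <= data.length, "seek past end");
        pos = position;
    }
}

// The seekFromXXX variants reduce to seek + tell.
void rewind(SeekableTransport t) { t.seek(0); }

void seekFromCurrent(SeekableTransport t, long delta)
{
    t.seek(cast(ulong)(cast(long) t.tell() + delta));
}

void main()
{
    ubyte[] bytes = [1, 2, 3, 4, 5];
    auto t = new MemoryTransport(bytes);
    t.seek(3);
    t.seekFromCurrent(-2);
    assert(t.tell() == 1);
    t.rewind();
    assert(t.tell() == 0);
}
```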

Re: UnbufferedInputTransport
Q1: I think that read should be allowed to return less than the buffer's
length, but since the transport should know the most efficient way to
block on an input, I don't think returning a zero-length array is valid.
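
To make the short-read contract concrete, here is a rough sketch with a
hypothetical ChunkedSource that never hands out more than three bytes at a
time. In this sketch an empty result is reserved for end of stream; that is
an assumption of the example, since the point above does not say how end of
stream would be signalled.

```d
import std.algorithm.comparison : min;

// Hypothetical source that imitates a transport which rarely fills the
// caller's buffer in one call.
struct ChunkedSource
{
    const(ubyte)[] data;
    size_t chunk = 3;

    // Returns the filled prefix of `buffer`. It may be shorter than
    // `buffer`; it is empty only when the source is exhausted (a real
    // transport would block instead of returning an empty slice early).
    ubyte[] read(ubyte[] buffer)
    {
        immutable n = min(buffer.length, chunk, data.length);
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return buffer[0 .. n];
    }
}

// Caller-side loop that tolerates short reads.
ubyte[] readExactly(ref ChunkedSource src, ubyte[] buffer)
{
    size_t filled = 0;
    while (filled < buffer.length)
    {
        auto got = src.read(buffer[filled .. $]);
        if (got.length == 0) break;   // end of stream in this sketch
        filled += got.length;
    }
    return buffer[0 .. filled];
}

void main()
{
    immutable(ubyte)[] bytes = [10, 20, 30, 40, 50, 60, 70];
    auto src = ChunkedSource(bytes);
    ubyte[5] buf;
    assert(src.read(buf[]).length == 3);              // short read
    assert(readExactly(src, buf[]) == bytes[3 .. $]); // the remaining four
}
```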

Re: BufferedInputTransport
Q1: I think it's valid for the front of a buffered input to be empty: an
empty front simply means that popFront should be called. popFront should
be required to fill at least some of front (see UnbufferedInputTransport
Q1).

Q2: Semantically, 'advance' feels to me like popFront: I want to advance my
input and I'm intending to work with it. The seek routines, on the other
hand, feel more like indexing: I want to do something with that index, but
I do not necessarily need everything in between. In particular, I'd expect
long seeks to reduce the front array to zero elements, while I'd expect
advance to enlarge the internal buffer if necessary.
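
The two points above can be sketched roughly as follows. BufferedInput and
its fields are hypothetical, the source is a plain in-memory array, and
advance is left out because its exact contract in the draft is not restated
here. The sketch only shows an empty front being legal and a seek shrinking
front to zero elements until the next popFront.

```d
// Hypothetical buffered input over an in-memory source.
struct BufferedInput
{
    const(ubyte)[] source;   // stands in for the underlying transport
    size_t pos;              // next unread offset in `source`
    size_t chunk = 4;        // how much popFront loads at a time
    const(ubyte)[] window;   // what `front` exposes

    // May legitimately be empty: before the first popFront or after a seek.
    const(ubyte)[] front() { return window; }

    // Drops the current window and loads at least one element,
    // unless the source is exhausted.
    void popFront()
    {
        immutable end = pos + chunk < source.length ? pos + chunk
                                                    : source.length;
        window = source[pos .. end];
        pos = end;
    }

    // Indexing-style jump: nothing in between is buffered, so the
    // window has zero elements until the next popFront.
    void seek(size_t position)
    {
        assert(position <= source.length, "seek past end");
        pos = position;
        window = null;
    }
}

void main()
{
    immutable(ubyte)[] bytes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    auto input = BufferedInput(bytes);
    assert(input.front.length == 0);       // empty front: call popFront
    input.popFront();
    assert(input.front == bytes[0 .. 4]);
    input.seek(8);
    assert(input.front.length == 0);       // the seek emptied the window
    input.popFront();
    assert(input.front == bytes[8 .. $]);
}
```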

Re: Formatter
Q1: I don't think formatters should be responsible for buffering, but  
certain formats require rather extensive buffering that can't be provided  
by the current buffered transport classes (BSON comes to mind). My initial
impression is that seek, etc. should be able to handle these use cases, but
adding a buffer hint setter/getter might be a good idea. The idea being  
that if the formatter knows that it will come back to this part of the  
stream, it can set a hint, so the buffer can make a more intelligent  
choice of when/where to flush internally.
Q2: putln only makes sense in terms of text-based streams, plus it adds a
large number of methods to implement. So I'm a bit on the fence about it.  
I think writefln would be a better solution to a similar problem.
Q3: The issue I see with a reflection-based solution is that the runtime  
reflection system should respect the visibility of the member: i.e.  
private variables shouldn't be accessible. But to do effective  
serialization, private members are generally required. As for the more  
technical aspects, combining __traits(derivedMembers,T) and  
BaseClassesTuple!T can determine which objects override toString, etc.
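
One possible way to combine the two is sketched below. `declares` and
`overridesToString` are hypothetical helper names; only
`__traits(derivedMembers, T)` and `std.traits.BaseClassesTuple` are existing
D features. The check walks T and its base classes, skipping Object, which
always declares toString.

```d
import std.meta : AliasSeq;
import std.traits : BaseClassesTuple;

// True if T itself (not a base class) declares a member called `name`.
template declares(T, string name)
{
    enum declares = {
        bool found = false;
        foreach (m; __traits(derivedMembers, T))
            found = found || m == name;
        return found;
    }();
}

// True if T or any base class other than Object declares toString.
template overridesToString(T)
{
    enum overridesToString = {
        bool found = false;
        foreach (C; AliasSeq!(T, BaseClassesTuple!T))
        {
            static if (!is(C == Object))
                found = found || declares!(C, "toString");
        }
        return found;
    }();
}

class Plain {}
class Named { override string toString() { return "Named"; } }
class Child : Named {}

static assert(!overridesToString!Plain);
static assert( overridesToString!Named);
static assert( overridesToString!Child);  // inherited from Named

void main() {}
```

Both helpers run entirely at compile time, so a formatter could use them in
a static if to choose between calling toString and walking the members.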
Q4: Reading/writing the same sub-object is an internal matter, in my
opinion. The really important aspect is handling slices, etc. nicely for
formats that support cyclic graphs. For which, the only thing missing is
put(void*) to handle pointers (I think).
Q5: I think handling AAs with hooks is the best case with this design,
though I only see a need for start and end. The major issue is that  
reading should be done as a tuple, which basically breaks the interface  
idiom. Alternatively, callbacks could be used to set read's mode: i.e.  
readKeyMode, readValueMode & putKeyMode, putValueMode.
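
As a rough illustration of the start/end hooks on the output side:
TextFormatter, startAA, endAA and putAA are all invented for this sketch,
and the text format it emits is arbitrary. Keys and values go through the
ordinary put, so only the two hooks are specific to associative arrays.

```d
import std.algorithm.sorting : sort;
import std.conv : to;

// Hypothetical text formatter with two AA-specific hooks.
struct TextFormatter
{
    string output;

    void startAA(size_t count) { output ~= "{" ~ count.to!string ~ ":"; }
    void endAA()               { output ~= "}"; }
    void put(T)(T value)       { output ~= value.to!string ~ ";"; }

    // Writes an associative array as start hook, key/value pairs, end hook.
    void putAA(K, V)(V[K] aa)
    {
        startAA(aa.length);
        foreach (key; aa.keys.sort())  // sorted only to make output stable
        {
            put(key);
            put(aa[key]);
        }
        endAA();
    }
}

void main()
{
    TextFormatter f;
    f.putAA(["one": 1, "two": 2]);
    assert(f.output == "{2:one;1;two;2;}");
}
```

The reading side is where the tuple problem mentioned above shows up, and
this sketch does not attempt it.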
Q6: Well, toString and cast(int/double/etc.) should go a long way to
covering most of the printf specifiers.
Q7: Yes, writefln should probably be supported for text-based transports.

Re: Unformatter
Q1: Implementations should be free (and indeed encouraged) to minimize  
allocations by returning a reusable buffer for arrays. So the stream  
should be responsible for inferring the size of an array.
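
A small sketch of the reusable-buffer idea, assuming a made-up wire format
in which each array is one length byte followed by that many elements. The
Unformatter struct and its readArray method are hypothetical. The stream
infers the size, and the returned slice aliases internal storage, so it is
only valid until the next call.

```d
// Hypothetical unformatter over a length-prefixed byte stream.
struct Unformatter
{
    const(ubyte)[] input;
    ubyte[] scratch;   // reused across calls

    // Reads one array: a length byte, then that many elements.
    // The result aliases `scratch` and is overwritten by the next call.
    const(ubyte)[] readArray()
    {
        immutable size_t n = input[0];
        if (scratch.length < n) scratch.length = n;  // grow only when needed
        scratch[0 .. n] = input[1 .. 1 + n];
        input = input[1 + n .. $];
        return scratch[0 .. n];
    }
}

void main()
{
    immutable(ubyte)[] wire = [3, 7, 8, 9, 2, 5, 6];
    auto u = Unformatter(wire);

    auto first = u.readArray();
    assert(first == wire[1 .. 4]);
    auto storage = first.ptr;

    auto second = u.readArray();
    assert(second == wire[5 .. 7]);
    assert(second.ptr is storage);  // same storage, no new allocation
}
```

The cost is that `first` no longer holds its original contents after the
second call; a caller that needs to keep an array has to copy it.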
Q2: See Formatter Q3.
Q3: See Formatter Q5.


Other Formatter/Unformatter thoughts:
For objects, several formats also require additional meta information  
(i.e. a unique string id, member offset position, etc), while others don't.


More information about the Digitalmars-d mailing list