streaming redux
Michel Fortin
michel.fortin at michelf.com
Tue Dec 28 09:39:07 PST 2010
On 2010-12-28 02:02:29 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> said:
> I've put together over the past days an embryonic streaming interface.
> It separates transport from formatting, input from output, and buffered
> from unbuffered operation.
>
> http://erdani.com/d/phobos/std_stream2.html
>
> There are a number of questions interspersed. It would be great to
> start a discussion using that design as a baseline. Please voice any
> related thoughts - thanks!
One of my concerns is the number of virtual calls required in actual
usage, because virtual calls prevent inlining. I know it's necessary to
have virtual calls in the formatter to serialize objects (which
requires double dispatch), but in your design the underlying transport
layer too wants to be called virtually. How many virtual calls will be
necessary to serialize an array of 10 objects, each having 10 fields?
Let's see:
      10 calls to Formatter.put(Object)
    + 10 calls to Object.toString(Formatter)
    + 10 objects * 10 calls per object to Formatter.put(<some field type>)
    + 10 objects * 10 calls per object to UnbufferedOutputTransport.write(in ubyte[])
Total: 220 virtual calls, for 10 objects with 10 fields each. Most of
the functions called virtually here are pretty trivial and would
normally be inlined if the context allowed it. Assuming those fields
are 4-byte integers and are stored as is in the stream, the result will
be between 400 and 500 bytes long once we add the object's class name.
We end up with almost one virtual call for every two bytes of emitted
data; is this overhead really acceptable? How much inlining does it
prevent?
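The tally above can be checked with a small counting sketch. Everything in it is a hypothetical stand-in for the proposed design: Transport plays the role of UnbufferedOutputTransport, Writable.writeTo stands in for Object.toString(Formatter), and the global counter exists only to make the dispatches visible.

```d
import std.bitmanip : nativeToLittleEndian;

// Incremented by every method that is reached through an interface.
int virtualCalls;

// Hypothetical stand-ins for the interfaces in the proposal.
interface Transport { void write(in ubyte[] data); }
interface Writable  { void writeTo(Formatter f); }
interface Formatter {
    void put(Writable o);
    void put(int field);
}

class CountingTransport : Transport {
    size_t bytes; // payload bytes received so far
    void write(in ubyte[] data) { ++virtualCalls; bytes += data.length; }
}

class BinaryFormatter : Formatter {
    Transport sink;
    this(Transport sink) { this.sink = sink; }

    // First half of the double dispatch: hand the formatter to the object.
    void put(Writable o) { ++virtualCalls; o.writeTo(this); }

    // Each field is stored as is and costs one more call on the transport.
    void put(int field) {
        ++virtualCalls;
        ubyte[4] raw = nativeToLittleEndian(field);
        sink.write(raw[]);
    }
}

class Record : Writable {
    int[10] fields;
    // Second half of the double dispatch: the object writes its fields.
    void writeTo(Formatter f) {
        ++virtualCalls;
        foreach (v; fields) f.put(v);
    }
}

// Serializes `objects` records and returns the number of virtual calls made.
int countCalls(size_t objects) {
    virtualCalls = 0;
    Formatter f = new BinaryFormatter(new CountingTransport);
    foreach (i; 0 .. objects) f.put(new Record);
    return virtualCalls;
}
```

Under these assumptions, countCalls(10) yields the 220 calls listed above for 400 bytes of field data, before any class names are added.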
My second concern is that your approach to Formatter is too rigid. For
instance, what if an object needs to write different fields depending
on the output format, or write them in a different order? It'll have to
check at runtime which kind of formatter it got (through casts
probably). Or what if I have a formatter that wants to expose an XML
tree instead of bytes? It'll need a totally different interface that
deals with XML elements, attributes, and character data, not bytes.
So because of all this virtual dispatch and all this rigidity, I think
Formatter needs to be rethought a little. My preference obviously goes
to statically-typed formatters. But what I'd like to see is something
like this:
    interface Serializable(F) {
        void writeTo(F formatter);
    }
Any object can implement a serialization for a given formatter by
implementing the interface above parametrized with the formatter type.
(Struct types could have a similar writeTo function too, they just
don't need to implement an interface.) The formatter type can expose
whatever interface it wants, with or without virtual functions. It could
be an XML writer interface (something with openElement,
writeCharacterData, closeElement, etc.), it could be a JSON interface,
or it could even be your Formatter as proposed; we just wouldn't be
limited by it.
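A minimal sketch of how one class might serve two statically-typed formatters through this pattern. XmlWriter, JsonWriter, Point and the helper functions are hypothetical names made up for illustration; only Serializable(F), writeTo and the three XML method names come from the text above, and their signatures here are assumptions. No escaping is performed.

```d
import std.conv : to;

interface Serializable(F) {
    void writeTo(F formatter);
}

// Hypothetical XML writer. It is a final class, so calls made through a
// variable of this type need no virtual dispatch.
final class XmlWriter {
    string output;
    void openElement(string name)        { output ~= "<" ~ name ~ ">"; }
    void writeCharacterData(string text) { output ~= text; }
    void closeElement(string name)       { output ~= "</" ~ name ~ ">"; }
}

// Hypothetical JSON writer with a completely different interface.
final class JsonWriter {
    string output;
    private bool first = true;
    void beginObject() { output ~= "{"; first = true; }
    void member(string key, int value) {
        if (!first) output ~= ",";
        first = false;
        output ~= `"` ~ key ~ `":` ~ to!string(value);
    }
    void endObject() { output ~= "}"; }
}

// One class, one writeTo overload per formatter it supports.
class Point : Serializable!XmlWriter, Serializable!JsonWriter {
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }

    void writeTo(XmlWriter w) {
        w.openElement("point");
        w.openElement("x"); w.writeCharacterData(to!string(x)); w.closeElement("x");
        w.openElement("y"); w.writeCharacterData(to!string(y)); w.closeElement("y");
        w.closeElement("point");
    }

    void writeTo(JsonWriter w) {
        w.beginObject();
        w.member("x", x);
        w.member("y", y);
        w.endObject();
    }
}

string toXml(Serializable!XmlWriter value) {
    auto w = new XmlWriter;
    value.writeTo(w);
    return w.output;
}

string toJson(Serializable!JsonWriter value) {
    auto w = new JsonWriter;
    value.writeTo(w);
    return w.output;
}
```

The object chooses fields and ordering per formatter at compile time through overload resolution, so no runtime cast on the formatter is needed.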
So basically, I'm not proposing you dump Formatter, just that you make
it part of a reusable pattern for
formatting/serializing/unformatting/unserializing things using other
things than your Formatter interface.
As for the transport layer, I don't mind it much if it's an interface.
Unlike Formatter, nothing prevents you from creating a 'final' class
and using it directly when you can to avoid virtual dispatch. This
doesn't work so well for Formatter however because it requires double
dispatch when it encounters a class, which washes away all static
information.
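A sketch of the transport case, assuming an interface shaped like UnbufferedOutputTransport.write(in ubyte[]). MemoryTransport and emitAll are hypothetical names. When the template below is instantiated with the final class, the compiler knows the exact type and is free to call write directly; when instantiated with the interface type, the same code goes through virtual dispatch.

```d
import std.bitmanip : nativeToLittleEndian;

interface UnbufferedOutputTransport {
    void write(in ubyte[] data);
}

// Hypothetical in-memory transport. Being final, it cannot be subclassed,
// so write cannot be overridden further.
final class MemoryTransport : UnbufferedOutputTransport {
    ubyte[] buffer;
    void write(in ubyte[] data) { buffer ~= data; }
}

// Writes each value as 4 little-endian bytes and returns the byte count.
// T is either the interface (virtual calls) or a concrete class such as
// MemoryTransport (statically known target).
size_t emitAll(T)(T transport, const int[] values)
    if (is(T : UnbufferedOutputTransport))
{
    foreach (v; values) {
        ubyte[4] raw = nativeToLittleEndian(v);
        transport.write(raw[]);
    }
    return values.length * 4;
}
```

Both instantiations produce the same bytes; only the dispatch differs. Whether the direct call is actually inlined depends on the compiler and its optimization settings.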
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/