Range interface for std.serialization
Tyler Jameson Little
beatgammit at gmail.com
Thu Aug 22 19:06:33 PDT 2013
On Thursday, 22 August 2013 at 14:48:57 UTC, Dicebot wrote:
> On Thursday, 22 August 2013 at 03:13:46 UTC, Tyler Jameson
> Little wrote:
>> On Wednesday, 21 August 2013 at 20:21:49 UTC, Dicebot wrote:
>>> It should be range of strings - one call to popFront should
>>> serialize one object from input object range and provide
>>> matching string buffer.
>>
>> I don't like this because it still caches the whole object
>> into memory. In a memory-restricted application, this is
>> unacceptable.
>
> Well, in memory-restricted applications having large object at
> all is unacceptable. Rationale is that you hardly ever want
> half-deserialized object. If environment is very restrictive,
> smaller objects will be used anyway (list of smaller objects).
It seems you and I are trying to solve two very different
problems. Perhaps if I explain my use-case, it'll make things
clearer.
I have a server that deserializes data from a socket, processes
that data, then updates internal state and sends notifications to
clients (which involves serialization as well).
When new clients connect, they need all of this internal state,
so the easiest way to do this is to create one large object out
of all of the smaller objects:
    class Widget {
    }

    class InternalState {
        Widget[string] widgets;
        // ... other data here
    }
InternalState isn't very big by itself; it just has an
associative array of Widget references with some other rather
small data. When serialized, however, this can get quite large.
Since archive formats are orders of magnitude less efficient than
in-memory stores, caching the archived version of the internal
state can be prohibitively expensive.
Let's say the serialized form of the internal state is 5MB, and I
have 128MB available, while 50MB or so is used by the
application. This leaves about 70MB, so I can only support 14
connected clients.
With a streaming serializer (per object), I'll get that 5MB down
to a few hundred KB and I can support many more clients.
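To make the per-object streaming idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `value` field on Widget, the `serializeWidgets` name, the delegate-based sink, and the JSON-like output (keys are not escaped). It is not the std.serialization API. The point is only that each widget is formatted and handed to the sink on its own, so the largest buffer held at once is one widget's worth of text, not the whole state.

```d
import std.format : format;

// Hypothetical stand-in for the Widget class; the `value` field is
// invented for this example.
class Widget
{
    int value;
    this(int value) { this.value = value; }
}

// Writes the widgets to `sink` one entry at a time. `sink` can be any
// delegate that consumes a chunk of text, such as a socket write.
// Only one entry is formatted in memory at any moment.
void serializeWidgets(Widget[string] widgets,
                      scope void delegate(const(char)[]) sink)
{
    sink("{");
    bool first = true;
    foreach (name, w; widgets)
    {
        if (!first)
            sink(",");
        first = false;
        // Simplification: `name` is assumed to need no escaping.
        sink(format(`"%s":{"value":%d}`, name, w.value));
    }
    sink("}");
}

unittest
{
    auto widgets = ["a": new Widget(1), "b": new Widget(2)];
    size_t total, largest;
    serializeWidgets(widgets, (const(char)[] chunk) {
        total += chunk.length;
        if (chunk.length > largest)
            largest = chunk.length;
    });
    assert(total == 33);   // size of the full serialized form
    assert(largest == 15); // size of the biggest chunk ever buffered
}
```

In the unittest the full output is 33 bytes but no chunk is larger than 15, which is the same effect as the 5MB versus a few hundred KB above, just at toy scale.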
>> ...
>> There's no reason why the serializer can't output this in
>> chunks
>
> Outputting on its own is not useful to discuss - in pipe model
> output matches input. What is the point in outputting partial
> chunks of serialized object if you still need to provide it as
> a whole to the input?
This only makes sense if you are deserializing right after
serializing, which is *not* a common thing to do.
Also, it's much more likely to need to serialize a single object
(as in a REST API, a 3D model parser [think COLLADA], or a config
parser). Providing a range seems to fit only a small niche:
people who need to dump the state of the system. With
single-object serialization and chunked output, you can define
your own range to get the same effect, but with an API like the
one you described, you can't avoid memory problems without going
outside std.
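As a rough illustration of "define your own range", the sketch below layers a lazy range of strings on top of a single-object serializer using `std.algorithm.map`. The `Point` type and the `serializeOne` and `serializedRange` names are hypothetical, made up for this example; only `map`, `equal` and `to` come from Phobos.

```d
import std.algorithm : equal, map;
import std.conv : to;

// Hypothetical payload type for the example.
struct Point
{
    int x;
    int y;
}

// Hypothetical single-object serializer: one object in, one string out.
string serializeOne(Point p)
{
    return `{"x":` ~ p.x.to!string ~ `,"y":` ~ p.y.to!string ~ `}`;
}

// Builds the range-of-strings interface from the single-object one.
// `map` is lazy, so an object is serialized only when its element
// is actually read from the range.
auto serializedRange(R)(R objects)
{
    return objects.map!(o => serializeOne(o));
}

unittest
{
    auto points = [Point(1, 2), Point(3, 4)];
    auto r = serializedRange(points);
    assert(r.front == `{"x":1,"y":2}`);
    assert(r.equal([`{"x":1,"y":2}`, `{"x":3,"y":4}`]));
}
```

Going the other way, from a range-of-whole-strings API down to chunked output for one large object, has no equivalent one-liner, which is the asymmetry I'm pointing at.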