Range interface for std.serialization

Jacob Carlborg doob at me.com
Thu Aug 22 00:16:11 PDT 2013


On 2013-08-22 05:13, Tyler Jameson Little wrote:

> I don't like this because it still caches the whole object into memory.
> In a memory-restricted application, this is unacceptable.

It needs to store all serialized reference types, otherwise it cannot 
properly serialize a complete object graph. We don't want duplicates. 
Example:

The following code:

auto bar = new Bar;
bar.a = 3;

auto foo = new Foo;
foo.a = bar;
foo.b = bar;

Is serialized as:

<object runtimeType="main.Foo" type="main.Foo" key="0" id="0">
     <object runtimeType="main.Bar" type="main.Bar" key="a" id="1">
         <int key="a" id="2">3</int>
     </object>
     <reference key="b">1</reference>
</object>

When "foo.b" is serialized, only a reference is written, not the complete 
object, because the object has already been serialized. The serializer needs 
to keep track of that.
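
To make the bookkeeping concrete, here is a minimal sketch of it. The names 
ReferenceTracker, serializeFoo and serializeBar are made up for this example 
and are not the std.serialization API. It also simplifies the real output: 
the runtimeType attribute is left out and value types get no id.

```d
import std.conv : to;
import std.stdio : writeln;

class Bar { int a; }
class Foo { Bar a; Bar b; }

// Hypothetical helper: remembers every object already written and hands
// out ids in order of first use.
struct ReferenceTracker
{
    private size_t[void*] ids;   // object address -> assigned id
    private size_t nextId;

    // Returns true and sets `id` to the existing id if `obj` was seen before;
    // otherwise registers `obj`, sets `id` to its new id and returns false.
    bool seen(Object obj, out size_t id)
    {
        auto key = cast(void*) obj;
        if (auto p = key in ids)
        {
            id = *p;
            return true;
        }
        id = nextId++;
        ids[key] = id;
        return false;
    }
}

string serializeBar(ref ReferenceTracker t, Bar bar, string key)
{
    size_t id;
    if (t.seen(bar, id))   // already written: emit only a reference
        return `<reference key="` ~ key ~ `">` ~ id.to!string ~ `</reference>`;

    return `<object type="main.Bar" key="` ~ key ~ `" id="` ~ id.to!string ~ `">`
        ~ `<int key="a">` ~ bar.a.to!string ~ `</int></object>`;
}

string serializeFoo(Foo foo)
{
    ReferenceTracker t;
    size_t id;
    t.seen(foo, id);       // the root object is always new
    string s = `<object type="main.Foo" key="0" id="` ~ id.to!string ~ `">`;
    s ~= serializeBar(t, foo.a, "a");
    s ~= serializeBar(t, foo.b, "b");
    s ~= `</object>`;
    return s;
}

void main()
{
    auto bar = new Bar;
    bar.a = 3;

    auto foo = new Foo;
    foo.a = bar;
    foo.b = bar;

    writeln(serializeFoo(foo));
}
```

The table in ReferenceTracker has to live for the whole serialization run, 
since a later field can point back to any earlier object.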

> I think one call to popFront should release part of the serialized
> object. For example:
>
> struct B {
>      int c, d;
> }
>
> struct A {
>      int a;
>      B b;
> }
>
> The JSON output of this would be:
>
>      {
>          a: 0,
>          b: {
>              c: 0,
>              d: 0
>          }
>      }
>
> There's no reason why the serializer can't output this in chunks:
>
> Chunk 1:
>
>      {
>          a: 0,
>
> Chunk 2:
>
>          b: {
>
> Etc...

It seems hard to keep track of nesting. I can't see how pretty printing 
using this technique would work.
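
To illustrate what I mean by keeping track of nesting, here is a rough 
sketch of a chunked writer as an input range. Event, Kind, ChunkedWriter and 
sampleEvents are hypothetical names invented for this example, not part of 
any existing API. The point is the "depth" field: it has to survive between 
calls to popFront, otherwise a chunk cannot know how far to indent. Commas 
between members are left out on purpose, since emitting them correctly would 
require looking ahead at the next event as well.

```d
import std.stdio : write;

// Hypothetical event stream a serializer might feed to a chunking writer.
enum Kind { open, value, close }

struct Event
{
    Kind kind;
    string key;    // empty for the root object and for close events
    string text;   // only used by Kind.value
}

string indent(size_t n)
{
    string s;
    foreach (_; 0 .. n)
        s ~= "    ";
    return s;
}

// Input range of string chunks, one chunk per event.
struct ChunkedWriter
{
    Event[] events;
    size_t depth;   // nesting state carried from one chunk to the next

    bool empty() { return events.length == 0; }

    string front()
    {
        auto e = events[0];
        if (e.kind == Kind.open)
            return indent(depth) ~ (e.key.length ? e.key ~ ": {" : "{") ~ "\n";
        if (e.kind == Kind.value)
            return indent(depth) ~ e.key ~ ": " ~ e.text ~ "\n";
        return indent(depth - 1) ~ "}\n";   // Kind.close
    }

    void popFront()
    {
        if (events[0].kind == Kind.open)
            ++depth;
        else if (events[0].kind == Kind.close)
            --depth;
        events = events[1 .. $];
    }
}

// Events for the quoted struct A { int a; B b; } with B { int c, d; }.
Event[] sampleEvents()
{
    return [
        Event(Kind.open),
        Event(Kind.value, "a", "0"),
        Event(Kind.open, "b"),
        Event(Kind.value, "c", "0"),
        Event(Kind.value, "d", "0"),
        Event(Kind.close),
        Event(Kind.close),
    ];
}

void main()
{
    foreach (chunk; ChunkedWriter(sampleEvents()))
        write(chunk);
}
```

So it can be done for this simple case, but the range is no longer 
stateless, and formats like XML would additionally need a stack of open tag 
names in order to write the closing tags.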

> This is just a read-only property, which arguably doesn't break
> misconceptions. There should be no reason to assign directly to a range.

How should I set the data used for deserializing?

> I agree that (de)serializing a large list of objects lazily is
> important, but I don't think that's the natural interface for a
> Serializer. I think that each object should be lazily serialized instead
> to maximize throughput.
>
> If a Serializer is defined as only (de)serializing a single object, then
> serializing a range of Type would be as simple as using map() with a
> Serializer (getting a range of Serialize). If the allocs are too much,
> then the same serializer can be used, but serialize one-at-a-time.
>
> My main point here is that data should be written as it's being
> serialized. In a networked application, it may take a few packets to
> encode a larger object, so the first packets should be sent ASAP.
>
> As usual, feel free to destroy =D

Again, how does one keep track of nesting in formats like XML, JSON and 
YAML?

-- 
/Jacob Carlborg

