Request for review - std.serialization (orange)

Jacob Carlborg doob at me.com
Mon Mar 25 01:53:30 PDT 2013


On 2013-03-25 02:16, Manu wrote:
> Just at a glance, a few things strike me...
>
> Phobos doesn't typically use classes, seems to prefer flat functions.

It's necessary to have a class or struct to pass around. The serializer 
is passed to the methods/functions doing custom serialization. I could create 
a free function that encapsulates the classes for the common use cases.

> Are we happy with classes in this instance?
> Use of caps in the filenames/functions is not very phobos like.

Yeah, that will be fixed if accepted. As you see, it's still a separate 
library and not included in Phobos.

> Can I have a post-de-serialise callback to recalculate transient data?

Yes. There are three ways to customize the serialization process.

1. Take complete control of the process (for the type) by adding 
toData/fromData to your types

https://github.com/jacob-carlborg/orange/wiki/Custom-Serialization

2. Take complete control of the process (for the type) by registering a 
function pointer/delegate as a serializer for a given type. Useful for 
serializing third party types

https://github.com/jacob-carlborg/orange/wiki/Non-Intrusive-Serialization

3. Add the onDeserialized attribute to a method in the type being serialized

https://github.com/jacob-carlborg/orange/blob/master/tests/Events.d#L75
https://dl.dropbox.com/u/18386187/orange_docs/Events.html

I noticed that the documentation for the attributes doesn't look so good.

> Why register serialiser's, and structures that can be operated on? (I'm
> not a big fan of registrations of this sort personally, if they can be
> avoided)

The only time when registering a serializer is really necessary is when 
serializing through a base class reference. Otherwise the use cases are 
when customizing the serialization process.

> Is there a mechanism to deal with pointers, or do you just serialise
> through the pointer? Some sort of reference system so objects pointing
> at the same object instance will deserialise pointing at the same object
> instance (or a new copy thereof)?

Yes. All reference types (including pointers) are serialized only once. 
If a pointer that is being serialized points to data that is not 
otherwise being serialized, what it points to is serialized as well.
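
To illustrate the "only once" idea, here is a minimal sketch. It is not 
Orange's actual implementation; Node, RefTracker and the output format 
are made-up names for this example. Each object gets an id the first 
time it is seen, and later encounters only emit a reference to that id, 
which also keeps cycles from recursing forever.

```d
import std.conv : to;

// Hypothetical type used only for this illustration.
class Node
{
    string name;
    Node next;

    this (string name) { this.name = name; }
}

// Hypothetical helper: remembers which objects have already been written.
struct RefTracker
{
    private size_t[void*] ids; // object address -> assigned id

    // Returns a textual form of `node`, writing each object only once.
    string write (Node node)
    {
        if (node is null)
            return "null";

        auto address = cast(void*) node;

        // Already written: emit a reference to the existing id.
        if (auto existing = address in ids)
            return "ref(" ~ to!string(*existing) ~ ")";

        // First encounter: assign the next id, then write the contents.
        auto id = ids.length;
        ids[address] = id;

        return "object(id=" ~ to!string(id) ~ ", name=" ~ node.name ~
            ", next=" ~ write(node.next) ~ ")";
    }
}

unittest
{
    auto a = new Node("a");
    auto b = new Node("b");
    a.next = b;
    b.next = a; // cycle back to the first object

    RefTracker tracker;
    assert(tracker.write(a) ==
        "object(id=0, name=a, next=object(id=1, name=b, next=ref(0)))");
}
```

The unittest block can be run with "dmd -unittest -main".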

If you're curious about the internals I suggest you serialize some 
class/struct hierarchy and look at the XML data. It should be readable.

> Is it fast? I see in your custom deserialise example, you deserialise
> members by string name... does it need to FIND those in the stream by
> name, or does it just use that to validate the sequence?

How that is implemented is up to the archive. But the idea is that it 
should be able to find by name in the serialized data. That is kind of 
an implicit contract between the archive and the serializer.
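
A rough sketch of what "find by name" means for an archive. The names 
NamedArchive and MapArchive are hypothetical and this is not Orange's 
Archive interface; the point is only that values can be read back by 
key in any order, independent of the order they were written in.

```d
// Hypothetical interface, for illustration only.
interface NamedArchive
{
    void put (string key, string value);
    string get (string key);
}

// A trivial archive that stores values keyed by name.
class MapArchive : NamedArchive
{
    private string[string] data;

    void put (string key, string value) { data[key] = value; }

    // Looks the value up by name; the write order does not matter.
    string get (string key) { return data[key]; }
}

unittest
{
    NamedArchive archive = new MapArchive;
    archive.put("a", "1");
    archive.put("b", "2");

    // Read back in a different order than written.
    assert(archive.get("b") == "2");
    assert(archive.get("a") == "1");
}
```

An archive that only validated the sequence would instead require the 
reads to happen in exactly the order of the writes.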

> I have a serialiser that serialises in realtime (60fps), a good fair few
> megabytes of data per frame... will orange handle this?

Probably not. I think it mostly depends on the archive used. The XML 
module in Phobos is really, REALLY slow. Serializing the same data with 
Tango (D1) is at least twice as fast. I have started to work on an 
archive type that just tries to be as fast as possible. It:

* Breaks the implicit contract with the serializer
* Doesn't care about endianness
* Doesn't care if the fields have changed
* May not handle slices correctly
* And some other things

> Documentation, what attributes are available? How to use them?

https://dl.dropbox.com/u/18386187/orange_docs/Events.html
https://dl.dropbox.com/u/18386187/orange_docs/Serializable.html

Is this clear enough?

> You only seem to provide an XML backend. What about JSON? Binary (with
> endian awareness)?

Yeah, that is not implemented yet. Is it necessary before adding it to 
Phobos?

> Writing an Archiver looks a lot more involved than I would have
> imagined. XmlArchive.d is huge, mostly just 'ditto'.
> Should unarchiveXXX() not rather be unarchive!(XXX)(), allowing to
> minimise most of those function definitions?

Yeah, it has kind of a big API. The reason is to be able to use 
interfaces. Serializer contains a reference to an archive, typed as the 
interface Archive. If you're using custom serialization I don't think it 
would be good to lock yourself to a specific archive type.

BTW, unarchiveXXX is forwarded to a private unarchive!(XXX)() in XmlArchive.

With classes and interfaces:

class Serializer
interface Archive
class XmlArchive : Archive

Archive archive = new XmlArchive;
auto serializer = new Serializer(archive);

struct Foo
{
     void toData (Serializer serializer, Serializer.Data key);
}

With templates:

class Serializer (T)
class XmlArchive

auto archive = new XmlArchive;
auto serializer = new Serializer!(XmlArchive)(archive);

struct Foo
{
     void toData (Serializer!(XmlArchive) serializer, Serializer.Data key);
}

Foo is now locked to the XmlArchive. Or:

class Bar
{
     void toData (T) (Serializer!(T) serializer, Serializer.Data key);
}

toData is now a template method, and template methods cannot be virtual.
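
A small self-contained sketch of that limitation (Base, Derived, kind 
and describe are made-up names, unrelated to Orange): a template method 
in a derived class hides the one in the base class instead of 
overriding it, so a call through a base class reference is resolved 
statically.

```d
class Base
{
    // Ordinary method: virtual, can be overridden.
    string kind () { return "Base"; }

    // Template method: never virtual in D.
    string describe (T) (T value) { return "Base.describe"; }
}

class Derived : Base
{
    override string kind () { return "Derived"; }

    // Hides Base.describe rather than overriding it.
    string describe (T) (T value) { return "Derived.describe"; }
}

unittest
{
    Base obj = new Derived;

    assert(obj.kind() == "Derived");            // dynamic dispatch
    assert(obj.describe(1) == "Base.describe"); // resolved on the static type
}
```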

-- 
/Jacob Carlborg

