serialization library

Bill Baxter dnewsgroup at
Wed Nov 8 17:06:21 PST 2006

Christian Kamm wrote:
> Based on initial work from Tom S and clayasaurus, I've written this 
> serialization library. I hope something like this doesn't already exist!


> Currently, it only provides binary file io through the Serializer class. 
> It can
> - write/read almost (hopefully) every type through a call to 
> Serializer.describe
> - track class references and pointers by default
> - serialize classes and structs through a templated 'describe' member 
> function
> - write derived classes from base class reference*
> - read derived classes into base class reference*
> - serialize not default constructible classes*
> (* for this to work, the class needs to be registered with the archive 
> type)
> It has far fewer features than boost::serialization but is already in a 
> very usable state: FreeUniverse, a D game based on the Arc library, uses 
> it for writing and loading savegames as well as other persistent state 
> information.

I'm using boost::serialization, but I'm not at all happy with it.  The 
things I don't like mostly have to do with versioning, though, which it 
looks like you don't support anyway.

> What it does not do/is missing:
> - exception safety / multithread safety
> - out-of-class/struct serialization methods (is it possible to check 
> whether a specific overload exists at compile time?)

I could be mistaken, but I think this is ADL / Koenig lookup 
territory that Walter doesn't want to go into.

> - static arrays need to be serialized with describe_staticarray (static 
> arrays can't be inout, so the general-purpose template method doesn't 
> work... is there a way around the problem?)
> - things I forgot right now

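Endian issues?  If the binary format just uses the machine's native byte 
order, files won't be portable between little- and big-endian platforms.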
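
A quick illustration of why byte order matters, sketched in Python purely for brevity. The value and variable names are made up for the example, and this says nothing about what the library actually does:

```python
import struct

value = 0x12345678

# "<" forces little-endian, ">" forces big-endian; both use a fixed 4-byte width.
little = struct.pack("<I", value)
big = struct.pack(">I", value)

print(little.hex())  # 78563412
print(big.hex())     # 12345678

# Reading with the wrong byte order silently yields a different number.
wrong = struct.unpack("<I", big)[0]
print(hex(wrong))    # 0x78563412
```

Either picking one byte order for the format or recording it in the file header avoids the problem.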

> Documentation is still rather sparse. This short example shows the basic 
> usage

Just a wish list item, but I'd prefer an actual "file format" library as 
opposed to a serialization library.  Maybe a file format library would 
build on top of the serialization library, but anyway, the key 
difference is that a serialization lib aims to turn *particular* data 
structures into a binary format that can be losslessly loaded back into 
the same data structure later.

But that is not the way people design generic file formats, like say the 
Photoshop file format.  Things like that need to be very extensible and 
shouldn't be tied to particular data structures.  I think that's where 
boost::serialization gets into trouble.  Once you start talking about 
versioning, you're no longer talking about one specific data structure.

For instance, boost::serialization has no way to ignore or skip chunks 
of data that are unrecognized or obsolete.  You actually have to load 
the obsolete thing into the proper (possibly obsolete) data structure 
and then delete the unnecessary thing you just created.  This is bad 
for both forwards and backwards compatibility.  Old code simply cannot 
read a newer file (even if it understands the majority of the chunks 
that matter), and new code is forced to maintain old data structures 
just for the purpose of loading up obsolete data and throwing it away.

How do you fix it?  Very simple really.  Just store the file as a series 
of chunks with fixed-length headers, where each header contains the 
length of the data in that chunk.  If you get a chunk header with a tag 
you don't understand, just skip over it.  A particular chunk can have 
sub-chunks too.  I think it's similar in many ways to a grammar definition:

     file:
         header chunklist

     chunklist:
         chunk chunklist

     header:
         typeIndicator versionNumber DataEndianness

     chunk:
         chunkHeader data

     chunkHeader:
         chunkType DataLength

     data:
         // Here's where you list all the types of data known to you

Or something like that.
I'd like a library that helps me read and write my data in that sort of 
data-structure-independent format.
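
To make the idea concrete, here is a minimal sketch in Python (chosen only for brevity; nothing here is part of Christian's library or of boost::serialization). The 4-byte tag, the 4-byte little-endian length field, the function names `write_chunk`, `read_chunks`, `load`, and `load_player`, and the tags `PLYR`, `NAME`, `HPTS`, and `FUTR` are all assumptions made up for illustration. It leaves out the file header and only shows chunks, sub-chunks, and skipping of unknown tags:

```python
import io
import struct

# Hypothetical layout: 4-byte ASCII tag, then a 4-byte little-endian payload length.
CHUNK_HEADER = struct.Struct("<4sI")


def write_chunk(out, tag, payload):
    """Write one chunk: fixed-length header, then the payload bytes."""
    out.write(CHUNK_HEADER.pack(tag, len(payload)))
    out.write(payload)


def read_chunks(data):
    """Yield (tag, payload) for each chunk in a byte string."""
    stream = io.BytesIO(data)
    while True:
        header = stream.read(CHUNK_HEADER.size)
        if not header:
            return  # clean end of the chunk list
        if len(header) < CHUNK_HEADER.size:
            raise ValueError("truncated chunk header")
        tag, length = CHUNK_HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated chunk payload")
        yield tag, payload


def load(data, handlers):
    """Dispatch known chunks to handlers; skip anything unrecognized."""
    result = {}
    skipped = []
    for tag, payload in read_chunks(data):
        handler = handlers.get(tag)
        if handler is None:
            skipped.append(tag)  # unknown or obsolete chunk: ignore it
            continue
        result[tag] = handler(payload)
    return result, skipped


def load_player(payload):
    """A container chunk is just a payload that is itself a chunk list."""
    fields, _ = load(payload, {
        b"NAME": lambda p: p.decode("ascii"),
        b"HPTS": lambda p: struct.unpack("<I", p)[0],
    })
    return fields


# A "newer" writer emits a chunk type the "older" reader has never heard of.
inner = io.BytesIO()
write_chunk(inner, b"NAME", b"player one")
write_chunk(inner, b"HPTS", struct.pack("<I", 42))

out = io.BytesIO()
write_chunk(out, b"PLYR", inner.getvalue())  # container chunk with sub-chunks
write_chunk(out, b"FUTR", b"\x01\x02\x03")   # chunk type added in a later version

loaded, skipped = load(out.getvalue(), {b"PLYR": load_player})
print(loaded)   # {b'PLYR': {b'NAME': 'player one', b'HPTS': 42}}
print(skipped)  # [b'FUTR']
```

Because the length sits in the header, the reader never has to understand a payload in order to get past it. A reader working on a seekable file could seek over unknown payloads instead of reading them.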


More information about the Digitalmars-d-announce mailing list