Range interface for std.serialization

Dmitry Olshansky dmitry.olsh at gmail.com
Wed Aug 28 02:58:07 PDT 2013


28-Aug-2013 11:13, Jacob Carlborg wrote:
> On 2013-08-27 22:12, Dmitry Olshansky wrote:
>
>> I see...
>> That depends on the format and for these that have no keys or markers of
>> any kind versioning might help here. For instance JSON/BSON could handle
>> permutation of fields, but then it falls short of handling links, e.g.
>> pointers (maybe there is a trick to get it, but I can't think of any
>> right away).
>
> For pointers and reference types I'm currently serializing all fields with
> an id then when there's a pointer or reference I can just do this:
>
> <int name="foo" id="1">3</int>
> <pointer name="bar">1</pointer>

That would be tricky in JSON and carry quite some overhead (e.g. wrapping 
every value in an object just in case a pointer refers to it).

>> I suspect it would be best to somehow see archives by capabilities:
>> 1. Rigid (most binary) - in-order, depends on the order of fields, may
>> need to fit a scheme (in this cases D types implicitly define one)
>> Rigid archivers may also enjoy (per format in the future) a code
>> generator that given a scheme defines D types with a bit of CTFE+mixin.
>>
>> 2. Flexible - can survive reordering, is scheme-less, data defines
>> structure etc., handles versioning more easily; e.g. XML is one.
>
> Yes, that's a good idea. In the binary archiver I'm working on I'm
> cheating quite a bit and relax the requirements made by the serializer.

Yes, instead of cheating you can just define them as different kinds of 
archive. That would ease the friction and prevent some "impedance 
mismatch" problems.
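
Roughly what I have in mind, as a minimal sketch. It assumes each archive 
advertises its kind through an enum member; ArchiveKind, isArchive, 
strategyFor and both archive structs are made-up names for illustration, 
not anything that exists in Orange today:

```d
// Hypothetical names throughout; nothing here is Orange's actual API.
enum ArchiveKind { rigid, flexible }

// An archive is anything that advertises its kind at compile time.
enum isArchive(A) = is(typeof(A.kind) == ArchiveKind);

struct BinaryArchive   // rigid: the order of fields is the scheme
{
    enum kind = ArchiveKind.rigid;
}

struct XmlLikeArchive  // flexible: the data defines the structure
{
    enum kind = ArchiveKind.flexible;
}

// The serializer picks a strategy per kind, decided at compile time.
string strategyFor(A)() if (isArchive!A)
{
    static if (A.kind == ArchiveKind.flexible)
        return "lookup by key";
    else
        return "strict field order";
}

unittest
{
    assert(strategyFor!BinaryArchive() == "strict field order");
    assert(strategyFor!XmlLikeArchive() == "lookup by key");
}
```

This way the serializer never has to relax its requirements behind the 
archive's back: a rigid archive simply never gets asked for key lookup.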

>> This also neatly answers the question about scheme vs scheme-less
>> serialization. Protocol buffers/Thrift may be absorbed into Rigid
>> category if we can get the versioning right. Also solving versioning is
>> the last roadblock (after ranges) mentioned on the path to making this
>> an epic addition to Phobos.
>
> Versioning shouldn't be that hard, I think.

Then collect some info on how to approach this problem.
See e.g. Boost.Serialization, Protocol Buffers and Thrift.
The key point is that versioning means many things to many different people.

>> Was it DOM-ish too?
>
> Yes.

That nails it. DOM isn't quite serialization but rather a hierarchical 
DB. BTW SQLite and other DBs may be an interesting backend for 
serialization (though they wouldn't have lookup until deserialization).

>> Yeah, I see, but it's still a call to delegate that's hard to inline
>> (well LDC/GDC might). Would it be hard to do a compile-time check if
>> there are any events with the type in question at all and then call
>> triggerEvent(s)?
>
> No, I don't think so. I can also make the triggerEvents take the
> delegate by alias parameter, if that helps. Or inline it manually.

Great, anything to lessen the extra load.
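
The compile-time check could be as simple as the sketch below. I'm 
assuming a member function named onSerializing as the hook purely for 
illustration (Orange registers its events differently), and 
hasSerializingEvent/triggerSerializing are hypothetical names too:

```d
// Hypothetical hook name; only meant to show the static check.
enum hasSerializingEvent(T) = __traits(hasMember, T, "onSerializing");

void triggerSerializing(T)(ref T value)
{
    // Resolved at compile time: for types without the hook this
    // function body is empty, so there is no delegate call at all.
    static if (hasSerializingEvent!T)
        value.onSerializing();
}

struct Plain { int x; }

struct Tracked
{
    int x;
    int calls;
    void onSerializing() { ++calls; }
}

unittest
{
    Plain p;
    triggerSerializing(p);   // nothing to call for this type

    Tracked t;
    triggerSerializing(t);
    assert(t.calls == 1);
}
```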

>> While we are on the subject of delegates - you absolutely should use
>> 'scope delegate' as most (all?) delegates are never stored anywhere but
>> rather pass blocks of code to call deeper down the line.
>> (I guess it's somewhat Ruby-style, but it's not a problem).
>
> Good idea. The reasons for the delegates is to avoid begin/end
> functions. This also forces the use of the API correctly. Hmm, actually
> it may not. Since the Serializer technically is the user of the archiver
> API and that is already correctly implemented. The developer does need to
> implement the archiver API correctly, but there's nothing that stops
> him/her from _not_ calling the delegate. Am I over thinking this?

Seems so. After all, library implementors should be trusted not to do 
truly awful things.
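
For reference, the 'scope delegate' style looks something like this. 
TextArchive, archiveStruct and archiveInt are invented for the example 
and are not the real archiver API:

```d
// Hypothetical archiver; names and signatures are illustrative only.
struct TextArchive
{
    string output;

    // 'scope' promises the delegate is only called, never stored,
    // so the caller's closure does not need to live on the heap.
    void archiveStruct(string name, scope void delegate() fields)
    {
        output ~= name ~ "{";
        fields();            // the archiver still has to call this
        output ~= "}";
    }

    void archiveInt(string key, int value)
    {
        import std.conv : to;
        output ~= key ~ "=" ~ value.to!string ~ ";";
    }
}

unittest
{
    TextArchive a;
    int x = 3;
    // No begin/end pair to get wrong: nesting follows the block.
    a.archiveStruct("Foo", { a.archiveInt("x", x); });
    assert(a.output == "Foo{x=3;}");
}
```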

>
>> Aye, as any faithful Phobos dev absolutely :)
>> Seriously though ATM I just _suspect_ there is no need for Archive to be
>> an interface. I would need to think this bit through more deeply but
>> a virtual call per field alone makes me nervous here.
>
> Originally it was using templates. One of my design goals back then was
> to not have to use templates. Templates forces slightly more complicated
> API for the user:
>
> auto serializer = new Serializer!(XmlArchive);
>
> Which is fine, but I'm not very about the API for custom serialization:
>
> class Foo
> {
>      void toData (Archive) (Serializer!(Archive) serializer);
> }
>

Rather this:

void toData(Serializer)(Serializer serializer)
	if(isSerializer!Serializer)
{
	...
}

User code doesn't even need to know what the archiver looks like 
(wasn't that one of the goals of archivers?).
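
To spell it out a bit more, here is one way isSerializer could be 
written. This is only a sketch: I'm assuming a serializer is "anything 
with put(value, key)", and DebugSerializer is a toy stand-in. I also take 
the serializer by ref here because the toy one is a struct:

```d
// Hypothetical definition: a serializer is anything with put(value, key).
enum isSerializer(S) = is(typeof((ref S s) { s.put(0, "key"); }));

struct DebugSerializer   // toy stand-in, not a real serializer
{
    string log;

    void put(T)(T value, string key)
    {
        import std.conv : to;
        log ~= key ~ "=" ~ value.to!string ~ ";";
    }
}

class Foo
{
    int a = 1;
    string b = "two";

    // User code names no archive type at all.
    void toData(Serializer)(ref Serializer serializer)
        if (isSerializer!Serializer)
    {
        serializer.put(a, "a");
        serializer.put(b, "b");
    }
}

unittest
{
    static assert(isSerializer!DebugSerializer);
    static assert(!isSerializer!int);

    DebugSerializer s;
    auto foo = new Foo;
    foo.toData(s);
    assert(s.log == "a=1;b=two;");
}
```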

> The user is either forced to use templates here as well, or:
>
> class Foo
> {
>      void toData (Serializer!(XmlArchive) serializer);
> }

The main problem would be that it can't be overridden, as templates are final.

After all of this I think Archivers are just fine as templates: the user 
only ever interacts with them during creation. After that it's the 
serializer templates that pick up the right types.

Serializers themselves, on the other hand, are present in user code and 
may need one common polymorphic abstract class that provides 'put' and 
forwards it to a set of abstract methods. All polymorphic wrappers would 
inherit from it.

This won't prevent folks from using a templated version of 
toData/fromData if need be.
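
A rough sketch of that shape, with every name (AnySerializer, Wrapper, 
putInteger, putString, DebugSerializer) invented for the example and only 
two value categories handled to keep it short:

```d
// Hypothetical names; a sketch of the shape, not a proposed API.
abstract class AnySerializer
{
    // Templated front end: non-virtual, forwards to the virtual methods.
    final void put(T)(T value, string key)
    {
        static if (is(T : long))
            putInteger(value, key);
        else static if (is(T : const(char)[]))
            putString(value, key);
        else
            static assert(0, "unsupported type: " ~ T.stringof);
    }

    protected abstract void putInteger(long value, string key);
    protected abstract void putString(const(char)[] value, string key);
}

// Wraps any templated serializer behind the polymorphic base class.
final class Wrapper(S) : AnySerializer
{
    S impl;

    protected override void putInteger(long value, string key)
    {
        impl.put(value, key);
    }

    protected override void putString(const(char)[] value, string key)
    {
        impl.put(value, key);
    }
}

struct DebugSerializer   // toy stand-in for a templated serializer
{
    string log;

    void put(T)(T value, string key)
    {
        import std.conv : to;
        log ~= key ~ "=" ~ value.to!string ~ ";";
    }
}

class Foo
{
    int a = 1;

    // Not a template, so subclasses can override it.
    void toData(AnySerializer s) { s.put(a, "a"); }
}

unittest
{
    auto w = new Wrapper!DebugSerializer;
    auto foo = new Foo;
    foo.toData(w);
    w.put("hi", "b");
    assert(w.impl.log == "a=1;b=hi;");
}
```

The virtual calls are then paid only by code that asks for the 
polymorphic version.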

> ... use a single type of archive. It's also possible to pass in anything
> as Archive. Now we have template constraints, which didn't exist back
> then, make it a bit better.
>
> About the large API to implement for an Archive, this is the criteria I
> had when creating the API, in order of importance.
>
> 1. Should be easy for a consumer to use
> 2. Should be easy for an archive implementor
> 3. Should be easy to implement the serializer
>
> In this case, point 1 made it less easy for point 2. Point 2 made me
> push as much as possible to the serializer instead of having it in the
> archiver.
>

I'd suggest hiding the (Un)Archiver API from end users as much as 
possible; as such it would be more convenient for it to just stay 
templated, as it won't be seen.

> In the end, it's quite easy to copy-paste the API, do some search and
> replace and forward methods like these:
>
> void archiveEnum (bool value, string baseType, string key, Id id)
> void archiveEnum (char value, string baseType, string key, Id id)
> void archiveEnum (int value, string baseType, string key, Id id)
>
> ... to a private template method. That's what XmlArchive does:
>
> https://github.com/jacob-carlborg/orange/blob/master/orange/serialization/archives/XmlArchive.d#L439
>



-- 
Dmitry Olshansky

