[GSoC] Improved FlatBuffers and/or Protobuf Support ~ Binary Serialization
Dragos Carp
dragoscarp at gmail.com
Fri Mar 29 23:19:10 UTC 2019
Hi Ahmet,
welcome to the D forum.
As the author of protobuf-d, I'll try to give you some feedback on
the points you made. I couldn't find the time to also do a
FlatBuffers implementation, so my comments relate only to
protobuf. If you are interested in doing the FlatBuffers work, I'll
be more than happy to act as your mentor - I have some
ideas there. But let's get to the existing, real stuff.
On Friday, 29 March 2019 at 00:18:40 UTC, Ahmet Sait wrote:
>
> - It should be possible to parse schema and output mixable D code
>   at compile time
>     const schema = `message Person
>     {
>         required string name = 1;
>         required int32 id = 2;
>     }`;
>     mixin(fromProtoSchema(schema));
I don't think that it is worth the effort.
1. A complete implementation of .proto file parsing is
complicated
(https://developers.google.com/protocol-buffers/docs/reference/proto3-spec).
2. In theory, protobuf definitions do not change often, and
compile-time parsing is rather slow, so re-parsing them at every
compilation is a drawback rather than a benefit.
3. A protoc plugin is the way Protobuf recommends for processing
.proto definitions:
https://developers.google.com/protocol-buffers/docs/proto3#generating
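To give a sense of scale for point 1, here is a minimal sketch of
the compile-time approach. The helper fromToySchema is hypothetical
(it is not part of protobuf-d) and assumes a schema made only of
"message <Name>" and "required <type> <name> = <n>;" lines. It
ignores field numbers, imports, nesting, options, oneof, and so on,
which is where the real work would be.

```d
import std.algorithm.searching : startsWith;
import std.array : split;
import std.string : splitLines, strip;

// Hypothetical toy generator, for illustration only. It understands
// "message <Name>", "required <type> <name> = <n>;" and "}" lines,
// and silently skips everything else (including a lone "{").
string fromToySchema(string schema)
{
    string code;
    foreach (line; schema.splitLines)
    {
        auto l = line.strip;
        if (l.startsWith("message "))
        {
            code ~= "struct " ~ l["message ".length .. $].strip ~ "\n{\n";
        }
        else if (l.startsWith("required "))
        {
            // e.g. ["required", "string", "name", "=", "1;"]
            auto parts = l.split;
            // Only int32 is translated; other type names pass through.
            auto dType = parts[1] == "int32" ? "int" : parts[1];
            code ~= "    " ~ dType ~ " " ~ parts[2] ~ ";\n";
        }
        else if (l == "}")
        {
            code ~= "}\n";
        }
    }
    return code;
}

enum schema = `message Person
{
required string name = 1;
required int32 id = 2;
}`;

// Evaluated through CTFE on every compilation of this module.
mixin(fromToySchema(schema));

void main()
{
    auto p = Person("Walter", 42);
    assert(p.name == "Walter" && p.id == 42);
}
```

Even this toy drops the field numbers, so the generated struct could
not actually be serialized; a usable version needs a real parser.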
>
> - There should be no need for a schema definition, a custom type
>   annotated with UDAs should be enough
>     struct Person
>     {
>         @protoID(1) string name;
>         @protoID(2) int age;
>     }
>     serialize(Person("Walter", 42), stdout);
protobuf-d does that already, see the unittest for toProtobuf:
https://github.com/dcarp/protobuf-d/blob/3f8a1a5129c98920e1652e965004ac77e9bb8ef1/src/google/protobuf/encoding.d#L193
>
> - Simple things should be simple
>   It should be dead simple to do basic stuff:
>     auto obj = deserialize!SomeType(stdin);
>     serialize(obj, stdout);
Again, protobuf-d has that:
https://github.com/dcarp/protobuf-d/blob/3f8a1a5129c98920e1652e965004ac77e9bb8ef1/src/google/protobuf/decoding.d#L214
>
> - Complex things should be possible
>   The library should be flexible and extensible without modification
toProtobuf, fromProtobuf, toJSONValue, fromJSONValue methods are
protobuf customization points in protobuf-d. For an example see
https://github.com/dcarp/protobuf-d/blob/3f8a1a5129c98920e1652e965004ac77e9bb8ef1/src/google/protobuf/wrappers.d#L27-L54
>
> - Support for library and tool based usage
>   It should be usable as a library without any additional setup but
>   also usable as a schema compiler.
protobuf-d is usable as a library, see
https://github.com/huntlabs/grpc-dlang/blob/57c8fe9808f8e860c4b0668a83cdabd78b296ce5/dub.json#L9
Regarding the usage as a schema compiler, see my first comment above.
>
> - Support for common Phobos types
>   Nullable, tuples, std.datetime, std.complex, std.bigint, containers...
Protobuf is a language-agnostic serialization format. Having
Protobuf definitions for common Phobos types would just shift the
problem somewhere else (i.e. to other programming languages).
Nevertheless, Protobuf probably addresses the same problem by
defining the "well-known" types
(https://developers.google.com/protocol-buffers/docs/reference/google.protobuf).
protobuf-d also supports those, so that std.datetime.SysTime is
mapped to google.protobuf.Timestamp and std.datetime.Duration to
google.protobuf.Duration.
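To illustrate what that mapping amounts to: google.protobuf.Timestamp
carries whole seconds since the Unix epoch plus a nanosecond
fraction. The sketch below uses only Phobos; TimestampFields and
toTimestampFields are hypothetical names for illustration, not the
protobuf-d API, and it assumes times at or after 1970-01-01 UTC.

```d
import core.time : msecs;
import std.datetime.date : DateTime;
import std.datetime.systime : SysTime;
import std.datetime.timezone : UTC;

// Hypothetical stand-in for the two fields of google.protobuf.Timestamp.
struct TimestampFields
{
    long seconds; // whole seconds since 1970-01-01T00:00:00Z
    int nanos;    // fraction of a second, 0 .. 999_999_999
}

// Assumes t is at or after the Unix epoch; earlier times would need
// extra care to keep nanos non-negative.
TimestampFields toTimestampFields(SysTime t)
{
    // SysTime has hnsec (100 ns) resolution, so scale to nanoseconds.
    auto nanos = cast(int)(t.fracSecs.total!"hnsecs" * 100);
    return TimestampFields(t.toUnixTime(), nanos);
}

void main()
{
    auto t = SysTime(DateTime(1970, 1, 1, 0, 0, 1), msecs(500), UTC());
    auto f = toTimestampFields(t);
    assert(f.seconds == 1);
    assert(f.nanos == 500_000_000);
}
```

Any other language's Protobuf library can read the same two fields
back into its own native time type, which is the point of the
well-known types.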
>
> I'm personally not happy with any of the existing libraries but they
> will likely be a valuable resource regardless.
The existing protobuf libraries are quite mature, and improving
those will probably be time better spent than starting once
again from scratch.
>
> Questions:
> - How much work would be ideal for GSoC? Should I be working on
>   flatbuffers only or protobuf too? (Seems like flatbuffers need
>   more love)
I'm quite satisfied with the protobuf-d implementation: it is small
(approx. 4k LOC), clean and quite feature complete - 26 failing
conformance tests versus 27 and 41 for the official C++ and Java
counterparts, respectively. Of course there is still room for
improvement, but in the case of protobuf-d not enough for a
GSoC application.
On the other hand, FlatBuffers is a very good candidate: it has
its own peculiarities, but is also somewhat similar to protobuf.
This similarity would reduce the planning risks considerably.
> - Should I tackle the std.serialization [3] idea?
I see std.serialization as a high-level API. It would probably
live for a long time as std.experimental.serialization, since it
will take quite some time until multiple serialization formats
implement it. Only after that, if it ever happens, could we remove
the "experimental" part. I don't see this as a suitable GSoC project.
> - Any other serialization related suggestions?
https://arrow.apache.org/
Cheers, Dragos