std.data.json formal review

Mon Aug 17 11:47:52 PDT 2015

On 17-Aug-2015 21:12, Andrei Alexandrescu wrote:
> On 8/14/15 7:40 AM, Andrei Alexandrescu wrote:
>> On 8/12/15 5:43 AM, Sönke Ludwig wrote:
>>>> Anyway, I've just started to work on a generic variant of an enum based
>>>> algebraic type that exploits as much static type information as
>>>> possible. If that works out (compiler bugs?), it would be a great thing
>>>> to have in Phobos, so maybe it's worth delaying the JSON module for
>>>> that if necessary.
>>>>
>>>
>>> First proof of concept:
>>> https://gist.github.com/s-ludwig/7a8a60150f510239f071#file-taggedalgebraic-d-L148
>>>
>>> It probably still has issues with const/immutable and ref in some
>>> places, but the basics seem to work as expected.
>>
>> struct TaggedAlgebraic(U) if (is(U == union)) { ... }
>>
>> Interesting. I think it would be best to rename it to TaggedUnion
>> (instantly recognizable; also TaggedAlgebraic is an oxymoron as there's
>> no untagged algebraic type). A good place for it is straight in
>> std.variant.
>>
>> What are the relative advantages of using an integral over a pointer to
>> function? In other words, what's a side by side comparison of
>> TaggedAlgebraic!U and Algebraic!(types inside U)?
>>
>> Thanks,
>>
>> Andrei
>
> Ping on this. My working hypothesis:
>
> - If there's a way to make a tag smaller than one word, e.g. by using
> various packing tricks, then the integral tag has an advantage over the
> pointer tag.
>
> - If there's some ordering among types (e.g. all types below 16 have
> some property etc), then the integral tag again has an advantage over
> the pointer tag.
>
> - Other than that, the pointer tag is superior to the integral tag at
> everything. Where it really wins is that there is one unique tag for each
> type, present or future, so the universe of types representable is the
> total set. The pointer may be used for dispatching but also as a simple
> integral tag, so the pointer tag is a superset of the integral tag.
>
> I've noticed many people are surprised by std.variant's use of a pointer
> instead of an integral for tagging. I'd like to either figure out whether
> there's an advantage to integral tags or, if not, settle a
> misconception for good.
>

Actually, one can combine the two:
- use an integer type tag for everything built-in
- use a pointer tag for everything else

In code:
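// placeholder value, assumed here: big enough for the largest built-in stored
enum size_of_max_builtin = double.sizeof;

union NiftyTaggedUnion
{
	// a pointer must be at least 4-byte aligned, so its LSB is always 0
	// to be told apart, an int tag must have the LSB == 1
	// this assumes little-endian though, big-endian is doable too
	@property bool isIntTag() const { return (common.head & 1) != 0; }
	IntTagged intTagged;
	PtrTagged ptrTagged;
	CommonUnion common;
}
struct CommonUnion
{
	ubyte[size_of_max_builtin] store;
	// this is where the type tag starts - pointer or int
	uint head;
}

// a struct, not a union: the tag must follow the payload, not overlap it
struct IntTagged // int-tagged
{
	union{  // builtins go here
		int ival;
		double dval;
		// ....
	}
	uint tag;
}

// a struct for the same reason as IntTagged
struct PtrTagged // ptr to typeinfo scheme
{
	ubyte[size_of_max_builtin] payload;
	// TypeInfo is a class, so this is already a reference (no '*' needed)
	TypeInfo pinfo;
}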
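
A minimal usage sketch of the layout above, to show how the LSB check
tells the two kinds of tag apart. It assumes a little-endian target, that
IntTagged and PtrTagged are structs (so the tag sits after the payload),
and that size_of_max_builtin is 8. The tag values tagInt and tagDouble and
the helpers makeInt and makeBoxed are made up for illustration only.

```d
// assumed value: room for a double, the largest built-in used here
enum size_of_max_builtin = double.sizeof;

struct IntTagged
{
	union { int ival; double dval; }
	uint tag;                // odd values only, so the LSB is 1
}

struct PtrTagged
{
	ubyte[size_of_max_builtin] payload;
	TypeInfo pinfo;          // aligned class reference, so the LSB is 0
}

struct CommonUnion
{
	ubyte[size_of_max_builtin] store;
	uint head;               // low 32 bits of the tag word (little-endian)
}

union NiftyTaggedUnion
{
	IntTagged intTagged;
	PtrTagged ptrTagged;
	CommonUnion common;

	@property bool isIntTag() const { return (common.head & 1) != 0; }
}

// hypothetical tag values; any odd numbers would do
enum uint tagInt = 1;
enum uint tagDouble = 3;

NiftyTaggedUnion makeInt(int v)
{
	NiftyTaggedUnion n;
	n.intTagged.ival = v;
	n.intTagged.tag = tagInt;
	return n;
}

NiftyTaggedUnion makeBoxed(TypeInfo ti)
{
	NiftyTaggedUnion n;
	n.ptrTagged.pinfo = ti;  // payload left zeroed in this sketch
	return n;
}

void main()
{
	auto a = makeInt(42);
	auto b = makeBoxed(typeid(string));
	assert(a.isIntTag && a.intTagged.ival == 42);
	assert(!b.isIntTag && b.ptrTagged.pinfo is typeid(string));
}
```

The built-in path never touches TypeInfo at all; only the "everything
else" path pays for the pointer.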

It's going to be challenging, but I think I can even pull off NaN-boxing
with this scheme.
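
For readers who haven't met the term: NaN-boxing hides a small payload in
the mantissa bits of a quiet NaN, so a single 64-bit word is either an
ordinary double or a tagged value. Below is a generic sketch of that idea
only, not of how it would be wired into the union above. The prefix
boxPrefix and all function names are assumptions made for illustration.

```d
// Quiet-NaN bits (0x7FF8...) plus one extra mantissa bit used as a marker.
// Hypothetical choice: any pattern real NaNs don't produce would work.
enum ulong boxPrefix = 0x7FFC_0000_0000_0000;
enum ulong prefixMask = 0xFFFF_0000_0000_0000;

// Reinterpret the bits of a double as a 64-bit word and back.
ulong boxDouble(double d) { return *cast(ulong*)&d; }
double unboxDouble(ulong bits) { return *cast(double*)&bits; }

// Store a 32-bit int in the low bits of the marked NaN pattern.
ulong boxInt(int v) { return boxPrefix | cast(uint) v; }
bool isBoxedInt(ulong bits) { return (bits & prefixMask) == boxPrefix; }
int unboxInt(ulong bits) { return cast(int) cast(uint) bits; }

void main()
{
	auto i = boxInt(-7);
	auto d = boxDouble(3.5);
	assert(isBoxedInt(i) && unboxInt(i) == -7);
	assert(!isBoxedInt(d) && unboxDouble(d) == 3.5);
}
```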

-- 
Dmitry Olshansky