std.variant benchmark

Mon Jul 30 01:34:18 PDT 2012

On 30-Jul-12 06:01, Andrei Alexandrescu wrote:
>> In fact memcpy could and should be replaced with word by word copy for
>> almost all of struct sizes up to ~32 bytes (as size is known in advance
>> for this particular function pointer i.e. handler!int).
>
> In fact memcpy should be smart enough to do all that, but apparently it
> doesn't.
>

I'd say array ops could and should do this (since compiler has all the 
info at compile-time). On the other hand memcpy is just one tired C 
function aimed at fast blitting of any memory chunks.
(Even just   call/ret pair is too much of overhead in case of int).

>> I've found something far more evil:
> [snip]
>> So to boot in order to get to int + int _detected_ it needs 4 calls to
>> fptr(OpID.testConversion, null, &info) == 0 and acquire TypeInfo each
>> time. Not counting 2 gets to obtain result. It's just madness.
>
> Heh, this all gives me a new, humbling perspective on my earlier self
> running my mouth about other designs :o). As always, history and context
> makes one much more empathic.
>
> Variant has a very old design and implementation. It is, in fact, the
> very first design I've ever carried in D; I'd decided, if I can't do
> that, D isn't powerful enough. The compiler was so buggy back then,
> implementation needed to take many contortions, and it was a minor
> miracle the whole house of cards stood together.
>

These were indeed very rough days. I think it's a minor miracle I 
haven't gave up on D after seeing tons of bugs back in 2010.

> The basic approach was to use a pointer to function (instead of an
> integral) as a discriminator. The pointer to function was taken from the
> instantiation of a template function. That means Variant could work with
> any type, even types not yet defined. I'd made it a clear goal that I
> must not enumerate all types in one place, or "register" types for use
> with Variant - that would have been failure. Using a pointer to template
> function naturally associated a distinct tag with each type,
> automatically. I think it was a good idea then, and it's a good idea now.

It's indeed nice trick and design sounds great (no register function, yay!).
The only piece missing is that e.g. arithmetic operations are binary 
thus requiring double-dispatch or something like 
arithmeticHandler!(T1,T2) to archive optimal performance.

The virtual handcrafted C competition would do it via nested switches.

[snip]
>>> We can use the pointer to function as a tag for improving performance on
>>> primitive types.
>>>
>>
>> Indeed we can though it won't be as fast as simple tag - pointers
>> have quite big and sparse addresses giving us the same as:
>>
>> if(ptr == handler!int)
>> ....
>> else if(ptr == handler!uint)
>> ....
>> else
>> ....
>
> Yes, and what we gain is speed for select known types. As long as the
> number of those is relatively small, using simple test and branch should
> serve us well. And we get to keep the generality, too.
>

OK I'll try it for opArithmetic that should help a lot.

>> Another way to go is make 2-d function table for all built-ins and all
>> operations on them. It's in fact a bunch of tables:
>> Type1 x Type2 for all of important basic operations. Tables are static
>> so it won't be much of a problem. At least int + int then would be
>> 1 memory read, 1 call, 2 (casted as needed) reads & 1 store.
>
> I think that's not worth it, but hey, worth a try.
>

I'm thinking more of it and common types are just uint, int, long, 
ulong, double, string. In fact most of e.g. int x string  and such are 
"throw me an exception" entries, they could be merged.
Then we could make 2-stage table to save some space, it sounds like 
trie, hm....

-- 
Dmitry Olshansky