SIMD support...
Walter Bright
newshound2 at digitalmars.com
Sat Jan 7 16:14:27 PST 2012
On 1/7/2012 1:28 PM, Andrei Alexandrescu wrote:
>> Having a pluggable interface so the implementation can be changed is all
>> right, as long as the binary API does not change.
>> If the binary API changes, then of course, two different libraries
>> cannot be linked together. I strongly oppose any changes which would
>> lead to a balkanization of D libraries.
>
> In my opinion this statement is thoroughly wrong and backwards. I also think it
> reflects a misunderstanding of what my stance is. Allow me to clarify how I see
> the situation.
>
> Currently built-in hash table use generates special-cased calls to non-template
> functions implemented surreptitiously in druntime. The underlying theory, also
> sustained by the statement quoted above, is that we are interested in supporting
> linking together object files and libraries BUILT WITH DISTINCT MAJOR RELEASES
> OF DRUNTIME.
>
> There is zero interest for that. ZERO. No language even attempts to do so.
> Runtimes that are not compatible with their previous versions are common,
> frequent, and well understood as an issue.
We've agreed on this before; perhaps I misstated it here, but I am not talking
about changing druntime. I'm talking about someone providing their own hash
table implementation with a different binary API than the one in druntime,
such that code from their library cannot be linked with any other code that uses
the regular hashtable.
A different implementation of the hashtable would be fine, as long as it is
binary compatible. We did this when we switched from binary-tree collision
resolution to a linear one, and the switchover went without a hitch because it
did not even require recompiling existing code.
> In an ideal world, built-in hash tables should work in a very simple manner. The
> compiler lowers all special hashtable syntax - in a manner that's MINIMAL,
> SIMPLE, and CLEAR - into D code that resolves to use of object.di (not some
> random user-defined library!). From then on, druntime code takes over. It could
> choose to use templates, dynamic type info, whatever. It's NOT the concern of
> the compiler. The compiler has NO BUSINESS taking library code and hardwiring it
> in for no good reason.
That was already true of the hashtables - it's just that the interface to them
was through a set of fixed function calls, rather than a template interface. To
the compiler, the hashtables were a completely opaque void*. The compiler had
zero knowledge of how they actually were implemented inside the runtime.
Changing it to a template implementation enables a more efficient interface, as
inlining, etc., can be done instead of going through the slow opApply()
interface. The downside is that it becomes a bit perilous, as the binary API is
no longer so flexible.
>> (Consider the disaster C++ has had forever with everyone inventing their
>> own string type. That insured zero interoperability between C++
>> libraries, a situation that persists even for 10 years after C++ finally
>> acquired a standard string library.)
>
> It is exactly this kind of canned statement and prejudice that we must avoid. It
> unfairly singles out C++ when there also exist incompatible libraries in C,
> Java, Python, you name it.
Of course, but strings are a fundamental data type, and so it was worse with
C++. I don't agree that my opinion on it is prejudicial or unfair, because many
times I was stuck gluing together disparate code that used differing string
classes. Often, it was the only incompatibility, but it permeated the library
interfaces.
> Also, the last time the claim that everywhere invented their own string type
> could have been credibly aired was around 2004.
Sure, people rarely (never?) write their own C++ string classes anymore, but
that old code and those old libraries are still around, and are actively
maintained.
http://msdn.microsoft.com/en-us/library/ms174288.aspx
Notice that's for Visual Studio C++ 2010.
The string problem was a mistake I was determined not to make with D.
I have agreed with you and still agree with the notion of using lowering instead
of custom code. Also, keep in mind that the hashtable design was done long
before D even had templates. It was "lowered" to what D had at the time -
function calls and opApply.
> What's built inside the compiler is like axioms in math, and what's library is
> like theorems supported by the axioms. A good language, just like a good
> mathematical system, has few axioms and many theorems. That means the system is
> coherent and expressive. Hardwiring stuff in the language definition is almost
> always a failure of the expressive power of the language.
True.
> Sometimes it's fine to
> just admit it and hardwire inside the compiler e.g. the prior knowledge that "+"
> on int does modulo addition.
Right, I understand that the abstraction abilities of D are not good enough to
produce a credible 'int' type, or 'float', etc., hence they are wired in.
> But most always it's NOT, and definitely not in the
> context of a complex data structure like a hash table. I also think that adding
> a hecatomb of built-in types and functions has smells, though to a good extent I
> concede to the necessity of it.
I want to reiterate that I don't think there is a way with the current compiler
technology to make a library SIMD type that will perform as well as a builtin
one, and those who use SIMD tend to be extremely demanding of performance.
(One could make a semantic equivalent, but not a performance equivalent.)
> We should start from what the user wants to accomplish. Then figure how to
> express that within the language. And only lastly, when needed, change the
> language to mandate lowering constructs to the MINIMUM EXTENT POSSIBLE into
> constructs that can be handled within the existing language. This approach has
> been immensely successful virtually whenever we applied it: foreach for ranges
> (though there's work left to do there), operator overloading, and too little
> with hashes. Lately I see a sort of getting lazy and skipping the second pass
> entirely. Need something? Yeah, what the hell, we'll put it in the language.
I don't think that is entirely fair in regards to the SIMD stuff. It reminds me
of the time after I'd spent a couple of years at Caltech, where every class was
essentially a math class. My sister asked me for help with her high school trig
homework, and I just glanced at it and wrote down all the answers. She said she
was supposed to show the steps involved, but I was so used to doing it that, to
me, there was only one step.
So while it may seem I'm skipping steps with the SIMD, I have been thinking
about it for years off and on, and I have a fair experience with what needs to
be done to generate good code.
>
> I am a bit worried about the increasing radicalization of the discussion here,
> but recent statements come in frontal collision with my core principles, which I
> think stand on solid evidential ground. I am appealing for building consensus
> and staying principled instead of reaching for the cheap solution. If we do the
> latter, it's quite likely we'll regret it later.
>
>
> Andrei
More information about the Digitalmars-d mailing list