Small Vectors Proposal
Chad J
gamerChad at _spamIsBad_gmail.com
Thu Feb 1 16:27:52 PST 2007
janderson wrote:
> Mikola Lysenko wrote:
>
>> Benji Smith wrote:
>>
>>> Right now, I'm working on a statistical prediction algorithm that
>>> uses *sparse* 300,000-dimensional feature vectors. There's no reason
>>> to limit SSE optimizations to small-dimensionality vectors.
>>>
>>
>> I can understand your desire to create a universal solution, and for a
>> long time I too was convinced this was the way to go. However I have
>> come to understand that there are many practical differences between
>> low and high dimension vectors. Think about it this way: would you
>> try to use a sparse matrix/vector library for doing operations on 4-d
>> vectors and matrices? It should be clear that these are two different
>> very different types of problems.
>>
>> If we tried to use pass-by-value semantics and array operations on
>> gigantic vectors the result would be equally bad. Obviously passing a
>> 300k+ element array around by value is going to destroy your
>> performance. The classic solution is to create a series of custom
>> routines for whichever problem you are trying to solve, or perhaps use
>> a robust optimized library like BLAS for the heavy lifting.
>>
>> High dimension linear algebra is a tricky subject, and it can take
>> years of research to get algorithms with competitive performance. On
>> the other hand, Low dimension vectors require relatively trivial
>> compiler support, and are used far more often. To get a rough
>> estimate of their relative popularity, I invoke Google CodeSearch:
>>
>> vector math -sparse : 282k hits
>> http://www.google.com/codesearch?q=vector+math+-sparse&hl=en&btnG=Search+Code
>>
>>
>> vector sparse : 31k hits
>> http://www.google.com/codesearch?hl=en&lr=&q=vector+sparse&btnG=Search
>>
>> From this informal statistic, we could infer that the low dimensional
>> vectors are used at least an order of magnitude more frequently than
>> their higher order counterparts.
>>
>> I do not mean to say that fast sparse vector/matrix operations are
>> unimportant, after all a fast matrix solver can sometimes shave weeks
>> off your processing time. In such a situation, it is permissible for
>> a library to be slightly cumbersome - just as long as it gets the job
>> done fast.
>>
>> Compare this to small vectors, where many applications place
>> convenience and readability above performance. Integrating small
>> vectors into the core language will not only improve their usability,
>> but it should also provide substantial performance gains.
>
>
> I don't think its critical that high order matrix multiplications get
> the same performance gains as smaller arrays that use the SIMD. Its
> just important that its consistent.
>
> Putting vector operation into an array gives us more power because we
> get all the standard array operations as well (I concede that these
> operations could be put into the base arrays as well).
>
> I hope that eventually that dynamic arrays would get the SIMD support as
> well. That way you could for instance load an entire list of vertexes
> (or whatever) and apply an operation on all of them. D could even do
> this in parallel. (I concede here that you could probably do that with
> an array of these base types. Also vectors with 4 parts can be treated
> somewhat differently in that the 4th component is sometimes used for
> other reasons).
>
> I generally think that representing vector types as arrays makes the
> language more powerful, then having individual special case types.
>
> -Joel
I think we should have array operations for dynamic arrays
(n-dimensional vectors) as well, but distinct from lo-D static arrays.
The reason is the same as others state: hi-D arrays can spend some fixed
overhead setting things up before jumping into a gigantic loop, whereas
lo-D arrays get optimized much differently with no fixed overhead and
possibly higher running overhead or less generality. They are two
different beasts.
Since hi-D arrays can deal with some overhead at startup, they are ideal
candidates for a library implementation of array operations. There are
only one or two features D would need to get there and have this
seamless library implementation (maybe they are done already?? I
forget), and that may be an easier and more general route than trying to
make compiler writers (Walter) write compiler support for both lo-D and
hi-D optimization cases.
More information about the Digitalmars-d
mailing list