Structure of Arrays vs Array of Structures
Stanislav Blinov via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Mon May 15 08:12:40 PDT 2017
On Monday, 15 May 2017 at 06:50:04 UTC, Nicholas Wilson wrote:
> On Monday, 15 May 2017 at 06:44:53 UTC, Nordlöw wrote:
>> Has anybody done this already?
>
> Yes, https://maikklein.github.io/post/soa-d/
The code in that article is overly simplified. Concrete use cases
would require more than just storing one POD type. Looking back
at Jonathan Blow's example, what if you wanted to store this type
in a SoA?
>struct Entity {
> Vector3 position;
> Quaternion orientation;
>}
This, obviously, gets more involved, as now the SoA code needs to
flatten that whole type recursively, otherwise it'd just store
vectors packed together and then quaternions packed together, i.e.:
[vx0, vy0, vz0, vx1, vy1, vz1...]
[qx0, qy0, qz0, qw0, qx1, qy1, qz1, qw1...]
whereas it should store
[vx0, vx1...][vy0, vy1...][vz0, vz1...]
[qx0, qx1...][qy0, qy1...][qz0, qz1...][qw0, qw1...]
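To make the difference concrete, here is a minimal sketch of recursive flattening in D. It is only an illustration, not the code from the article or from my experiment: the definitions of Vector3 and Quaternion are assumed (all-float members), and leafCount, appendLeaves and toColumns are hypothetical helper names.

```d
import std.traits : Fields;

// Assumed definitions; only Entity itself comes from the example above.
struct Vector3 { float x, y, z; }
struct Quaternion { float x, y, z, w; }
struct Entity { Vector3 position; Quaternion orientation; }

// Number of scalar leaves in T, counted recursively (usable at compile time).
size_t leafCount(T)()
{
    static if (is(T == struct))
    {
        size_t n = 0;
        foreach (F; Fields!T)          // unrolled over the field types
            n += leafCount!F();
        return n;
    }
    else
        return 1;
}

// Appends every scalar leaf of `value` to its own column, depth-first.
// Hypothetical helper; assumes all leaves are floats.
void appendLeaves(T)(float[][] columns, ref size_t col, T value)
{
    static if (is(T == struct))
    {
        foreach (member; value.tupleof) // unrolled over the fields
            appendLeaves(columns, col, member);
    }
    else
    {
        columns[col] ~= value;
        ++col;
    }
}

// Builds one column per leaf: [px...][py...][pz...][qx...][qy...][qz...][qw...]
float[][] toColumns(Entity[] entities)
{
    auto columns = new float[][](leafCount!Entity());
    foreach (e; entities)
    {
        size_t col = 0;
        appendLeaves(columns, col, e);
    }
    return columns;
}

void main()
{
    auto columns = toColumns([
        Entity(Vector3(1, 2, 3), Quaternion(0, 0, 0, 1)),
        Entity(Vector3(4, 5, 6), Quaternion(0, 1, 0, 0)),
    ]);
    assert(columns.length == 7);
    assert(columns[0] == [1.0f, 4.0f]); // all position.x values together
    assert(columns[6] == [1.0f, 0.0f]); // all orientation.w values together
}
```

A real implementation would also have to cope with leaves of different types, which a plain float[][] cannot hold.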
Then, if you need to occasionally reassemble PODs from SoA, you
need to deal with member alignment. And then there's the matter
of providing actual functions that operate on types stored in
this scattered form.
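As a rough sketch of the reassembly step, the following rebuilds one Entity from per-leaf columns. Again, this is an assumption-laden illustration: Vector3 and Quaternion are assumed to be all-float structs, and gather is a hypothetical name. It assigns through the struct's own fields instead of copying raw bytes, so the compiler takes care of padding and alignment.

```d
// Assumed definitions; only Entity itself comes from the example above.
struct Vector3 { float x, y, z; }
struct Quaternion { float x, y, z, w; }
struct Entity { Vector3 position; Quaternion orientation; }

// Rebuilds one T from element `index` of consecutive columns,
// starting at column `col` and advancing it past the leaves consumed.
// Hypothetical helper; assumes all leaves are floats.
T gather(T)(const float[][] columns, ref size_t col, size_t index)
{
    static if (is(T == struct))
    {
        T result;
        foreach (ref member; result.tupleof) // unrolled over the fields
            member = gather!(typeof(member))(columns, col, index);
        return result;
    }
    else
        return columns[col++][index];
}

void main()
{
    float[][] columns = [
        [1.0f, 4.0f], [2.0f, 5.0f], [3.0f, 6.0f],               // position x, y, z
        [0.0f, 0.0f], [0.0f, 1.0f], [0.0f, 0.0f], [1.0f, 0.0f], // orientation x, y, z, w
    ];
    size_t col = 0;
    auto e = gather!Entity(columns, col, 1);
    assert(e.position == Vector3(4, 5, 6));
    assert(e.orientation == Quaternion(0, 1, 0, 0));
    assert(col == 7); // every column was visited once
}
```

Note that each reassembled Entity touches seven different arrays, which is exactly the access pattern that makes this operation expensive.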
Also, there is a... debatable assessment of viability there:
>But sometimes you still want to access all components of your
>data. An example would be a vector. [...]
>Most operations will use all components anyways like add,
>subtract, dot, length and many more. And even if you sometimes
>end up with
>
>struct Vec3{
> float x;
> float y;
> float z;
>}
>
>Array!Vec3 positions;
>
>positions[].filter!(v => v.x < 10.0f);
>
>and you want to filter all vectors where the x component is less
>than 10.0f, you will still only load two additional floats.
"You will still *only* load two additional floats": that is
incorrect. The CPU (at least, x86/64) won't load "only" two
floats. It first has to fetch the whole cache line, only to
touch every third float in it. That's about two thirds of the
fetched data (and of the memory bandwidth) wasted. (Imagine how
severe that is if it actually has to fetch from RAM. Scratch
that, don't imagine, time it.) And if you not only read, but
also write data, it gets that much more wasteful. That is
precisely the problem SoA is intended to solve: everything it
fetches is data it actually needs. But of course, if you do need
to reconstruct tightly packed PODs from SoA, you'll pay a pretty
severe penalty. So, once you commit to storing data this way, you
have to organize the code accordingly, otherwise you're more
likely to degrade performance than to improve it.
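For anyone who wants to time it, here is a minimal sketch comparing the same x-only filter over both layouts. The names Vec3SoA, countBelowAoS and countBelowSoA are made up for this example, the element count is arbitrary, and a single-shot StopWatch measurement like this is crude; treat it as a starting point, not as a proper benchmark.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

struct Vec3 { float x, y, z; }        // AoS element: 12 bytes, only 4 are x
struct Vec3SoA { float[] x, y, z; }   // SoA: one array per component (hypothetical)

// Counts elements with x below `limit`; drags y and z through the cache too.
size_t countBelowAoS(const(Vec3)[] positions, float limit)
{
    size_t n = 0;
    foreach (v; positions)
        if (v.x < limit) ++n;
    return n;
}

// Same count, but only the x array is ever fetched.
size_t countBelowSoA(const(float)[] xs, float limit)
{
    size_t n = 0;
    foreach (x; xs)
        if (x < limit) ++n;
    return n;
}

void main()
{
    enum n = 1 << 22; // arbitrary size, large enough to spill out of cache
    auto aos = new Vec3[](n);
    Vec3SoA soa;
    soa.x = new float[](n);
    soa.y = new float[](n);
    soa.z = new float[](n);
    foreach (i; 0 .. n)
    {
        immutable f = cast(float)(i % 20);
        aos[i] = Vec3(f, 0, 0);
        soa.x[i] = f;
        soa.y[i] = 0;
        soa.z[i] = 0;
    }

    auto sw = StopWatch(AutoStart.yes);
    immutable a = countBelowAoS(aos, 10.0f);
    immutable tAoS = sw.peek;
    sw.reset();
    immutable b = countBelowSoA(soa.x, 10.0f);
    immutable tSoA = sw.peek;

    assert(a == b); // both layouts must agree on the result
    writefln("AoS: %s matches in %s", a, tAoS);
    writefln("SoA: %s matches in %s", b, tSoA);
}
```

Build with optimizations enabled before drawing any conclusions, and repeat the measurement several times; the numbers will depend heavily on the compiler, the CPU and its cache sizes.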
I do have an ongoing code experiment to see how far D could take
us with it, but at this point it's rather immature. Perhaps in a
couple of weeks I'll get it to a state that I can publish, if
you guys would like to pitch in.