Mir vs. Numpy: Reworked!
9il
ilyayaroshenko at gmail.com
Sat Dec 5 08:08:21 UTC 2020
On Friday, 4 December 2020 at 20:26:17 UTC, data pulverizer wrote:
> On Friday, 4 December 2020 at 14:48:32 UTC, jmh530 wrote:
>>
>> It looks like all the `sweep_XXX` functions are only defined
>> for contiguous slices, as that would be the default if you
>> define a Slice!(T, N).
>>
>> How the functions access the data is a big difference. If you
>> compare the `sweep_field` version with the `sweep_naive`
>> version, the `sweep_field` function is able to access through
>> one index, whereas the `sweep_naive` function has to use two
>> in the 2d version and 3 in the 3d version.
>>
>> Also, the main difference in the NDSlice version is that it
>> uses *built-in* MIR functionality, like how `sweep_ndslice`
>> uses the `each` function from MIR, whereas `sweep_field` uses
>> a for loop. I think this is partially to show that the
>> built-in MIR functionality is as fast as if you tried to do it
>> with a for loop yourself.
>
> I see, looking at some of the code, the field case is literally
> doing the indexing calculation right there. I guess ndslice is
> doing the same thing, just with "Mir magic" in the background?
sweep_ndslice uses (2*N - 1) arrays to index U, which allows LDC
to unroll the loop.
More details here:
https://forum.dlang.org/post/qejwviqovawnuniuagtd@forum.dlang.org
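A minimal sketch of one way to read the "(2*N - 1) arrays" remark, written in C so that it does not depend on any Mir API. It assumes a hypothetical 4-neighbour averaging stencil on a row-major 2D grid (N = 2), which may differ from the kernel in the actual benchmark. The function names are made up for illustration. With N = 2, the inner loop works on 2*2 - 1 = 3 one-dimensional row views, so each access is a plain 1D index that a compiler can unroll easily.

```c
#include <stddef.h>

/* Hypothetical kernel, not the benchmark's code: replace every interior
   cell by the average of its four neighbours, in place. */

/* Two-index form: the offset i * cols + j is spelled out for each access. */
void sweep_two_index(double *u, size_t rows, size_t cols)
{
    for (size_t i = 1; i + 1 < rows; i++)
        for (size_t j = 1; j + 1 < cols; j++)
            u[i * cols + j] = 0.25 * (u[(i - 1) * cols + j]
                                    + u[(i + 1) * cols + j]
                                    + u[i * cols + j - 1]
                                    + u[i * cols + j + 1]);
}

/* Row-view form: for N = 2 the inner loop sees 2*N - 1 = 3 rows
   (above, current, below), each indexed with the single index j. */
void sweep_row_views(double *u, size_t rows, size_t cols)
{
    for (size_t i = 1; i + 1 < rows; i++) {
        const double *up   = u + (i - 1) * cols;
        double       *mid  = u + i * cols;
        const double *down = u + (i + 1) * cols;
        for (size_t j = 1; j + 1 < cols; j++)
            mid[j] = 0.25 * (up[j] + down[j] + mid[j - 1] + mid[j + 1]);
    }
}
```

Both functions visit the cells in the same order and add the neighbours in the same order, so they produce the same grid. Only the way the addresses are formed differs.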
> I'm still not sure why slice is so slow. Doesn't that
> completely rely on the opSlice implementations? The choice of
> indexing method and underlying data structure?
sweep_slice is slower because it iterates over the data in several
loops rather than in a single one. For small matrices this makes the
JMP/FLOP ratio higher; for large matrices that can't fit into
the CPU cache, it is less memory efficient.
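To illustrate the difference between several loops and a single one, here is a rough sketch in C. It is not the benchmark code. It assumes a hypothetical operation that averages four equally sized arrays, and the function names are invented for the example. The multi-pass version walks the output five times, so it pays the loop-branch overhead five times per element and re-reads `out` from memory when the arrays do not fit in cache. The single-pass version touches each element once.

```c
#include <stddef.h>

/* Several loops: each statement is its own pass over the whole array,
   similar in spirit to evaluating a slice expression one term at a time. */
void average_multi_pass(double *out, const double *a, const double *b,
                        const double *c, const double *d, size_t n)
{
    for (size_t k = 0; k < n; k++) out[k] = a[k];
    for (size_t k = 0; k < n; k++) out[k] += b[k];
    for (size_t k = 0; k < n; k++) out[k] += c[k];
    for (size_t k = 0; k < n; k++) out[k] += d[k];
    for (size_t k = 0; k < n; k++) out[k] *= 0.25;
}

/* Single loop: one pass, one branch per element, each input read once
   and each output written once. */
void average_single_pass(double *out, const double *a, const double *b,
                         const double *c, const double *d, size_t n)
{
    for (size_t k = 0; k < n; k++)
        out[k] = 0.25 * (a[k] + b[k] + c[k] + d[k]);
}
```

The two functions compute the same values. The difference is only in the number of branches executed and how often the data travels through the cache.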