Why 16MiB static array size limit?

Tue Aug 16 11:46:06 PDT 2016

On 08/16/2016 10:51 AM, Johan Engelen wrote:
 > On Tuesday, 16 August 2016 at 01:28:05 UTC, Ali Çehreli wrote:
 >>
 >> With ldc2, the best option is to go with a dynamic array ONLY IF you
 >> access the elements through the .ptr property. As seen in the last
 >> result, using the [] operator on the array is about 4 times slower
 >> than that.
 >
 > As Yuxuan Shui mentioned the difference is in vectorization. The
 > non-POINTER version is not vectorized because the semantics of the code
 > is not the same as the POINTER version. Indexing `arr`, and writing to
 > that address could change `arr.ptr`, and so the loop would do something
 > different when "caching" `arr.ptr` in `p` (POINTER version) versus the
 > case without caching (non-POINTER version).
 >
 > Evil code demonstrating the problem:
 > ```
 > ubyte evil;
 > ubyte[] arr;
 >
 > void doEvil() {
 >     // TODO: use this in the obfuscated-D contest
 >     arr = (&evil)[0..50];
 > }
 > ```
 >
 > The compiler somehow has to prove that `arr[i]` will never point to
 > `arr.ptr` (it's called Alias Analysis in LLVM).
 >
 > Perhaps it is UB in D to have `arr[i]` ever point into `arr` itself, I
 > don't know. If so, the code is vectorizable and we can try to make it so.
 >
 > -Johan

Thank you all. That makes sense. I agree that the POINTER version is
applicable only in some cases. Looking only at the non-POINTER cases,
a static array is faster with ldc2, which makes the "arbitrary" 16MiB
limit a performance issue. With ldc2, the static array version takes
about 40% less time (0.472s versus 0.792s):

6) ldc2 deneme.d -ofdeneme -O5 -release -boundscheck=off -d-version=STATIC

    0.472s

8) ldc2 deneme.d -ofdeneme -O5 -release -boundscheck=off

    0.792s

It's the opposite for dmd, where the static array version is about 10%
slower:

2) dmd deneme.d -ofdeneme -O -boundscheck=off -inline -version=STATIC

    4.238s

4) dmd deneme.d -ofdeneme -O -boundscheck=off -inline

    3.845s
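
For anyone following along without deneme.d, here is a minimal sketch of
the kind of loops being compared. It is NOT the original benchmark: the
names fillIndexed and fillViaPointer and the size N are made up for
illustration, and only the STATIC version identifier comes from the
command lines above. In the non-POINTER form every store goes through
`arr`, so for a dynamic array the compiler has to assume that a ubyte
store might overwrite the array's own ptr field, as Johan describes. The
POINTER form reads arr.ptr once into a local. A static array has no ptr
field to overwrite, which is presumably why it does well in the
non-POINTER case.

```d
// Hypothetical reconstruction for illustration; not the original deneme.d.
import std.stdio : writeln;

enum size_t N = 1024 * 1024;   // 1 MiB, well below the 16MiB limit

version (STATIC)
{
    ubyte[N] arr;              // static array: storage at a fixed location
}
else
{
    ubyte[] arr;               // dynamic array: module-level (length, ptr) pair
}

// non-POINTER form: every iteration indexes through `arr`.
void fillIndexed(ubyte value)
{
    foreach (i; 0 .. arr.length)
        arr[i] = value;
}

// POINTER form: arr.ptr and arr.length are read once and cached in locals.
void fillViaPointer(ubyte value)
{
    ubyte* p = arr.ptr;
    const n = arr.length;
    foreach (i; 0 .. n)
        p[i] = value;
}

void main()
{
    // Only the dynamic array needs an allocation.
    version (STATIC) {}
    else
    {
        arr = new ubyte[N];
    }

    fillIndexed(1);
    assert(arr[0] == 1 && arr[N - 1] == 1);

    fillViaPointer(2);
    assert(arr[0] == 2 && arr[N - 1] == 2);

    writeln("both loops filled ", arr.length, " bytes");
}
```

Build it with and without -version=STATIC (dmd) or -d-version=STATIC
(ldc2) to switch between the two array kinds. Note that -release removes
the asserts, so leave that flag off when checking correctness.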

Ali