Requesting Help with Optimizing Code

Thu Apr 8 01:24:23 UTC 2021

Hello all. I have been working on learning computational 
photography and have been using D to do that. I recently got some 
code running that performs [chromatic 
adaptation](https://en.wikipedia.org/wiki/Chromatic_adaptation) 
(white balancing). The output is still not ideal (the image is 
overexposed), but it does correct color casts. The issue I have is 
with performance. With a few optimizations found through profiling, 
I have been able to drop processing time from ~10.8 to ~6.2 seconds 
for a 16-megapixel test image. That still feels too long, however; 
image editing programs are usually much faster.

The optimizations that I've implemented:
* Remove `immutable` from constants. The type mismatch between 
constants (`immutable(double)`) and pixel values (`double`) 
caused time-consuming checks for compatible types in mir 
operations and triggered run-time type conversions and memory 
allocations (sorry if I butchered this description).
* Use `mir.math.common.pow` in place of `std.math.pow`.
* Use `@optmath` for linearization functions 
(https://github.com/kyleingraham/photog/blob/up-chromadapt-perf/source/photog/color.d#L192 and https://github.com/kyleingraham/photog/blob/up-chromadapt-perf/source/photog/color.d#L318).
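
To make the list above concrete, here is a minimal sketch of what the last two items look like together. It is not the code from photog: `linearize` is a hypothetical name, and the constants are the standard sRGB decoding values, which I am only assuming match what the library uses. The `static assert` just shows that `immutable(double)` and `double` really are distinct types as far as templates are concerned; it does not prove that this was the cause of the slowdown.

```d
/+ dub.sdl:
    name "linearize-sketch"
    dependency "mir-core" version="~>1.1"
+/
import mir.math.common : optmath, pow;
import std.stdio : writefln;

// immutable(double) and double are different types, so generic code
// instantiated with both has to reconcile two element types.
static assert(!is(immutable(double) == double));

// Manifest constants typed as plain double (no immutable qualifier).
// Assumption: standard sRGB decoding values, not taken from photog.
enum double threshold = 0.04045;
enum double linearSlope = 12.92;
enum double offset = 0.055;
enum double gamma = 2.4;

/// Hypothetical sRGB-to-linear conversion for one channel value in [0, 1].
/// @optmath lets LDC apply fast-math style optimizations to this function,
/// and pow here is mir.math.common.pow rather than std.math.pow.
@optmath double linearize(double c)
{
    return c <= threshold
        ? c / linearSlope
        : pow((c + offset) / (1.0 + offset), gamma);
}

void main()
{
    // Print a few sample values across the input range.
    foreach (c; [0.0, 0.02, 0.5, 1.0])
        writefln("%.2f -> %.6f", c, linearize(c));
}
```

Saved as a single file, this should run with something like `dub run --single linearize.d --compiler=ldc2`. Under DMD, `@optmath` is accepted but has no optimization effect.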

Is there anything else I can do to improve performance?

I tested the code under the following conditions:
* Compiled with `dub build --build=release --compiler=ldmd2`
* dub v1.23.0, ldc v1.24.0
* Intel Xeon W-2170B 2.5GHz (4.3GHz turbo)
* [Test 
image](https://user-images.githubusercontent.com/25495787/113943277-52054180-97d0-11eb-82be-934cf3d22112.jpg)
* Test code:
```d
#!/usr/bin/env dub
/+ dub.sdl:
     name "photog-test"
     dependency "photog" version="~>0.1.1-alpha"
     dependency "jpeg-turbod" version="~>0.2.0"
+/

import std.datetime.stopwatch : AutoStart, StopWatch;
import std.file : read, write;
import std.stdio : writeln, writefln;

import jpeg_turbod;
import mir.ndslice : reshape, sliced;

import photog.color : chromAdapt, Illuminant, rgb2Xyz;
import photog.utils : imageMean, toFloating, toUnsigned;

void main()
{
     const auto jpegFile = "image-in.jpg";
     auto jpegInput = cast(ubyte[]) jpegFile.read;

     auto dc = new Decompressor();
     ubyte[] pixels;
     int width, height;
     bool decompressed = dc.decompress(jpegInput, pixels, width, height);

     if (!decompressed)
     {
         dc.errorInfo.writeln;
         return;
     }

     auto image = pixels.sliced(height, width, 3).toFloating;

     int err;
     double[] srcIlluminant = image
         .imageMean
         .reshape([1, 1, 3], err)
         .rgb2Xyz
         .field;
     assert(err == 0);

     auto sw = StopWatch(AutoStart.no);

     sw.start;
     auto ca = chromAdapt(image, srcIlluminant, Illuminant.d65).toUnsigned;
     sw.stop;

     auto timeTaken = sw.peek.split!("seconds", "msecs");
     // Zero-pad milliseconds so that e.g. 6 s 52 ms prints as 6.052, not 6.52.
     writefln("%d.%03d seconds", timeTaken.seconds, timeTaken.msecs);

     auto c = new Compressor();
     ubyte[] jpegOutput;
     bool compressed = c.compress(ca.field, jpegOutput, width, height, 90);

     if (!compressed)
     {
         c.errorInfo.writeln;
         return;
     }

     "image-out.jpg".write(jpegOutput);
}
```

Functions that profiling showed to take the most time:
* Chromatic adaptation: 
https://github.com/kyleingraham/photog/blob/up-chromadapt-perf/source/photog/color.d#L354
* RGB to XYZ: 
https://github.com/kyleingraham/photog/blob/up-chromadapt-perf/source/photog/color.d#L142
* XYZ to RGB: 
https://github.com/kyleingraham/photog/blob/up-chromadapt-perf/source/photog/color.d#L268

A profile for the test code is 
[here](https://github.com/kyleingraham/photog/files/6274974/trace.zip). The trace.log.dot file can be viewed with xdot. The PDF version is [here](https://github.com/kyleingraham/photog/files/6275358/trace.log.pdf). The profile was generated using:

* Compiled with `dub build --build=profile --compiler=ldmd2`
* Visualized with profdump: `dub run profdump -- -f -d -t 0.1 trace.log trace.log.dot`

The branch containing the optimized code is here: 
https://github.com/kyleingraham/photog/tree/up-chromadapt-perf
The corresponding release is here: 
https://github.com/kyleingraham/photog/releases/tag/v0.1.1-alpha

If you've gotten this far, thank you so much for reading. I hope 
there's enough information here to make it easier to think about 
optimizations.