Optimize my code =)

bearophile bearophileHUGS at lycos.com
Fri Feb 14 08:55:03 PST 2014


Robin:

> class Matrix(T = double) {
> 	private T[] data;
> 	private Dimension dim;
> }

Also try a "final class" or a struct in your benchmark. And try 
the ldc2 compiler for performance benchmarks.

Perhaps dim is better declared const, unless you want to change 
the shape of the matrix.


> this(size_t rows, size_t cols) {
> 	this.dim = Dimension(rows, cols);
> 	this.data = new T[this.dim.size];
> 	enum nil = to!T(0);
> 	foreach(ref T element; this.data) element = nil;
> }

Better:

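// Needs: import std.array : minimallyInitializedArray;
//        import std.conv : to;
this(in size_t rows, in size_t cols) pure nothrow {
     this.dim = Dimension(rows, cols);
     // Skip the default NaN fill, then zero the array explicitly.
     this.data = minimallyInitializedArray!(typeof(data))(this.dim.size);
     this.data[] = to!T(0);
}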


> I experienced that floating point values are sadly initialized 
> with nan which isn't what I wanted -> double.init = nan.

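This is actually a good feature of the D language :-) A NaN 
propagates through every computation that touches it, so a 
forgotten initialization shows up in the result instead of 
silently behaving like a zero.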
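A small sketch of why the NaN default helps (sumOf is just a 
name I made up for the example, it is not from your code):

```d
import std.math : isNaN;

// Hypothetical helper: plain sum of all elements.
double sumOf(const(double)[] values) pure nothrow {
    double sum = 0.0;
    foreach (v; values) sum += v;
    return sum;
}

void main() {
    double x;                      // default-initialized to double.init (NaN)
    assert(isNaN(x));

    auto data = new double[3];     // every element starts as NaN too
    data[0] = 1.0;
    data[1] = 2.0;
    // data[2] was forgotten on purpose.
    assert(isNaN(sumOf(data)));    // the forgotten element poisons the sum

    data[] = 0.0;                  // explicit zeroing with a slice assignment
    assert(sumOf(data) == 0.0);
}
```

With a zero default the first sum would have been 3 and the bug 
would have gone unnoticed.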


> T opIndex(size_t row, size_t col) const {
> 	immutable size_t i = this.dim.offset(row, col);
> 	if (i >= this.dim.size) {
> 		// TODO - have to learn exception handling in D first. :P
> 	}

Look in the D docs for contract programming (in/out contracts 
and invariants).

Also add the pure nothrow attributes.
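A minimal sketch of what opIndex could look like with an 
in-contract. I have replaced your Dimension with two plain 
fields to keep the example self-contained, so the layout here is 
an assumption, not your actual class:

```d
final class Matrix(T = double) {
    private T[] data;
    private immutable size_t rows, cols;   // stand-in for Dimension

    this(in size_t rows, in size_t cols) pure nothrow {
        this.rows = rows;
        this.cols = cols;
        this.data = new T[rows * cols];
        this.data[] = 0;
    }

    // The in-contract is checked in normal builds and removed by -release.
    // Older compilers spell the "do" keyword below as "body".
    T opIndex(in size_t row, in size_t col) const pure nothrow
    in {
        assert(row < rows, "row out of range");
        assert(col < cols, "column out of range");
    }
    do {
        return data[row * cols + col];
    }
}

void main() {
    auto m = new Matrix!double(2, 3);
    assert(m[1, 2] == 0.0);
    // m[2, 0] would trip the in-contract in a non-release build.
}
```

Contracts are meant for catching programming errors. If an 
out-of-range index can come from user input, an exception is 
the better tool.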


> Which calls:
>
> size_t size() @property const {
> 	return this.rows * this.cols;
> }
>
> I think and hope that this is getting optimized via inlining. =)

Class methods in D are virtual by default, and current D 
compilers rarely inline virtual calls. The solution is to use a 
final class, final methods, etc.
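A rough sketch of the two options. The bodies of Dimension and 
the Matrix types are my guesses at your code, reduced to the 
parts that matter for inlining:

```d
struct Dimension {
    size_t rows, cols;

    // Struct methods are never virtual, so they are easy to inline.
    size_t size() @property const pure nothrow { return rows * cols; }

    size_t offset(in size_t row, in size_t col) const pure nothrow {
        return row * cols + col;
    }
}

// Option 1: a final class. Nothing can override its methods,
// so the compiler is free to call them directly.
final class Matrix(T = double) {
    private T[] data;
    private Dimension dim;

    this(in size_t rows, in size_t cols) pure nothrow {
        dim = Dimension(rows, cols);
        data = new T[dim.size];
        data[] = 0;
    }

    T opIndex(in size_t row, in size_t col) const pure nothrow {
        return data[dim.offset(row, col)];
    }
}

// Option 2: a struct with the same layout (value semantics, no vtable).
struct MatrixS(T = double) {
    private T[] data;
    private Dimension dim;

    this(in size_t rows, in size_t cols) pure nothrow {
        dim = Dimension(rows, cols);
        data = new T[dim.size];
        data[] = 0;
    }

    T opIndex(in size_t row, in size_t col) const pure nothrow {
        return data[dim.offset(row, col)];
    }
}

void main() {
    auto a = new Matrix!double(2, 2);
    auto b = MatrixS!double(2, 2);
    assert(a[1, 1] == 0.0 && b[1, 1] == 0.0);
}
```

Whether the calls really get inlined still depends on the 
compiler and the flags, so measure both variants.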


> This works similar for opIndexAssign.
>
> The function I am mainly benchmarking is the simple matrix 
> multiplication where one of the multiplied matrices is 
> tranposed first in order to improve cache hit ratio.

Transposing the whole matrix before the multiplication is not 
efficient. Better to do it more lazily.
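One possible way to do it lazily (a sketch on flat row-major 
arrays, not your Matrix class; the function name and parameters 
are made up): copy only the current column of the second matrix 
into a small contiguous buffer just before it is used.

```d
// a is n x k, b is k x m, both row-major. Returns the n x m product.
double[] multiplyLazy(const(double)[] a, const(double)[] b,
                      in size_t n, in size_t k, in size_t m) pure nothrow {
    auto result = new double[n * m];
    auto column = new double[k];          // reused for every column of b

    foreach (immutable c; 0 .. m) {
        // Transpose just this one column of b into the buffer.
        foreach (immutable i; 0 .. k)
            column[i] = b[i * m + c];

        foreach (immutable r; 0 .. n) {
            const row = a[r * k .. (r + 1) * k];
            double sum = 0.0;
            foreach (immutable i; 0 .. k)
                sum += row[i] * column[i];
            result[r * m + c] = sum;
        }
    }
    return result;
}

void main() {
    // [1 2; 3 4] * [5 6; 7 8] = [19 22; 43 50]
    auto p = multiplyLazy([1.0, 2, 3, 4], [5.0, 6, 7, 8], 2, 2, 2);
    assert(p == [19.0, 22, 43, 50]);
}
```

This keeps the inner loop running over two contiguous slices and 
needs only k extra elements instead of a full transposed copy.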


> 	auto m = new Matrix(this.dim.rows, other.dim.cols);
> 	auto s = new Matrix(other).transposeAssign();
> 	size_t r1, r2, c;
> 	T sum;
> 	for (r1 = 0; r1 < this.dim.rows; ++r1) {
> 		for (r2 = 0; r2 < other.dim.rows; ++r2) {

Better to define r1 and r2 inside the for statements. Or often 
even better, use a foreach over an interval (0 .. n).
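For example, the loop nest could look roughly like this. It is a 
sketch on flat row-major arrays with names I made up, where s is 
the already transposed second matrix:

```d
// a is rows1 x n, s is rows2 x n (the transposed second matrix).
// Returns the rows1 x rows2 product.
double[] multiplyTransposed(const(double)[] a, const(double)[] s,
                            in size_t rows1, in size_t rows2,
                            in size_t n) pure nothrow {
    auto m = new double[rows1 * rows2];

    // The loop variables live only inside their loops and cannot be
    // modified by accident because they are immutable.
    foreach (immutable r1; 0 .. rows1) {
        foreach (immutable r2; 0 .. rows2) {
            double sum = 0.0;
            foreach (immutable c; 0 .. n)
                sum += a[r1 * n + c] * s[r2 * n + c];
            m[r1 * rows2 + r2] = sum;
        }
    }
    return m;
}

void main() {
    // [1 2; 3 4] * [5 6; 7 8], second matrix passed transposed as [5 7; 6 8].
    auto p = multiplyTransposed([1.0, 2, 3, 4], [5.0, 7, 6, 8], 2, 2, 2);
    assert(p == [19.0, 22, 43, 50]);
}
```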


> D compiled with DMD takes about 14 seconds with all 
> (known-to-me) optimize flag activated. (listed above)

DMD does not generate efficient code for floating point values. 
Try the ldc2 or gdc compilers (I suggest ldc2).


> I wanted to make s an immutable matrix in order to hopefully 
> improve performance via this change,

Most of the time immutability worsens performance, unless you 
are passing data across threads.


> however I wasn't technically able to do this to be honest.

It's not too hard to use immutability in D.
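Two common ways to get immutable data, as a small sketch 
(makeRamp is a hypothetical helper, not something from your 
code):

```d
// A pure function whose parameters carry no mutable references returns
// a unique result, so the result converts implicitly to immutable.
double[] makeRamp(in size_t n) pure nothrow {
    auto a = new double[n];
    foreach (immutable i, ref x; a) x = i;
    return a;
}

void main() {
    // 1) Build the data in a pure function, then freeze it.
    immutable double[] ramp = makeRamp(4);
    assert(ramp == [0.0, 1.0, 2.0, 3.0]);

    // 2) Copy an existing mutable array with .idup.
    double[] scratch = [1.0, 2.0];
    immutable(double)[] frozen = scratch.idup;
    scratch[0] = 9.0;
    assert(frozen[0] == 1.0);        // the copy is unaffected

    // Writing through an immutable array does not compile.
    static assert(!__traits(compiles, ramp[0] = 5.0));
}
```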


> Besides that I can't find a way how to make a use of move 
> semantics like in C++. For example a rvalue-move-constructor or 
> a move-assign would be a very nice thing for many tasks.

That adds too much complexity.


> Another nice thing to know would be if it is possible to 
> initialize an array before it is default initialized with 
> T.init where T is the type of the array's fields. In C++ e.g. 
> there is no default initialization which is nice if you have to 
> initialize every single field anyway. E.g. in a Matrix.random() 
> method which creates a matrix with random values. There it is 
> unnecessary that the (sometimes huge) array is completely 
> initialized with the type's init value.

It's done by the function that I have used above, 
minimallyInitializedArray from the std.array module. The same 
module also has uninitializedArray.
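A sketch of how a Matrix.random() could fill its storage without 
the default initialization. randomData is a name I made up, and 
the range [0, 1) is only an assumption for the example:

```d
import std.array : uninitializedArray;
import std.random : Random, uniform;

// Hypothetical helper: n random values in [0, 1).
double[] randomData(in size_t n, ref Random rng) {
    // The elements hold garbage until they are written below.
    auto data = uninitializedArray!(double[])(n);
    foreach (ref x; data)
        x = uniform(0.0, 1.0, rng);
    return data;
}

void main() {
    auto rng = Random(42);           // fixed seed, just for the example
    auto data = randomData(1_000, rng);
    assert(data.length == 1_000);
    foreach (x; data)
        assert(x >= 0.0 && x < 1.0);
}
```

minimallyInitializedArray is the safer default: it still 
initializes element types that contain pointers or references, 
so it never hands garbage pointers to the GC. For double it 
skips the initialization just like uninitializedArray.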

Bye,
bearophile

