Optimize my code =)
bearophile
bearophileHUGS at lycos.com
Mon Feb 17 14:43:55 PST 2014
Robin:
> The key to victory was pointer arithmetic, as I noticed that I
> had used it in the C++ implementation, too. xD
Perhaps with LDC2 it's not necessary.
> I will just post the whole code again so that you can see what
> I have changed.
The code looks better.
There's no need to put Dimension in another module. In D, modules
contain related code, unlike Java's one-class-per-file convention.
Also feel free to use some free-standing functions. With UFCS
they are called the same way as methods, and they help keep
classes/structs simpler.
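For example, with a hypothetical free function `doubled` (not from
your code), UFCS makes the two call styles equivalent:

```d
import std.stdio;

// A free-standing function; UFCS lets callers use member syntax.
int doubled(in int x) pure nothrow
{
    return x * 2;
}

void main()
{
    immutable x = 21;
    writeln(doubled(x)); // classic call: 42
    writeln(x.doubled);  // UFCS call, rewritten to doubled(x): 42
}
```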
Some of your imports could be moved to more local scopes, instead
of being all at module-level.
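A sketch of what a locally scoped import looks like (the function
name is hypothetical):

```d
string timeReport(in long secs, in long msecs)
{
    // The import is visible only inside this function, keeping the
    // module's top-level namespace clean.
    import std.string : format;
    return format("\tTime required: %s secs, %s msecs", secs, msecs);
}
```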
> result ~= to!string(this[r, c]);
=>
result ~= this[r, c].text;
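`std.conv.text` is just a shorthand for `to!string`; the two forms
produce identical strings, so this is purely a readability change:

```d
import std.conv : text, to;

void main()
{
    immutable n = 42;
    assert(n.text == to!string(n)); // both yield "42"
}
```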
> writeln("\tTime required: " ~ to!string(secs) ~ " secs, " ~
> to!string(msecs) ~ " msecs");
=>
writeln("\tTime required: ", secs, " secs, ", msecs, " msecs");
> ref Matrix opOpAssign(string s)(in T scalar) pure nothrow if
> (s == "*") {
=>
ref Matrix opOpAssign(string op)(in T scalar) pure nothrow if
(op == "*") {
Also you have two functions with code like:
this.data[] op= scalar;
You can define a single template (untested):
/**
 * Adds or subtracts, element-wise, the entries of the given
 * matrix to/from this one.
 */
ref Matrix opOpAssign(string op)(const ref Matrix other) pure nothrow
if (op == "+" || op == "-")
in
{
    assert(this.dim == other.dim);
}
body
{
    mixin("this.data[] " ~ op ~ "= other.data[];");
    return this;
}
> Matrix opBinary(string s)(const ref Matrix other) const pure
> if (s == "*")
Given that on a 32-bit system a Matrix is just 16 bytes, it could
be better not to accept the argument by ref, avoiding one more
level of indirection:

Matrix opBinary(string s)(in Matrix other) const pure
if (s == "*")
> However, there were some strange things of which I am very
> confused ...
>
> void allocationTest() {
> writeln("allocationTest ...");
> sw.start();
> auto m1 = Matrix!double(10000, 10000);
> { auto m2 = Matrix!double(10000, 10000); }
> { auto m2 = Matrix!double(10000, 10000); }
> { auto m2 = Matrix!double(10000, 10000); }
> //{ auto m2 = Matrix!double(10000, 10000); }
> sw.stop();
> printBenchmarks();
> }
>
> This is the most confusing code snippet. I have just changed
> the whole allocation for all m1 and m2 from new Matrix!double
> (on heap) to Matrix!double (on stack)
The matrix data is always on the heap. What ends on the stack is
a very limited amount of stuff.
> This is extremely confusing, as I allocate these matrices on the
> stack, and since I have allocated them within their own
> scoped-block they should instantly release their memory
You are mistaken: minimallyInitializedArray allocates memory on
the GC heap (and there isn't enough stack space for 10000^2
doubles anyway). In both D and Java, the deallocation of
GC-managed memory is not deterministic, so it is not released
immediately at scope exit. Also, unlike the Java GC, the D GC is
less refined, and by design it is currently not precise: with
such large arrays there are often false _inbound_ pointers that
keep the memory alive, especially on 32-bit systems. So perhaps
your problems are caused by the GC.
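If you suspect the GC, one experiment is to request a collection
explicitly after the scope closes (a minimal sketch using a plain
array in place of your Matrix):

```d
import core.memory : GC;

void main()
{
    {
        auto a = new double[](10_000 * 10_000); // ~800 MB on the GC heap
    } // `a` is out of scope here, but the memory is not yet released

    GC.collect();  // request a collection explicitly
    GC.minimize(); // try to return free pages to the OS
}
```

Even this is not guaranteed to free the block: if a stale word on
the stack happens to look like a pointer into it, the imprecise GC
keeps it alive.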
You can have deterministic memory management in your struct-based
matrices, but you have to allocate the memory manually (from the
GC, or probably better from the C heap) and free it in the struct
destructor using RAII.
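A minimal sketch of such an RAII wrapper, assuming C-heap
allocation (the name CMatrix and the layout are hypothetical,
untested against your code):

```d
struct CMatrix(T)
{
    private T* data;
    size_t rows, cols;

    this(size_t rows, size_t cols)
    {
        import core.stdc.stdlib : calloc;
        this.rows = rows;
        this.cols = cols;
        // calloc zero-initializes; the GC never scans this block.
        data = cast(T*) calloc(rows * cols, T.sizeof);
        assert(data !is null, "out of memory");
    }

    // Forbid postblit copies so free() runs exactly once.
    @disable this(this);

    ~this()
    {
        import core.stdc.stdlib : free;
        free(data); // deterministic: runs at scope exit
        data = null;
    }

    ref T opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];
    }
}
```

Because the destructor runs at scope exit, the three scoped m2
blocks in your allocationTest would each release their memory
immediately.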
> Another strange things was that the new opEquals implementation:
>
> bool opEquals(const ref Matrix other) const pure nothrow {
> if (this.dim != other.dim) {
> return false;
> }
> foreach (immutable i; 0 .. this.dim.size) {
> if (this.data[i] != other.data[i]) return false;
> }
> return true;
> }
>
> is actually about 20% faster than the one you have suggested
> with the single line "return this.dim == other.dim &&
> this.data[] == other.data[];".
I think this small performance bug is fixed in dmd 2.065, which
is currently in beta 3.
> The last thing I haven't quite understood is that I tried to
> replace
>
> auto t = Matrix(other).transposeAssign();
>
> in the matrix multiplication algorithm with its shorter and
> clearer form
>
> auto t = other.transpose(); // sorry for the nasty '()', but I
> like them! :/
>
> This however gave me wonderful segmentation faults on runtime
> while using the matrix multiplication ...
This looks like a null-related bug.
I'll benchmark your code a little, but I think I don't have as
much RAM as you.
Bye,
bearophile