Profiling graph library

Joseph Rushton Wakeling joseph.wakeling at webdrake.net
Thu Aug 1 03:40:40 PDT 2013


On 07/31/2013 02:31 PM, bearophile wrote:
> If this resulting array is used only for a short period of time:
> 
> return sequence.map!(x => x.func).array;

Just to compare, I tried rewriting the cacheing version of the neighbours()
function to use this kind of sequence.  Here's the original code:
https://github.com/WebDrake/Dgraph/blob/cache/dgraph/graph.d#L358-L379

... and here's the rewritten version:

    auto neighbours(immutable size_t v)
    {
        if (!_cacheNeighbours[v])
        {
            iota(_sumTail[v], _sumTail[v + 1])
                .map!(a => _head[_indexTail[a]])
                .copy(_neighbours[_sumTail[v] + _sumHead[v] .. _sumHead[v] +
_sumTail[v + 1]]);
            iota(_sumHead[v], _sumHead[v + 1])
                .map!(a => _tail[_indexHead[a]])
                .copy(_neighbours[_sumHead[v] + _sumTail[v + 1] .. _sumTail[v +
1] + _sumHead[v + 1]]);
            _cacheNeighbours[v] = true;
        }
        return _neighbours[_sumTail[v] + _sumHead[v] .. _sumTail[v + 1] +
_sumHead[v + 1]];
    }

Although it's faster than the non-cached map-based version, there's still a
substantial performance hit from a large number of small allocations.  So, your
critiques of component programming certainly have merit on the performance side.

What's weird is that even though the graph is always passed around by ref,
meaning that the neighbours are calculated only once, according to callgrind the
number of calls to _d_allocmemory still amounts to no. of vertices * no. of
trial runs.

And yes, I put in some checks to make sure that the loop was only being called
the correct number of times.


More information about the Digitalmars-d-learn mailing list