D graph library -- update

Fri Jul 12 01:04:32 PDT 2013

On 07/11/2013 08:14 PM, w0rp wrote:
> 1. You can only use a size_t type as the vertex type. What about strings?

I want to reply at greater length on this point, because I don't want you to
feel I'm dismissing your concerns.  I remember that back in our earlier
discussions you posted a basic data type of something like,

    WeightType[VertexID][VertexID];

where VertexID might be an integer, a string, or potentially something else.

Let's simplify and suppose we have bool[VertexID][VertexID], so that it's just
true of there's a link, false otherwise.  In principle you could simplify still
further and just have a set of VertexID pairs, except that Phobos still doesn't
have a decent Set implementation (... hint, hint ... :-)

This will be quick (and fairly memory efficient, I'd have thought) to create,
and the distinction between Set and MultiSet would allow an easy constraint on
whether you have a graph or a multigraph.

You also have a ready and efficient way to check if a given link (head, tail) is
in the graph, although I think I can probably do a good job on that with the
current implementation (more on that another time:-).

What I'm not sure that you have so readily is an efficient way to extract e.g.
degree(v), or neighbours(v), and I'm concerned that the extra data you'll have
to carry to change this may actually lead to a memory blow-up.

I'd be very happy to be proved wrong about this, so if you have some ideas, or
better still code for comparison and benchmarking, I'll happily reconsider the
design.

But for now my contention is that generic vertex IDs are only relevant when it
comes to input or output, and that's where they should be implemented, not in
the core processing code.

> 3. I have to control the graph's capacity manually, and I can't use arbitrary
> numbers for edges.

I don't see the capacity issue as being different from e.g. the fact that an
array's capacity must be controlled manually unless you explicitly attempt to
append to the end of it.

As for arbitrary numbers for edges: I think in practice this should be OK, so
long as you ensure that vertexCount is greater than the maximum vertex ID
number, and there are not _too_ many gaps between the vertex IDs that are used.
 (e.g. if your vertex IDs are 1 to 50 rather than 0 to 49, you should have no
problem.)  Then, most results can be adapted simply by filtering out "nodes"
with degree == 0.

If on the other hand your node IDs are 10, 20, 30, ... you might not get such a
good performance ... :-)

> 4. Adding an edge doesn't add vertices.

Here I don't think I have anything to add apart from that I don't like the idea
of a graph that might surreptitiously add an arbitrary number of vertices simply
on the basis of an edge being added.  I'd rather force the user to check that
the link being added is valid given the vertices available, and provide a
wrapper to offer those kinds of checks and the opportunity to expand the number
of vertices if needed.