D graph library

Fri May 10 05:04:28 PDT 2013

On 05/08/2013 06:12 PM, H. S. Teoh wrote:
> On Wed, May 08, 2013 at 02:34:32PM +0200, Joseph Rushton Wakeling wrote:
>> Finally, a reasonable question is whether they should return ranges of
>> actual nodes or edges, or ranges of node/edge IDs.
> 
> I think the distinction is mostly superfluous. One could always wrap IDs
> in a struct:

I'm being daft.  Of _course_ the ranges should be of nodes or edges, because you
want to be able to do things like

	foreach(n; g.nodes) {   // g is the graph ... :-)
		foreach(m; n.neighbours) {
			...
		}
	}

>> It should also be possible to define arbitrary collections of node
>> properties, such as colour, type, textual labels, numerical properties
>> etc., and to query if a node has them (preferably at compile-time).
> 
> Hmm. I'm not sure about requiring such attributes. I'd go for a more
> generic approach, such as if an algorithm requires some attribute (say
> weight), then it could take a compile-time parameter that maps .weight
> to whatever actual attribute is in the graph node / edge. For example:
> 
> 	auto findShortestPath(alias weight,G)(G graph, G.node start,
> 		G.node end) { ... }
> 
> 	struct Graph {
> 		struct Node { ... }
> 		alias node = Node;
> 		struct Edge {
> 			double length;
> 			@property double euclideanDistance();
> 			...
> 		}
> 	}
> 
> 	Graph g;
> 	Graph.Node x, y;
> 	auto pathByLength = findShortestPath!"length"(g, x, y);
> 	auto pathByDist = findShortestPath!"euclideanDistance"(g, x, y);
> 
> This allows the same graph to be evaluated multiple ways, as this
> example shows, without needing to create wrappers or proxy objects.

I don't think what I was proposing was at odds with what you describe here.  For
sure it's helpful to be able to map attributes as you describe here.  What I
mean is simply it's helpful to be able to define arbitrary properties (and
property names) that will be associated with edges, not that edges should be
_obliged_ to have a certain set of associated parameters.

> I'm tentatively considering defining the node type within the graph
> itself (G.node above), as a way of reducing the number of compile-time
> parameters, but this may not be necessary, and may be limiting in some
> cases. Have to think about it a bit more.

Can you expand on this?

> I think it's better if each algorithm has a specific set of properties
> they expect to be present, and use compile-time parameters to map them
> onto concrete graph properties.

Again, I don't think this is in contradiction with what I'm suggesting -- of
course individual algorithms should expect and check for certain parameters, and
compile-time parameters can be used to map them to the algorithm.

> Also, I'm not sure about isWeighted... I would expect most graph
> algorithms to either not care about weights (in which case the entire
> graph can be weightless) or require weights on all edges (in which case
> isWeighted is redundant). Or am I ignorant of algorithms that process
> weights that only exist in some edges but not others?

Consider path length -- e.g. a path v1 -> v2 -> v3 has length 2 for an
unweighted graph, length w_{1, 2} + w_{2, 3} for a weighted graph.  That in turn
has implications for shortest-path-length calculations, and the optimal
algorithm to calculate shortest paths thus depends on whether you're dealing
with the weighted or unweighted case.

All of this in turn impacts on any metrics or algorithms that depend on
distances between nodes.  For example, today I'm going to be implementing [*] an
algorithm to calculate betweenness centrality:
http://www.cs.ucc.ie/~rb4/resources/Brandes.pdf  This does need to be modified,
albeit trivially, for the weighted case.

[* Inside some custom-written code for a very particular application, I'm
afraid.  This is why I want to create a general-purpose graph library, to stop
me having to do things like this :-P ]

> I think it would help discussion if we have a list (even if tentative)
> of algorithms we'd like to implement, along with what each algorithm
> expects from the graph (weights, colors, directedness, etc.). Then we
> can evaluate which properties/functions/etc. are really necessary, and
> which are just icing on the cake that can be omitted with no real
> drawbacks. I think the more minimal the API the better -- it makes the
> resulting module easier to understand and use, and makes it more likely
> we'll actually finish implementing it. :-P

Agree about minimal API, but I think what I've described previously is pretty
minimal ... :-)

As for algorithms, igraph has a very substantial list:
http://igraph.sourceforge.net/doc/html/igraph-Structural.html

We could probably "scratch our own itch" to a degree (ha!) in terms of choosing
what to implement, but stuff like shortest path calculation, centrality measures
and so on seem like good starting points.

> Anyway, some further thoughts: we could have graph adaptors in the same
> vein as the range constructors in std.range, that allows one to wrap a
> graph in another form. E.g., transposeGraph(G) that returns a graph with
> the sense of edges reversed, etc.

Yes, that would be nice. :-)