D graph library

Sat May 11 10:08:04 PDT 2013

On Sat, May 11, 2013 at 03:18:16PM +0200, Joseph Rushton Wakeling wrote:
> On 05/10/2013 07:52 PM, H. S. Teoh wrote:
[...]
> > In fact, now that I think of it... most of the features you listed
> > can arguably be optional: only a few algorithms (none that I know
> > of, actually) require enumeration of edges, some algorithms may only
> > require an input range of nodes, each of which gives an input range
> > of edges, etc.. Would it make sense to define a set of graph
> > properties (e.g., via templates ala
> > std.range.is{Input,Forward,Bidirectional,...}Range, or hasLength,
> > hasSwappableElements, etc.), and have each algorithm require
> > whatever minimal subset of them is necessary to do the work? Because
> > it makes little sense to require, say, a graph that returns a total
> > count of nodes, if a given algorithm never needs to use that
> > information anyway. This way, a structure for which the total number
> > of nodes is expensive to compute can still be used with that
> > algorithm.
> 
> That's a fair point.  I don't know that I like the idea of defining
> all those different kinds of checks for different graph attributes,
> though.

Well, maybe what we can do is to make a list of attributes that are
needed by various graph algorithms, and see if there's a way to distill
them into a small set of attributes without too many scattered attribute
checks. For example, if we can categorize graphs into a small set of
types (ala input range, forward range, etc.) plus a small number of
optoinal attributes, that would be ideal, I think.

> > Not if the graph is computed on-the-fly. E.g. chess analysis
> > applications in which the complete graph is infeasible to compute.
> 
> Well, it ought still to be possible to define a range -- albeit, not a
> random-access one -- that covers all the edges in that graph.  It's
> just that it'll take forever to finish.
> 
> Agree though that this is a case where an edges() property is probably
> not desirable.

Are there actually algorithms that depend on having an .edges property?
AFAIK, most iterate over nodes in some way. I can see *nodes* having an
.edges property, but I'm not sure about the utility of enumerating edges
over the entire graph. Or am I missing something obvious?

> > In that case, I wonder if it's even necessary to unify the two graph
> > types. Why not just have two separate types and have the type system
> > sort out compatibility with algorithms for us?
> 
> That might be a possible solution, but whether directed or undirected
> (or mixed), a graph will broadly have the same general public
> interface.

True.

> > Are there any algorithms that need to do something different
> > depending on whether the input graph is bipartite or not? Just
> > wondering if it's worth the trouble of introducing an additional
> > field (it may well be necessary -- I admit my knowledge of graph
> > algorithms is rather spotty).
> 
> Put it this way: sometimes, you might want to check if a graph _is_
> bipartite, that is, if its nodes can be divided into two disjoint sets
> A and B such that links in the graph are always between nodes in the
> different sets, never between nodes from the same set.  That could be
> a runtime test that you could apply to any graph.
> 
> Then again, you might deliberately want to construct an explicitly
> bipartite graph, and you might want to have checks and balances in
> place to make sure that you don't accidentally add nodes between nodes
> in the same set.

Hmm. Maybe something similar to std.range.SortedRange would be useful:

	struct BipartiteGraph(G) {
		enum isBipartite = true;
		G impl;
		alias impl this;
	}

	BipartiteGraph!G verifyIsBipartite(G)(G graph) {
		// verifies bipartiteness of G and returns graph wrapped
		// in a BipartiteGraph!G.
	}

	auto createBipartiteGraph(T...)(T graphData) {
		// return an instantiation of BipartiteGraph that's
		// guaranteed to be bipartite by construction.
	}

	auto someBipartiteAlgo(G)(BipartiteGraph!G graph) {
		// freely assume bipartiteness of graph
	}

This way we can make use of the type system to statically ensure
bipartiteness, while still allowing runtime verifications of arbitrary
graphs.

> > Right. Which brings me back to my idea that perhaps these graph
> > properties should be optional, or perhaps we should have some kind
> > of hierarchy of graph types (ala input range, forward range,
> > bidirectional range, etc.), so that individual algorithms can choose
> > which features are mandatory and which are optional.
> > 
> > (I really dislike algorithms that are artificially restricted just
> > because the author decided that feature X is required, no matter if
> > the algorithm doesn't even use it.)
> 
> Fair point.  _Algorithms_ should require the minimal number of
> constraints.  Most of my suggestions for range types etc. represent
> what I think is useful for a general-purpose graph data type rather
> than what is absolutely necessary for individual specialized
> applications.

Aha. I think we found the source of contention here. You're thinking
about concrete graph types that can be used for general graph
applications, whereas I'm thinking of Phobos-style generic graph
algorithm *templates* that can process any graph-like type satisfying
certain requirements.

In that light, I think it makes sense to provide a set of concrete graph
types that people can use (e.g., an adjacency list type, an incidence
list type, matrix representations maybe, to list a few commonly-used
graph representations), but the algorithms themselves will be generic
templates that can work with any concrete type as long as it has the
attributes required by the algorithm. The concrete graph types (which
may themselves be templates, e.g. to allow arbitary node types or edge
labels) will of course satisfy (most of) these requirements by default,
so they can be immediately used with the algorithms without further ado.
At the same time, the user can decide to create their own graph types,
and as long as the target algorithm's prerequisites are satisfied, it
can be used with those custom types.

> Must dash again, so will catch up on your other points later ... ! :-)

Actually, I'll be on vacation for a week and a half, with probably
spotty internet connectivity, so no need to hurry. :)

--T