D graph library

Joseph Rushton Wakeling joseph.wakeling at webdrake.net
Tue May 7 08:09:26 PDT 2013


On 05/06/2013 07:09 PM, H. S. Teoh wrote:
> I would be interested in something like this in D. Though in my case,
> I'm not so much interested in the graph data structures themselves, as
> the graph algorithms that apply to them, that can be used to solve
> various programming problems, like connectivity analysis of large data
> sets, reachability analysis, and the like.

Sure.  The data structures are a tool to let you implement those algorithms
really well ... :-)

> Graph coloring algorithms are
> also of some interest to me, since they underlie some of the
> visualization techniques that I'm interested in.

This would be very nice to have.  It's not really my direct area of interest,
but it would be lovely to have a nice visualization solution as part of the library.

> So what I envision as far as graph libraries in D are concerned is
> something like the generic range API, where there is a fixed convention
> as to what constitutes a graph/digraph/etc., and the library itself is a
> template library that is able to handle any data type that conforms to
> the requirements. Of course, this does not preclude specific,
> non-generic data structures used internally by said graph algorithms,
> but for maximal usability, the API shouldn't require the user to perform
> large-scale conversion of existing data structures into graph-specific
> structures just so he can run, say, a reachability analysis on it. When
> working with large data sets, one really wants to just map class X to
> nodes, class Y to edges, now give me the connectivity of the resulting
> graph (rather than looping over the existing data set to construct a
> separate graph representation thereof).

That's a nice way of looking at it.  Probably the basic graph interface would be
one which allows a few basic questions to be asked -- "give me a range of all
the nodes in the graph", "give me a range of all the edge pairs", etc., and
you'd want properties like isGraph, isDirectedGraph, isMultiGraph,
isBipartiteGraph, ...

... and similarly you'd want a node or edge interface which allows for similar
properties.

> I'm interested.

Thank you :-)

> Some years ago I implemented Tarjan's algorithm for computing
> strongly-connected components in C++. I needed to use it on an
> n-dimensional grid of cells, so I tried to make it generic, but due to
> limitations in C++ generic programming, I ended up just using an
> abstract class for representing the graph. This required implementing a
> separate graph class for every data structure I wanted to run the
> algorithm on, which is rather ugly. If something like this were done in
> D, I expect it would have a friendlier API by allowing Phobos-style
> genericity (using range-like abstraction for graph concepts like nodes
> and edges, alias for callbacks/closures to abstract away graph
> implementation details, etc.). One could, for example, allow overriding
> of methods that produce the SCC graph so that the result is produced in
> a custom target type that can be used directly by the application
> elsewhere.

Sounds fun! :-)

> This is a bit beyond what I need, but if we can have it, why not?

>From my point of view speed and space efficiency are essential because if I
continue down the current planned research direction, I'll probably wind up
working on graphs with hundreds of thousands of nodes and links.

> How would you go about doing this, though? Wouldn't it require some kind
> of wrapper API to make it compatible with the Python interpreter?

As Ellery suggested, PyD should be our friend here.  But I don't think
interfacing with Python is an essential short-term goal.

> I will be glad to have generic graph algorithms available in D, to
> minimize the need for reinventing the wheel. I've seen too much code
> that tries to reinvent known graphing algorithms (usually rather poorly)
> just because existing graphing libraries are too hard / too obscure to
> reuse.

Well, here's where I am right now -- I'm currently heading towards a writeup of
some fun stuff with colleagues, which will probably take a few weeks.  Assuming
that we are going to follow up on this, there's a reasonable case for doing this
kind of graph work.

So, I'll continue to read and think -- and we can continue to discuss here! --
and hopefully we can get towards some kind of action plan.

One remark on licensing.  I'm conscious of the fact that there is probably great
advantage (from a code point of view) in careful examination of the internals of
existing graph solutions.  On the other hand the most highly performing rival,
igraph, is GPL-licensed.  This obviously creates a potential problem where
(conscious or unconscious) copying or influence is concerned, because it would
probably be nice to continue the typical D licensing approach and use Boost.

Not that I have any personal objection to the GPL -- it's a license that I would
be inclined to choose for many projects -- but where such a generally useful
library is concerned, my inclination is to be permissive.

Anyone else have any thoughts on this?


More information about the Digitalmars-d mailing list