D graph library

H. S. Teoh hsteoh at quickfur.ath.cx
Mon May 6 10:09:38 PDT 2013


On Mon, May 06, 2013 at 06:06:44PM +0200, Joseph Rushton Wakeling wrote:
> Hello all,
> 
> I've recently been having some very tentative thoughts about whether
> to try and develop an effective graph/network library in D.  The goal
> would be to try and have a library that allows for highly efficient
> representation of graphs and multigraphs, both as static and dynamic
> entities, with very fast retrieval of graph properties (e.g. the
> neighbours of a given vertex, the weights of given links) for use in
> simulations _on_ the graph, and very fast calculation of graph metrics
> of various kinds.

I would be interested in something like this in D. Though in my case,
I'm not so much interested in the graph data structures themselves, as
the graph algorithms that apply to them, that can be used to solve
various programming problems, like connectivity analysis of large data
sets, reachability analysis, and the like. Graph coloring algorithms are
also of some interest to me, since they underlie some of the
visualization techniques that I'm interested in.

So what I envision as far as graph libraries in D are concerned is
something like the generic range API, where there is a fixed convention
as to what constitutes a graph/digraph/etc., and the library itself is a
template library that is able to handle any data type that conforms to
the requirements. Of course, this does not preclude specific,
non-generic data structures used internally by said graph algorithms,
but for maximal usability, the API shouldn't require the user to perform
large-scale conversion of existing data structures into graph-specific
structures just so he can run, say, a reachability analysis on it. When
working with large data sets, one really wants to just map class X to
nodes, class Y to edges, now give me the connectivity of the resulting
graph (rather than looping over the existing data set to construct a
separate graph representation thereof).


[...]
> My main reasons for being tentative about moving forward with this are
> -- well, first, it's a major undertaking; as my day job is producing
> research, not code, there's something of a pressure to just use the
> tools already available rather than trying to optimize for my personal
> language choice, and I don't want to start something only to drop off
> because my research activities are taking me in a different direction.
> 
> Second, I don't know how useful it would be to others -- Python and
> C/C++ are still definitely the standard (and growing) tools.  So it
> would be good to know if anyone else would like to have such a library
> explicitly in D.

I'm interested.

Some years ago I implemented Tarjan's algorithm for computing
strongly-connected components in C++. I needed to use it on an
n-dimensional grid of cells, so I tried to make it generic, but due to
limitations in C++ generic programming, I ended up just using an
abstract class for representing the graph. This required implementing a
separate graph class for every data structure I wanted to run the
algorithm on, which is rather ugly. If something like this were done in
D, I expect it would have a friendlier API by allowing Phobos-style
genericity (using range-like abstraction for graph concepts like nodes
and edges, alias for callbacks/closures to abstract away graph
implementation details, etc.). One could, for example, allow overriding
of methods that produce the SCC graph so that the result is produced in
a custom target type that can be used directly by the application
elsewhere.


[...]
> Now, that said, there is one very big positive motivation -- I think
> that D provides a language in which one could create a graph library
> with a genuinely friendly and well-designed API and with easily
> readable code, yet which has all the efficiency and scalability of
> igraph.  And I'd much rather work with that than any of the other
> solutions out there.

+1.


> Some goals for a project like this would be something along the
> following lines:
> 
>    * Hyper-efficient performance for storage and manipulation of graphs.
>      I'd want to be able to perform analyses of and/or run simulations
>      on extremely large graph sizes.  This would include features such
>      as effective use of parallelism (when appropriate), dynamic
>      recalculation of various network quantities as nodes/edges are
>      added/removed, and so on.

This is a bit beyond what I need, but if we can have it, why not?


>    * Written in idiomatic D style as exemplified by the best of
>      Phobos, making best use of D's generics, templates, etc.

Definitely!


>    * Easy to hook into other languages -- I think my colleagues would
>      very much appreciate a Python interface.

How would you go about doing this, though? Wouldn't it require some kind
of wrapper API to make it compatible with the Python interpreter?

Though, if the API is done correctly in a generic way (ala the range
API), this should amount to just a simple set of wrappers that maps
various graph concepts directly onto Python types. It would also
maximize performance by not requiring an intermediate conversion step.


>    * Ideally, would rely _only_ on Phobos and not on any 3rd-party
>      libraries.  It may turn out that some desirable functionality is
>      not yet in Phobos (or does not yet perform adequately), in which
>      case those features should be implemented or improved in Phobos
>      rather than this library.

Yes.


> Anyway, for a very tentative proposal (and I have to say, I'm really
> not sure whether I will commit to this), I think that's a long enough
> email for now -- I'd really welcome people's thoughts on these ideas
> and to get a feel of whether there's interest in seeing something like
> this implemented in D.
[...]

I will be glad to have generic graph algorithms available in D, to
minimize the need for reinventing the wheel. I've seen too much code
that tries to reinvent known graphing algorithms (usually rather poorly)
just because existing graphing libraries are too hard / too obscure to
reuse.


T

-- 
If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher


More information about the Digitalmars-d mailing list