Library Development: What to finish/flesh out?

dsimcha dsimcha at yahoo.com
Thu Mar 17 08:33:10 PDT 2011


I've accumulated a bunch of little libraries via various evening and weekend
hacking projects over the past year or so, in various states of completion.
Most are things I'm at least half-considering for Phobos, though some belong
as third-party libs.  I definitely don't have time to finish/flesh out all of
them anytime soon, so I've decided to ask the community what to prioritize.
Below is a summary of everything I've been working on, with its current level
of completion.  Please let me know the following:

1.  A relative ordering of how useful you think these libraries would be to
the community.

2.  In absolute terms, would you find each of these useful?

3.  For the Phobos candidates, whether they're general enough to belong in the
**standard** library.

List in order from most to least finished:

1.  Rational:  A library for handling rational numbers exactly.  Templated on
integer type, can use BigInts for guaranteed accuracy, or fixed-width integers
for more speed where the denominator and numerator will be small.  Completion
state:  Mostly finished.  Just need to fix a little bit rot and submit for
review.  (Phobos candidate)
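
To make the exactness point concrete, here is a rough sketch in Python (not D,
and not the actual Rational API; the class and its methods are hypothetical).
Python integers are arbitrary-precision, so this corresponds to the BigInt
case rather than the fixed-width one.

```python
from math import gcd


class Rational:
    """Exact rational number kept in lowest terms (illustrative only)."""

    def __init__(self, num, den=1):
        if den == 0:
            raise ZeroDivisionError("denominator must be nonzero")
        if den < 0:
            # Keep the sign on the numerator so equal values compare equal.
            num, den = -num, -den
        g = gcd(num, den)
        self.num, self.den = num // g, den // g

    def __add__(self, other):
        return Rational(self.num * other.den + other.num * self.den,
                        self.den * other.den)

    def __mul__(self, other):
        return Rational(self.num * other.num, self.den * other.den)

    def __eq__(self, other):
        return (self.num, self.den) == (other.num, other.den)

    def __repr__(self):
        return f"{self.num}/{self.den}"


exact_sum = Rational(1, 10) + Rational(2, 10)  # reduced to lowest terms, no rounding
float_sum = 0.1 + 0.2                          # binary floating point rounds here
```

With fixed-width integers the same arithmetic is faster, but the intermediate
products can overflow, which is why that variant only suits small numerators
and denominators.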

2.  RandAA:  A hash table implementation with deterministic memory management,
based on randomized probing.  Main advantage over builtin AAs is that it plays
much nicer with the GC and multithreaded programs.  Lookup times are also
expected O(1) no matter how many collisions exist in modulus hash space, as
long as there are few collisions in full 32- or 64-bit hash space.  Completion
state:  Mostly finished.  Just needs a little doc improvement, a few
benchmarks and submission for review.  (Phobos candidate)
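
The claim about collisions in modulus space versus full hash space can be
illustrated with a small open-addressing table in Python.  This is only a
sketch under stated assumptions: the post does not describe RandAA's actual
probing scheme, so the perturbation recurrence used by CPython's dict stands
in for it, and the class name, fixed capacity and lack of deletion are all
simplifications.

```python
class ProbingTable:
    """Open-addressing table; fixed power-of-two capacity, no deletion or resizing."""

    def __init__(self, capacity=8):
        assert capacity & (capacity - 1) == 0, "capacity must be a power of two"
        self.mask = capacity - 1
        self.slots = [None] * capacity  # each slot: None or (full_hash, key, value)
        self.count = 0

    def _probe(self, full_hash):
        # The starting slot uses only the low bits; later probes mix in the
        # higher bits, so keys that collide modulo the table size diverge.
        perturb = full_hash
        i = full_hash & self.mask
        while True:
            yield i
            perturb >>= 5
            i = (5 * i + perturb + 1) & self.mask

    def __setitem__(self, key, value):
        h = hash(key) & 0xFFFFFFFFFFFFFFFF
        for i in self._probe(h):
            slot = self.slots[i]
            if slot is None:
                if self.count + 1 > self.mask:
                    # Keep one slot empty so unsuccessful lookups terminate.
                    raise RuntimeError("table full (this sketch does not resize)")
                self.slots[i] = (h, key, value)
                self.count += 1
                return
            if slot[0] == h and slot[1] == key:
                self.slots[i] = (h, key, value)
                return

    def __getitem__(self, key):
        h = hash(key) & 0xFFFFFFFFFFFFFFFF
        for i in self._probe(h):
            slot = self.slots[i]
            if slot is None:
                raise KeyError(key)
            if slot[0] == h and slot[1] == key:
                return slot[2]


# 1, 33 and 65 all map to slot 1 modulo 8, but their full hashes differ.
table = ProbingTable(8)
for k in (1, 33, 65):
    table[k] = k * k
```

Comparing the stored full hash before comparing keys means a modulus
collision costs one integer comparison rather than a full key comparison.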

3.  TempAlloc:  A memory allocator based on a thread-local segmented stack,
useful for allocating large temporary buffers in things like numerics code.
Also comes with a hash table, hash set and AVL tree optimized for this
allocation scheme.  The advantages over plain old stack allocation are that
it's independent of function calls (meaning you can return pointers to
TempAlloc-allocated memory from a function, etc.) and it's segmented, meaning
you can allocate huge buffers w/o risking stack overflow.  Its main weakness
is that this stack is not scanned by the GC, meaning that you can't store the
only reference to a GC-allocated piece of memory here.  However, in practice
large arrays of primitives are an extremely common case in
performance-critical code.  I find this module immensely useful in dstats and
Lars Kyllingstad uses it in SciD.  Getting it into Phobos would make it easy
for other scientific/numerics code to use it.  Completion state:  Working and
used.  Needs a little cleanup and documentation.  (Phobos candidate)
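
The segmented-stack idea can be sketched in Python as follows.  This is a toy
model, not TempAlloc's real interface: the class, the method names and the
byte-oriented API are assumptions, and Python obviously lacks the manual
memory control that makes the real thing worthwhile.

```python
class SegmentedStack:
    """LIFO allocator over a list of segments (illustrative only)."""

    def __init__(self, segment_size=4096):
        self.segment_size = segment_size
        self.segments = [bytearray(segment_size)]
        self.used = [0]    # bytes in use in each segment
        self.allocs = []   # stack of (segment index, start offset)

    def alloc(self, nbytes):
        seg = len(self.segments) - 1
        if self.used[seg] + nbytes > len(self.segments[seg]):
            # Current segment is too full: push a new one, sized to fit.
            self.segments.append(bytearray(max(self.segment_size, nbytes)))
            self.used.append(0)
            seg += 1
        start = self.used[seg]
        self.used[seg] = start + nbytes
        self.allocs.append((seg, start))
        return memoryview(self.segments[seg])[start:start + nbytes]

    def free_last(self):
        # Release the most recent allocation (strict LIFO order).
        seg, start = self.allocs.pop()
        self.used[seg] = start
        # Drop empty trailing segments, but always keep the first one.
        while len(self.segments) > 1 and self.used[-1] == 0:
            self.segments.pop()
            self.used.pop()


def make_buffer(stack, nbytes):
    # The buffer outlives this call, unlike a plain stack-allocated array.
    buf = stack.alloc(nbytes)
    buf[0] = 42
    return buf


stack = SegmentedStack(segment_size=16)
first = make_buffer(stack, 10)
second = stack.alloc(10)   # does not fit in the first segment, so a second is added
```

Allocation and release are just offset bumps, and a request larger than one
segment gets a segment of its own instead of overflowing anything.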

4.  Streaming CSV Parser:  Parses CSV files as they're read in, with a few
convenience functions for extracting columns into structs.  If Phobos ever
gets SQLite support I'll probably add sugar for turning a CSV file into an
SQLite database, too.  Completion state:  Prototype working, needs testing,
cleanup and documentation.  (Phobos candidate)
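
As a rough picture of what "streaming" plus "columns into structs" means,
here is a Python sketch built on the standard csv module.  The record type,
the column names and the function name are made up for the example; they say
nothing about the D prototype's interface.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Sample(NamedTuple):
    """Hypothetical record type standing in for a user-defined struct."""
    name: str
    value: float


def stream_rows(lines) -> Iterator[Sample]:
    # csv.reader pulls one line at a time from `lines`, so the whole file
    # never has to be held in memory.
    reader = csv.reader(lines)
    header = next(reader)
    name_col = header.index("name")
    value_col = header.index("value")
    for row in reader:
        yield Sample(row[name_col], float(row[value_col]))


source = io.StringIO("name,value\nalpha,1.5\nbeta,2.25\n")
rows = stream_rows(source)
first = next(rows)   # only the header and the first data row have been parsed
```

Because the rows come from a generator, a consumer can stop early or process
a file far larger than memory.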

5.  Matrix operations:  SciD improvements that let you write matrix
operations that look like normal math/MATLAB, and that optimize them via
expression templates so that a minimal number of temporary matrices are created.
Uses/will use BLAS for multiplication.  Completion state:  Addition
implemented.  Multiplication not.
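
The point of expression templates is that a + b + c builds a description of
the computation first and fills the result in a single pass, with no
temporary matrix for a + b.  In D or C++ this happens at compile time; the
Python sketch below only mimics the deferred-evaluation idea at run time, and
all of its class and method names are hypothetical.

```python
class Expr:
    """Base class: '+' builds a lazy node instead of computing anything."""

    def __add__(self, other):
        return Sum(self, other)

    def evaluate(self):
        # One pass over the output; each element is computed directly
        # from the leaves of the expression tree.
        rows, cols = self.shape
        return Matrix([[self.at(r, c) for c in range(cols)]
                       for r in range(rows)])


class Matrix(Expr):
    def __init__(self, data):
        self.data = data
        self.shape = (len(data), len(data[0]))

    def at(self, r, c):
        return self.data[r][c]


class Sum(Expr):
    def __init__(self, left, right):
        assert left.shape == right.shape, "shape mismatch"
        self.left, self.right = left, right
        self.shape = left.shape

    def at(self, r, c):
        return self.left.at(r, c) + self.right.at(r, c)


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[10, 20], [30, 40]])
c = Matrix([[100, 200], [300, 400]])

lazy = a + b + c           # no matrix arithmetic has happened yet
result = lazy.evaluate()   # the only new matrix allocated is the result
```

Multiplication is harder to treat this way, since each output element depends
on a whole row and column, which is presumably where handing off to BLAS
comes in.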

6.  Machine learning:  Decision trees, KNN, Random Forest, Logistic
Regression, SVM, Naive Bayes, etc.  This would be a dstats module.  Completion
state:  Decision trees prototyped, logistic regression working.

7.  std.mixins:  Mixins for commonly needed boilerplate code.  I stopped
working on this when Andrei suggested that making a collection of mixins into
a module is a bad idea.  I've thought about it some more and I respectfully
disagree.  std.mixins would be a one-stop shop for pretty much any boilerplate
you need to inject, and most of this code doesn't fit in any other obvious
place.  Completion state:  A few things (struct comparison, simple class
constructors, Singleton pattern) prototyped.  (Phobos candidate)

8.  GZip support in std.file:  I'll leave the stream stuff for someone else,
but just simple stuff like read(), write(), append() IMHO belongs in std.file.
Completion state:  Not started, but this is the easiest of the bunch to
implement.  (Phobos candidate)
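
For a sense of how small such an interface could be, here is a Python sketch
using the standard gzip module.  The function names echo the read(), write()
and append() mentioned above but are otherwise made up; nothing here reflects
an actual std.file design.

```python
import gzip
import os
import tempfile


def gz_write(path, data: bytes) -> None:
    # Create or overwrite the file with a single compressed member.
    with gzip.open(path, "wb") as f:
        f.write(data)


def gz_append(path, data: bytes) -> None:
    # Appending adds a new gzip member; readers see the members concatenated.
    with gzip.open(path, "ab") as f:
        f.write(data)


def gz_read(path) -> bytes:
    # Decompress the whole file into memory.
    with gzip.open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt.gz")
    gz_write(path, b"hello ")
    gz_append(path, b"world")
    round_trip = gz_read(path)
```

Appending works without recompressing the existing data because the gzip
format allows several compressed members back to back in one file.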

