Library Development: What to finish/flesh out?
Tomek Sowiński
just at ask.me
Thu Mar 17 12:48:20 PDT 2011
dsimcha wrote:
> I've accumulated a bunch of little libraries via various evening and weekend
> hacking projects over the past year or so, in various states of completion.
> Most are things I'm at least half-considering for Phobos, though some belong
> as third-party libs. I definitely don't have time to finish/flesh out all of
> them anytime soon, so I've decided to ask the community what to prioritize.
> Below is a summary of everything I've been working on, with its current level
> of completion. Please let me know the following:
>
> 1. A relative ordering of how useful you think these libraries would be to
> the community.
>
> 2. In absolute terms, would you find this useful?
>
> 3. For the Phobos candidates, whether they're general enough to belong in the
> **standard** library.
>
> List in order from most to least finished:
>
> 1. Rational: A library for handling rational numbers exactly. Templated on
> integer type, can use BigInts for guaranteed accuracy, or fixed-width integers
> for more speed where the denominator and numerator will be small. Completion
> state: Mostly finished. Just need to fix a little bit rot and submit for
> review. (Phobos candidate)
I'd find it useful. As for its presence in Phobos, I'm uncertain if it's in enough demand.
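To make the idea concrete, here is a minimal sketch of exact rational arithmetic. It is written in Python rather than D and is not dsimcha's code: the class name `Rational` and its fields are assumptions for illustration. Python integers are arbitrary precision, so this corresponds to the BigInt-backed case; the fixed-width variant is not shown.

```python
from math import gcd


class Rational:
    """Exact rational number kept in lowest terms (illustrative sketch)."""

    def __init__(self, num, den=1):
        if den == 0:
            raise ZeroDivisionError("denominator must be nonzero")
        if den < 0:  # keep the sign in the numerator
            num, den = -num, -den
        g = gcd(num, den)  # gcd(0, den) == den, so 0 normalizes to 0/1
        self.num, self.den = num // g, den // g

    def __add__(self, other):
        return Rational(self.num * other.den + other.num * self.den,
                        self.den * other.den)

    def __mul__(self, other):
        return Rational(self.num * other.num, self.den * other.den)

    def __eq__(self, other):
        # Both sides are normalized, so field comparison is enough.
        return (self.num, self.den) == (other.num, other.den)

    def __repr__(self):
        return f"Rational({self.num}, {self.den})"


third, sixth = Rational(1, 3), Rational(1, 6)
half = third + sixth  # exact, no floating-point rounding
```

Normalizing in the constructor keeps numerators and denominators as small as possible, which is what would matter most for a fixed-width integer version.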
> 2. RandAA: A hash table implementation with deterministic memory management,
> based on randomized probing. Main advantage over builtin AAs is that it plays
> much nicer with the GC and multithreaded programs. Lookup times are also
> expected O(1) no matter how many collisions exist in modulus hash space, as
> long as there are few collisions in full 32- or 64-bit hash space. Completion
> state: Mostly finished. Just needs a little doc improvement, a few
> benchmarks and submission for review. (Phobos candidate)
Useful for me and in Phobos.
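For readers unfamiliar with randomized probing, a rough sketch follows. It is in Python and is only my guess at the general technique, not RandAA's actual design: the class `RandProbeTable`, the use of a PRNG seeded with the full hash, and the 50% load limit are all assumptions. Deletion is left out.

```python
import random


class RandProbeTable:
    """Open-addressing table with a pseudo-random probe sequence (sketch)."""

    def __init__(self, capacity=8):
        self._slots = [None] * capacity  # each slot: None or (key, value)
        self._count = 0

    def _probe(self, key, size):
        full_hash = hash(key)
        yield full_hash % size  # first try: the usual modulus slot
        # Later probes come from a PRNG seeded with the full hash, so two
        # keys share a probe sequence only if their full hashes collide.
        rng = random.Random(full_hash)
        while True:
            yield rng.randrange(size)

    def _find(self, key, slots):
        # Returns the index of the key's slot, or of the first empty slot.
        for i in self._probe(key, len(slots)):
            if slots[i] is None or slots[i][0] == key:
                return i

    def __setitem__(self, key, value):
        if 2 * (self._count + 1) > len(self._slots):
            self._grow()  # keep the table at most half full
        i = self._find(key, self._slots)
        if self._slots[i] is None:
            self._count += 1
        self._slots[i] = (key, value)

    def __getitem__(self, key):
        entry = self._slots[self._find(key, self._slots)]
        if entry is None:
            raise KeyError(key)
        return entry[1]

    def _grow(self):
        new = [None] * (2 * len(self._slots))
        for entry in self._slots:
            if entry is not None:
                new[self._find(entry[0], new)] = entry
        self._slots = new


table = RandProbeTable()
for k in range(0, 80, 8):  # these keys all collide modulo 8
    table[k] = k * k
```

Keys that land in the same modulus slot diverge after the first probe, which is where the "few collisions in full hash space" condition comes from.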
> 3. TempAlloc: A memory allocator based on a thread-local segmented stack,
> useful for allocating large temporary buffers in things like numerics code.
> Also comes with a hash table, hash set and AVL tree optimized for this
> allocation scheme. The advantages over plain old stack allocation are that
> it's independent of function calls (meaning you can return pointers to
> TempAlloc-allocated memory from a function, etc.) and it's segmented, meaning
> you can allocate huge buffers w/o risking stack overflow. Its main weakness
> is that this stack is not scanned by the GC, meaning that you can't store the
> only reference to a GC-allocated piece of memory here. However, in practice
> large arrays of primitives are an extremely common case in
> performance-critical code. I find this module immensely useful in dstats and
> Lars Kyllingstad uses it in SciD. Getting it into Phobos would make it easy
> for other scientific/numerics code to use it. Completion state: Working and
> used. Needs a little cleanup and documentation. (Phobos candidate)
Useful for me; I don't know about everyone else.
> 4. Streaming CSV Parser: Parses CSV files as they're read in, a few
> convenience functions for extracting columns into structs. If Phobos ever
> gets SQLite support I'll probably add sugar for turning a CSV file into an
> SQLite database, too. Completion state: Prototype working, needs testing,
> cleanup and documentation. (Phobos candidate)
You mean a lazy slurp? It'd be useful for everyone.
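By "lazy slurp" I mean something along these lines, sketched here in Python rather than D. The record type `Reading`, the function `stream_rows` and the two-column layout are hypothetical; the point is only that rows are parsed one at a time as the input is consumed, never loaded as a whole.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple


class Reading(NamedTuple):
    """Hypothetical record type standing in for a user-defined struct."""
    name: str
    value: float


def stream_rows(lines: Iterable[str]) -> Iterator[Reading]:
    """Lazily yield one typed record per CSV row, skipping the header."""
    reader = csv.reader(lines)
    next(reader, None)  # discard the header row, if any
    for row in reader:
        yield Reading(row[0], float(row[1]))


# Any iterable of lines works; an open file would be read incrementally.
source = io.StringIO("name,value\nalpha,1.5\nbeta,2\n")
rows = stream_rows(source)
first = next(rows)  # only the header and the first data row are parsed so far
```

Because `stream_rows` is a generator, it composes with other lazy operations the same way a range would.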
> 5. Matrix operations: SciD improvements that allow you to write matrix
> operations that look like normal math/MATLAB and optimizes them via expression
> templates so that a minimal number of temporary matrices are created.
> Uses/will use BLAS for multiplication. Completion state: Addition
> implemented. Multiplication not.
It is worth considering standardizing at least matrix expressions in Phobos. The motivation is analogous to ranges -- to run an algorithm from lib A on a matrix container from lib B. C++ would be green with envy.
I'd be glad to be part of the effort once I'm done with xml.
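As a rough illustration of what a shared matrix-expression interface could look like, here is a runtime analogue in Python. D's expression templates would resolve this at compile time, which Python cannot do, and every name here (`Expr`, `Matrix`, `Sum`, `at`, `evaluate`) is an assumption rather than anything from SciD. The `+` operator builds a lazy expression node instead of a result, so `a + b + c` allocates a single output matrix.

```python
class Expr:
    """Anything with a .shape and an .at(i, j) element accessor."""

    def __add__(self, other):
        return Sum(self, other)  # build an expression node, compute nothing


class Matrix(Expr):
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]
        self.shape = (len(self.rows), len(self.rows[0]))

    def at(self, i, j):
        return self.rows[i][j]


class Sum(Expr):
    def __init__(self, left, right):
        if left.shape != right.shape:
            raise ValueError("shape mismatch")
        self.left, self.right = left, right
        self.shape = left.shape

    def at(self, i, j):
        # Elements are computed on demand, so no temporary matrix exists.
        return self.left.at(i, j) + self.right.at(i, j)


def evaluate(expr):
    """Materialize an expression into one freshly allocated Matrix."""
    n, m = expr.shape
    return Matrix([[expr.at(i, j) for j in range(m)] for i in range(n)])


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[10, 20], [30, 40]])
c = Matrix([[100, 200], [300, 400]])
total = evaluate(a + b + c)  # one pass, one allocation
```

Any container from another library that exposes `shape` and `at` could take part in the same expression, which is the range-like interoperability I have in mind.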
> 6. Machine learning: Decision trees, KNN, Random Forest, Logistic
> Regression, SVM, Naive Bayes, etc. This would be a dstats module. Completion
> state: Decision trees prototyped, logistic regression working.
I'd find it useful, and I think anyone who's into this would too.
> 7. std.mixins: Mixins for commonly needed boilerplate code. I stopped
> working on this when Andrei suggested that making a collection of mixins into
> a module is a bad idea. I've thought about it some more and I respectfully
> disagree. std.mixins would be a one-stop shop for pretty much any boilerplate
> you need to inject, and most of this code doesn't fit in any other obvious
> place. Completion state: A few things (struct comparison, simple class
> constructors, Singleton pattern) prototyped. (Phobos candidate)
I'm afraid I also think functionality should be categorized by the purpose it serves rather than implementation technique.
> 8. GZip support in std.file: I'll leave the stream stuff for someone else,
> but just simple stuff like read(), write(), append() IMHO belongs in std.file.
> Completion state: Not started, but this is the easiest of the bunch to
> implement. (Phobos candidate)
I don't know really...
--
Tomek