Phobos Wish List/Next in Review Queue?

Sat Nov 19 19:02:33 PST 2011

Now that we've got a lot of contributors to Phobos and many projects in 
the works, I decided to start a thread to help us make a rough plan for 
Phobos's short-to-medium term development.  There are three goals here:

1.  Determine what's next in the review queue after std.csv (voting on 
std.csv ends tonight, so **please vote**).

2.  Come up with a wish list of high-priority modules that Phobos is 
missing that would make D a substantially more attractive language than 
it is now.

3.  Figure out who's already working on what from the wish list and what 
bottlenecks, if any, are getting in the way and what can be done about them.

The following is the wish list as I see it.  Please suggest additions 
and correct any errors, as this is mostly off the top of my head.  Also, 
if you're working on any of these and anything substantial has changed, 
a status update would be appreciated.

*  Some higher-level networking support, such as HTTP, FTP, etc.  (Jonas 
Drewsen's CURL wrapper handles a lot of this and may be ready for a 
second round of review.)

*  Serialization.  (Jacob Carlborg's Orange library might be a good 
candidate.  IIRC he said it's close to ready for review.)

*  Encryption and hashing.  (This is more an implementation problem than 
a design problem and AFAIK no one is working on it.)

*  Containers.  (AFAIK no one is working on this.  It's tough to get 
started because, despite lots of discussion at various times on this 
forum, no one seems to really know what they want.  Since the containers 
in question are well-known, it's much more a design problem than an 
implementation problem.)

*  Allocators.  (I think Phobos desperately needs a segmented 
stack/region based allocator and I've written one.  I've also tried to 
define a generic allocator API, mostly following Andrei's suggestions, 
but I'll admit that I didn't really know what I was doing for the 
general API.  Andrei has suggested that allocators should have 
real-world testing on containers before being included in Phobos. 
Therefore, containers block allocators, and if the same person doesn't 
write both, there will be a lot of communication overhead to keep the 
designs in sync.)

*  Streams.  (Another item where the bottleneck is mostly at the design 
level and people not really knowing what they want.)

*  Compression/archiving.  (Opening standard compressed/archived file 
formats needs to just work.  This includes at least zip, gzip, tar and 
bzip2.  Of course, zip is already available and gzip is supported by the 
zlib module, but only through a crufty C API.  At least gzip and bzip2, 
which are stream-based as opposed to file-based, should be handled via 
streams, which means that streams block compression/archiving.  Also, 
since tar and zip are both file-based, they should probably be handled 
by the same API, which might mean deprecating std.zip and rewriting it.)

*  An improved std.xml.  (I think Tomek Sowiński is working on a 
replacement, but I haven't seen any updates in a long time.)

*  Matrices and linear algebra.  (Cristi Cobzarenco's GSoC project is a 
good starting point but it needs polish.  I've been in contact with him 
occasionally since GSoC ended and he indicated that he wants to get back 
to working on it but doesn't have time.  I've contributed to it 
sparingly, but find it difficult because I haven't gotten around to 
familiarizing myself with the implementation details yet, and it's hard 
to get into a project that complex with a few hours a week as opposed to 
focusing full time on it.)

*  std.database.  (Apparently Steve Teale is working on this.  This is a 
large, complicated project because we're trying to define a common API 
for a variety of RDBMSs.  Again, it's more a design problem than an 
implementation problem.)

*  Better support for creating processes/new std.process.  (Lars 
Kyllingstad wrote a replacement candidate for POSIX and Steve 
Schveighoffer ported it to Windows, but issues with the DMC runtime 
prevent it from working on Windows.)

*  Parallel algorithms.  (I've implemented a decent number of these in 
my std.parallel_algorithm GitHub project, but I've become somewhat 
frustrated and unmotivated to finish this project because so many of the 
relevant algorithms seem memory-bandwidth bound and aren't substantially 
faster when parallelized than when run serially.)

After writing this, the general pattern I notice is that lots of stuff 
is blocked by design, not implementation.  In a lot of cases people 
don't really know what they want and analysis paralysis results.