Human unreadable documentation - the ugly seam between simple D and complex D

Thu Mar 26 12:32:51 PDT 2015

There is a discussion about D vs Go going on in several 
threads(yey for multithreading!), and one thread is about an 
article by Gary Willoughby that claims that Go is not suitable 
for sophisticated 
programmers(http://forum.dlang.org/thread/mev7ll$mqr$1@digitalmars.com). 
What's interesting about this one is the reddit comments, which 
turned into an argument over simple languages that average 
programmers can use versus complex languages that only the top 1% 
of programmers can use, but from which they can extract more.

But the thing is - the world of the top programmers is not really 
separate from that of average programmers. Professional 
development teams can have a few top programmers and many average 
ones, all working on the same project. Open source projects can 
have top programmers working on the core while allowing average 
programmers to contribute some simple features. Top programmers 
can write libraries that can be used by average programmers.

To allow these things, top programmers and average programmers 
should be able to work in the same language. Of course, any 
language that average programmers can master should be easy for a 
top programmer to master - but the thing is, we also want the top 
programmer to be able to get more out of the language, without 
being limited by its over-simplicity. This will also benefit the 
average programmers, since it improves the quality of the 
libraries and modules they are using.

This idea is nothing new, and was mentioned in the main(=biggest) 
current D vs Go 
thread(http://forum.dlang.org/thread/mdtago$em9$1@digitalmars.com?page=3#post-jeuhtlocousxtezoaqqh:40forum.dlang.org). 
What I want to talk about here is the seams. The hurdles that in 
practice make this duality harder.

Let's compare it to another duality that D(and many other 
languages, mainly modern systems languages) promotes - the 
duality between high-level and low-level. Between write-code-fast 
and write-fast-code.

The transition between high-level and low-level code in D 
consists of a change in the things used - which language 
constructs, which idioms, which functions. But there aren't any 
visible seams. You don't need to use FFI or to dynamically load a 
library file written in another language or anything like that - 
you simply write the high-level parts like you would write 
high-level code and the low-level parts like you would write 
low-level code, and they just work together.
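
As a minimal sketch of what that looks like (the function names below are made up for illustration, not taken from any library), here is a range pipeline and a raw pointer loop living in the same module and calling into each other without any glue:

```d
import std.algorithm : filter, map, sum;
import std.array : array;
import std.range : iota;

// Low-level style: walk the slice's memory with raw pointers.
int sumOfSquaresLowLevel(const(int)[] data) @system
{
    int total = 0;
    const(int)* p = data.ptr;
    const(int)* end = p + data.length;
    for (; p != end; ++p)
        total += (*p) * (*p);
    return total;
}

// High-level style: a range pipeline that picks the even numbers
// in 1 .. n (inclusive) and hands them to the low-level function.
int sumOfEvenSquares(int n)
{
    int[] evens = iota(1, n + 1).filter!(x => x % 2 == 0).array;
    return sumOfSquaresLowLevel(evens);
}

unittest
{
    // The same result computed entirely with ranges.
    auto expected = iota(1, 11).filter!(x => x % 2 == 0).map!(x => x * x).sum;
    assert(sumOfEvenSquares(10) == expected);
}
```

Compiling with `-unittest` runs the check at the bottom. The high-level function calls the low-level one directly, with no wrapper or binding layer in between.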

The duality between high-level D and low-level D is seamless. The 
duality between simple D and complex D - not so much.

The seams here exist mainly in understanding how to use complex 
code from simple code. Let's take std.algorithm(.*) for example. 
The algorithms' implementations there are complex and use 
advanced D features, but using them is very simple. Provided, of 
course, that you know how to use them(and no - not everything 
that you know becomes simple. I know how to solve ordinary 
differential equations, but it's still very complex to do so).
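
For instance, here is a small sketch of that simple side (the helper name `sortedUnique` is made up for illustration). The caller never has to look at how `sort` and `uniq` are implemented:

```d
import std.algorithm : sort, uniq;
import std.array : array;

// Returns the distinct values in ascending order.
// Note that sort works in place, so the caller's array gets reordered.
int[] sortedUnique(int[] values)
{
    return sort(values).uniq.array;
}

unittest
{
    assert(sortedUnique([3, 1, 2, 3, 1]) == [1, 2, 3]);
}
```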

The problem, as Andrei Alexandrescu pointed 
out(http://forum.dlang.org/thread/mdtago$em9$1@digitalmars.com?page=6#post-mduv1i:242169:241:40digitalmars.com), 
is learning how to use them. Ideally you'd want to be able to 
look at a function's signature and learn from that how to use it. 
Its name and return type should tell you what it does and its 
argument names and types should tell you what to send to it. The 
documentation is only there for a more thorough description and 
to warn you about pitfalls and edge cases.

But when it comes to heavily templated functions - understanding 
the signature is HARD. It's hard enough for the top programmers 
that can handle the complex D features - it's much harder for the 
average programmers that could have easily used these functions 
if they could just understand the documentation.

Compare it, for example, to Java. Even if a library doesn't 
contain a single documentation comment, the auto-generated 
javadoc that contains just the class tree and method signatures 
is usually enough to get an idea of what's going where. In D, 
unless the author has provided some actual examples, you are 
going to have a hard time trying to sort out these complex 
templated signatures...

That's quite a hurdle to go through when wanting to use complex 
code from simple code(or even from other complex code). That's 
the ugly seam I'm talking about.

Now, if you are working on a big project(be it commercial or 
open-source), you can find lots of examples of how to use these 
complex functions, and that's probably how you'd tackle the 
problem. When you are using some library you usually don't have 
that luxury - but these libraries usually have the generated ddoc 
at their website. Of course - that generated ddoc is full of 
complex templated signatures, so that's not very helpful...

So, what can be done? Maybe the ddoc generator, instead of 
writing the whole signature as-is, can emit a more human-readable 
version of it?

Let's look at the example Andrei mentioned - startsWith. Let's 
take a look at the first overloaded signature:

uint startsWith(alias pred = "a == b", Range, Needles...)(Range 
doesThisStart, Needles withOneOfThese) if (isInputRange!Range && 
Needles.length > 1 && is(typeof(.startsWith!pred(doesThisStart, 
withOneOfThese[0])) : bool) && 
is(typeof(.startsWith!pred(doesThisStart, withOneOfThese[1..$])) 
: uint));

Let's break it down and see what the user needs in order to use 
the function:

`uint` - the return type. Needed.
`startsWith` - the function name. Needed.
`(alias pred = "a == b",` - a template argument that the user 
might want to supply - Needed.
`Range, Needles...)` - template arguments that should usually be 
inferable. The function won't work without them, but since the 
user doesn't actually supply them - I'll mark them as not needed.
`(Range doesThisStart, Needles withOneOfThese)` - the function's 
arguments. Needed.

The rest are constraints that check the template arguments. They 
aren't needed when you try to use the function - though they 
might be helpful in figuring out why the compiler yells at you 
when you use it wrong.
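
To see how little of this the caller actually touches, here is a short sketch of calling this overload. The custom predicate (comparing last decimal digits) is made up purely for illustration:

```d
import std.algorithm : startsWith;

unittest
{
    // Range and Needles are inferred from the arguments. The result is the
    // 1-based position of the needle that matched, or 0 if none did.
    assert("hello world".startsWith("foo", "hello") == 2);
    assert("hello world".startsWith("foo", "bar") == 0);

    // pred is the only template argument the caller supplies explicitly.
    assert([1, 2, 3].startsWith!((a, b) => a % 10 == b % 10)([11, 12], [5]) == 1);

    // The constraints only surface on misuse: an int is not an input range,
    // so no overload accepts this call.
    static assert(!__traits(compiles, startsWith(42, 1, 2)));
}
```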

So, if we take only the needed parts, we get this signature:

uint startsWith(alias pred = "a == b")(Range doesThisStart, 
Needles withOneOfThese);

Well, doesn't this look much easier to grasp? Of course, it omits 
some very critical information. It doesn't tell you what 
`Range` and `Needles` are - you can look for these types in the docs 
and find nothing. It also doesn't tell you that `Needles` is 
variadic.

Well - what's stopping us from adding this information *below* 
the signature? What if ddoc would generate something like this:

uint startsWith(alias pred = "a == b")(Range doesThisStart, 
Needles... withOneOfThese);
   where:
     Range is an inferred template argument
     Needles is a variadic inferred template argument
     isInputRange!Range
     Needles.length > 1
     is(typeof(.startsWith!pred(doesThisStart, withOneOfThese[0])) : bool)
     is(typeof(.startsWith!pred(doesThisStart, withOneOfThese[1..$])) : uint)

We've broken the signature into the parts required to use the 
function and the parts required to FULLY understand the previous 
parts. The motivation is that the second group of parts is also 
important, so it needs to be there, but it creates a lot of 
unneeded noise so it shouldn't be a direct part of the 
signature(at least not in the doc). It's similar to the docs of 
other types used in the signature - it's important to have these 
docs somewhere accessible, but you don't want to add them in the 
middle of the signature because it'll make it unreadable.

This idea, of course, is not a fully cooked proposal yet. We 
need a way to tell ddoc which template arguments are supposed to 
be inferred(can this always be done automatically?) and the last 
two entries in my example are not super-trivial to grok(I can 
rewrite them by-hand to make them super-simple - but can ddoc do 
it automatically? and how?). The point of this thread is to start 
a discussion about making ddoc generate documentation that is 
more... well... human readable.