DConf 2013 keynote

Fri May 10 14:31:00 PDT 2013

On Fri, May 10, 2013 at 09:55:57PM +0200, sk wrote:
> >In any case, I totally agree that if a language *needs* an IDE in
> >order to cope with the amount of required boilerplate, then
> >something is clearly very, very wrong at a fundamental level.
> 
> May be this is true for expert or professional programmers. But for
> people like me who only use D occasionally an IDE is a must.

My point was that while an IDE is helpful (e.g. for people who aren't
professional programmers, just want to get the job done, etc.), a
language should not *depend* on an IDE to insert boilerplate. I concede
that IDEs are helpful and needed for newbies and non-professional
programmers, but the *language* itself should be usable without one.
Requiring one because the amount of boilerplate would otherwise be
unmanageable is a symptom that there is something wrong with the
language's design.

[...]
> I think lack of IDEs will prevent many beginners from trying out a
> new language. Especially after getting spoiled with IDEs like
> netbeans, visual studio etc.
[...]

Agreed. But my point was that the language shouldn't *depend* on an IDE
in order to be usable. If a language requires an IDE because you need to
insert 100 lines of boilerplate in every program you write, then one has
to wonder, why not make those 100 lines *implicit*? They are not
conveying anything useful about the program, because they will be
identical (or mostly identical) every single time. The fact that the
language did *not* make them implicit raises the question of what went
wrong in its design that you have to repeatedly specify what should
already be obvious to the language/compiler.

As they say in information theory: it is the stuff that stands out, that
is different from the rest, that carries the most information. The stuff
that's pretty much repeated every single time conveys very little
information. This is why newspaper headlines tend to leave out very
common words like "the", "a", "is", etc., because these words take up
space but convey little to no additional information -- you can drop
them and still get the gist of what the headlines are saying.

A good programming language is one where the code says all the important
things, and leaves out most of the unimportant or obvious things. It's
just like Walter said in the talk: the file-reading function without
scope guards was full of goto's and error-checking, stuff that pretty
much is (or should be) done everywhere. It clutters the code and
obscures the salient points. It's a headline with all the "the"'s,
"a"'s, "is"'s. In contrast, the version with scope guards can be read
sequentially -- all the peripheral if's and goto's are nicely abstracted
away, leaving only the salient points of the code: allocate a buffer,
read the data, return the data. A glance at the code immediately tells
you its key points. No distracting sidelines of error-checking, goto's,
labels, or any of that nonsense.

In contrast, consider a language like C. The *correct* way of writing C
code is something like this:

	int myfunc(struct A *a, struct B *b, struct C *c) {
		/* Boilerplate: to avoid slip-ups with uninitialized
		 * pointers, must always set them to NULL. */
		void *buf = NULL;
		struct D *d = NULL;

		/* Boilerplate: check for NULL pointers */
		if (!a || !b || !c)
			/* Boilerplate: everybody and their neighbour's
			 * dog defines their own set of macros for
			 * return values; how do you remember which one
			 * goes with which function(s)? */
			return INVALID_ARGS_ERROR;

		buf = malloc(some_size);

		/* Boilerplate: must check NULL return from malloc,
		 * every single time. */
		if (!buf)
			return MEMORY_ERROR;

		/* Boilerplate: every function call must be wrapped in
		 * an if-goto, because the function may have returned an
		 * error. */
		if (anotherfunc(a, buf) != OK)
			goto ERROR;

		/* And yes, technically, you need to do this for things
		 * like printf too! Guess how many C coders do this?
		 * That's right, nobody does. It's wrong, and leads to
		 * hilarious problems when stdout isn't pointing to what
		 * the programmer thought it was. Or not-so-hilarious,
		 * if stdout was closed and a database handle was
		 * reopened and reused stdout's file descriptor
		 * number...
		 */
		if (printf("Hello, world!\n") < 0)
			goto ERROR;

		/* More boilerplate */
		if ((d = create_instance_of_d()) == NULL)
			goto ERROR;

		/* Now our boilerplate needs to use a different goto
		 * label, 'cos now we have to cleanup d, whereas we
		 * didn't need to before! Can you imagine the hilarity
		 * after 20 people change this code later on in the
		 * project's life, and one of them forgets the fact that
		 * after this point a different goto label is needed? */
		if (yetanotherfunc(b, d) != OK)
			goto ERROR2;

		if (yesmoreboilerplate(c, buf) != OK)
			goto ERROR2;

		/* Finally, success! */

		/* Um... not just yet, need more boilerplate: cleanup
		 * after ourselves */
		free(d);
		free(buf);

		/* Sigh... about time we got done */
		return OK;

		/* Caveat: the labels must appear in the *reverse* of
		 * the order in which the resources were acquired, to
		 * ensure things are destroyed in the right order. Don't
		 * laugh -- I've seen "enterprise" code that gets this
		 * wrong. */
	ERROR2:
		/* Problem: by this point in the code, do you remember
		 * that d was supposed to be freed? */
		free(d);
	ERROR:
		/* Or buf? */
		free(buf);

		/* Problem: what if the caller forgets to check our
		 * return code? Or checks it against the wrong set of
		 * error macros? */
		return ERROR_CODE;

		/* Hope and pray the program won't crash when it gets
		 * back to the caller who forgets to check for error
		 * codes and just barges ahead blindly. */
	}

Note how much boilerplate is necessary to make the code work
*correctly*. (Yes, I know you can merge the error and non-error returns
by checking for NULL in buf and d, thereby getting rid of the
duplicated calls to free(), but that doesn't get rid of the problem,
just dresses it differently.) The most obvious way to write this code
leaks memory and doesn't handle errors.  In fact, the above code
actually isn't good enough: on POSIX systems printf sets errno when it
fails, and to *really* get the code right, you should be checking, and
possibly propagating, the value of errno after a failed printf call.
Yes, more boilerplate. Lots more. Every single time you call a system
function.
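
To make the parenthetical above concrete, here is a minimal sketch of
the merged-return style, which also saves errno when the output call
fails. Everything in it is made up for illustration: the status codes,
the fill_greeting helper, and the greet function are hypothetical and
not taken from any real codebase. It relies on free(NULL) being a no-op,
which the C standard guarantees.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes, invented for this sketch. */
enum status {
	OK = 0,
	INVALID_ARGS_ERROR,
	MEMORY_ERROR,
	BUFFER_TOO_SMALL,
	IO_ERROR
};

/* Hypothetical helper: copies a greeting into buf, or fails if the
 * buffer cannot hold it (including the terminating NUL). */
static enum status fill_greeting(char *buf, size_t size)
{
	static const char msg[] = "Hello, world!";

	if (size < sizeof msg)
		return BUFFER_TOO_SMALL;
	memcpy(buf, msg, sizeof msg);
	return OK;
}

/* Single-exit variant: every pointer starts out NULL and every
 * failure jumps to the same label, so the cleanup code is written
 * exactly once and label ordering is no longer an issue. */
enum status greet(FILE *out, size_t size, int *saved_errno)
{
	enum status rc = OK;
	char *buf = NULL;
	char *copy = NULL;

	if (!out || !saved_errno)
		return INVALID_ARGS_ERROR;
	*saved_errno = 0;

	buf = malloc(size ? size : 1);	/* avoid malloc(0) ambiguity */
	if (!buf) {
		rc = MEMORY_ERROR;
		goto done;
	}

	rc = fill_greeting(buf, size);
	if (rc != OK)
		goto done;

	copy = malloc(strlen(buf) + 1);
	if (!copy) {
		rc = MEMORY_ERROR;
		goto done;
	}
	strcpy(copy, buf);

	/* fprintf returns a negative value on failure; save errno
	 * right away, before any other call can overwrite it. */
	errno = 0;
	if (fprintf(out, "%s\n", copy) < 0) {
		*saved_errno = errno;
		rc = IO_ERROR;
		goto done;
	}

done:
	free(copy);	/* free(NULL) is a no-op */
	free(buf);
	return rc;
}
```

Note that this only moves the boilerplate around: every call still
needs its own check, and the caller still has to remember to look at
the return value.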

This is one of the things I can't stand about C. Seriously, C coders
should use an IDE to get basic things like this right, even if most of
them are too macho to admit it.

This is what I mean by a language being basically unusable without an
IDE. Yes, IDEs can help adoption of a language, and I don't dispute
that, but when code in that language cannot be written correctly without
an IDE (or at least, not easily), then something is horribly, horribly
wrong with that language.

(The D version of the above function, by contrast, is vastly more
readable and maintainable on almost every count, and requires no IDE to
get it right.)

T

-- 
"640K ought to be enough" -- Bill G., 1984. "The Internet is not a primary goal for PC usage" -- Bill G., 1995. "Linux has no impact on Microsoft's strategy" -- Bill G., 1999.