Changeset 442, implicit Vs explicit

Tue Apr 27 08:50:55 PDT 2010

Walter Bright:

> I have used it, though I use statically initialized arrays at all very rarely.
> An example is the _ctype[] array which leaves the upper 128 entries unspecified,
> and therefore 0.

Both Phobos and druntime seem to contain a copy of the module ctype (o.O). Each copy contains the array:
immutable ubyte _ctype[128] = [_CTL,_CTL,_CTL,_CTL,_CTL,_CTL,_CTL,_CTL,
immutable ubyte _ctype[128] = [...

In both cases the array is declared with 128 items, and I have counted that the literal contains 128 items. So I think you are wrong.

But even if you meant something like this:

immutable ubyte _ctype2[256] = [/*128 items here*/];

In my opinion something like this is less bug-prone:

@implicit_filling immutable ubyte _ctype2[256] = [/*128 items here*/];

Otherwise this will be another bit of work for future D lints :-)

> Anytime you statically initialize an array with more than a small number of values,
> it's bug-prone, even if the number of elements match the dimension. There's no check
> against transposing entries, or mis-typing the entries themselves.

I agree, literals can hide many other kinds of bugs. A mismatch between the two lengths is just one of them, and probably not even the most common. But:
- keeping this source of possible bugs doesn't help;
- having the compiler test for the equality of the two lengths at compile-time doesn't give the programmer a false sense of security, it's just a natural extra test the programmer (me) expects the compiler to perform. So I think this test is not going to increase the bug count in any case.

> Your options are to either write unit tests for the table, or have the table
> generated by another program.

One of the main points of unit testing (just like literate programming, which was present in D1 and is very common in the Haskell world, see for example: http://www.imperialviolet.org/binary/jpeg/ ) is to offer a second point of view on a block of code. This allows bugs to be spotted much more efficiently (in literate programming the second point of view is the textual description).

Tests are code too, so they can contain bugs too. So when possible you want your tests to be at a somewhat higher level of abstraction (to "summarize" the code), so that they can be shorter and hopefully contain fewer bugs than the code they test (in practice my unit tests are often longer than the code they test).

Code like this is legal D2:

// ...
enum int N_DIRECTIONS = 4;
// ...
// ...
enum string[N_DIRECTIONS] cardinal_directions = ["north", "south", "east"];
// ...

You suggest writing a unit test for the table to spot its bug, but what can I put inside this unittest?

This is not so useful, because it doesn't catch the bug:

unittest { // tests of global data
  static assert(cardinal_directions.length == 4);
}
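
The reason is that the length belongs to the declared type, so the assertion holds whatever the literal contains. A minimal sketch, assuming (as in the _ctype case above) that entries the literal leaves out keep their default value; the alias Table and the element-by-element fill are only there for illustration:

```d
enum int N_DIRECTIONS = 4;
alias Table = string[N_DIRECTIONS];

// The length is a property of the type, not of the initializer.
static assert(Table.length == N_DIRECTIONS);

unittest {
    Table t;          // default-initialized: every entry is null
    t[0] = "north";   // emulate a literal that supplies only three items
    t[1] = "south";
    t[2] = "east";

    assert(t.length == 4);   // still passes
    assert(t[3] is null);    // the missing entry goes unnoticed
}
```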

One thing I can do is to test each item, for example:

unittest { // tests of global data
  static assert(cardinal_directions == ["north", "south", "east", "west"]);
}

Or:

unittest { // tests of global data
  static assert(cardinal_directions[0] == "north");
  static assert(cardinal_directions[1] == "south");
  static assert(cardinal_directions[2] == "east");
  static assert(cardinal_directions[3] == "west");
}

Such unit tests catch the bug, but in both cases they just repeat the contents of the data, so they aren't at a higher level than the code they test. So they are not so useful.

I can't help but think that the right solution for this situation is at the compiler level, not at the unit test level.

You have not commented on the other syntax I (and others) have suggested (it has a different purpose, it's not meant to replace the item count test):
arr[$] = [/*...*/];
So I guess you are not so interested in this either.
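
Without such syntax, a helper template can approximate it. A minimal sketch, assuming the compiler deduces both the element type and the length from an array literal argument; inferLength is a hypothetical name, not an existing library function:

```d
// Hypothetical helper: T and N are both deduced from the argument.
T[N] inferLength(T, size_t N)(T[N] items) {
    return items;
}

unittest {
    auto directions = inferLength(["north", "south", "east", "west"]);

    // The static length follows the literal: no second number to keep in sync.
    static assert(is(typeof(directions) == string[4]));
    assert(directions[3] == "west");
}
```

Here the length always follows the literal, so there is nothing to check the literal against. That is why it doesn't replace the item count test.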

Bye,
bearophile