std.experimental.checkedint is ready for comments!

Wed Jun 15 11:34:15 PDT 2016

On Wednesday, 15 June 2016 at 16:40:19 UTC, Andrei Alexandrescu 
wrote:
> Thanks for this work.
> [...]
> I think there are a few considerable issues with the proposal, 
> but also that all are fixable.

Your message was very long, so for the moment I'm going to filter 
it down to just the high-level design criticism. (The rest is 
unimportant unless/until we reach consensus on the design, 
anyway.)

> * The opening documentation states this proposal is concerned 
> with adding checking capabilities to integral types. It is a 
> full-blown package with 6 modules totaling 4690 lines as wc 
> counts. (For comparison: std.experimental.allocator, which 
> offers many facilities and implements a number of difficult 
> structures and algorithms, has 11831 lines.) That's a high 
> budget and a high imposition on users for adding checks to 
> integral types. Such extensive style of coding goes against the 
> D style we want to promote.
> [...]
> The budget I'd establish for this is one parameterized type in
> one module of manageable size. Several parameterized types would
> be okay if they characterize distinct abstractions and use the 
> same backend. Anything else is an indication of a design run 
> awry.

This can be summarized as, "It's too big and complicated."

`checkedint` as it stands is, I believe, fairly close to the 
smallest implementation possible in D today, within the 
constraints of the features demanded by the community in past 
discussions, coupled with my high-level design goals.

If you want something shorter or simpler, you will have to cut 
features or compromise the design in other ways. (Or improve the 
D language to facilitate a more elegant design.)

Some features and design goals that combine to motivate my design:

    1) Checked types must signal an error (somehow) whenever their behaviour
       deviates from that of an ideal mathematical integer.

    2) It should be possible to recover from errors - using `assert(false)` or
       a deliberate divide-by-zero to crash the program is bad design unless
       the condition that triggers it is never supposed to happen, ever.

    3) Performance (with respect to both speed and memory use) should be as
       close as possible to that of the built-in machine integer types.

    4) The API should minimize verbosity and ceremony, because otherwise hardly
       anyone will use it - people generally prefer convenience over safety.

    5) Writing generic code that works correctly with both checked and unchecked
       types must be easy.

    6) The API must make safe usage easy, and (accidental) unsafe usage hard,
       because people generally don't pay much attention to the docs (even if
       they're good). A false sense of security is worse than none at all.

    7) The API must be usable in `nothrow @nogc` code.

    8) The number of distinct template instantiations generated in natural use
       must be finite, to prevent excessive combinatorial explosion when
       checked types are used in public APIs. (Templates that are just aliases,
       and small functions that always inline don't count against this.)
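
As a minimal illustration of what (1) is guarding against, here is a sketch using only the built-in types, not `checkedint` itself. The helper names `addWrapping` and `lessMixed` are made up for this example.

```d
// Hypothetical helper: built-in signed addition wraps around silently.
int addWrapping(int a, int b)
{
    return a + b;
}

// Hypothetical helper: in a mixed comparison the `int` operand is
// converted to `uint` first, so a negative value becomes a huge one.
bool lessMixed(int a, uint b)
{
    return a < b;
}

void main()
{
    // An ideal integer would give int.max + 1; the machine type wraps.
    assert(addWrapping(int.max, 1) == int.min);

    // Mathematically -1 < 1, but the built-in comparison disagrees.
    assert(!lessMixed(-1, 1));
}
```

Neither case produces any error or warning today, which is the kind of silent deviation a checked type has to report.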

> Also it is worrisome that one type wasn't enough (in spite of extensive
> parameterization with policies) and two are needed, with subtle differences
> between them that need to be summarized in a table.

The reason for the `SmartInt` versus `SafeInt` split is that, with 
D's current semantics, (4) and (5) conflict.

`SmartInt` prioritizes (4); `SafeInt` and `DebugInt` prioritize 
(5).

> Getting to the design: the root of the problem is a byzantine 
> design that is closed to extension.

The design was closed deliberately because of (8). Template bloat 
is a major concern, even with the current finite design.

I want `checkedint` to be usable in public APIs, and that 
requires some standardization of error handling and base types to 
be enforced upon the users. Otherwise, everyone will choose 
something different and all template instantiations involving 
integer types will become practically single-use.

> Looking at the IntFlagPolicy, it offers three canned behavior: 
> throws, asserts, and noex.

The choice of policies is motivated by the natural 
incompatibility of (2), (4), (6), and (7). I built in enough 
variety to allow people to choose their own priorities among 
those goals, and no more, because of (8).

> * One of the first things I looked for was establishing bounds 
> for numbers, like Smart!(int, 0, 100) for percentage. For all 
> its might, this package does not offer this basic facility, and 
> from what I can tell does not allow users to enforce it via 
> policies.

Here you are suggesting adding even more complexity to a design 
that you have already de facto rejected as overly complex. As 
discussed earlier in this very thread, I studied adding support 
for arbitrary bounds and decided not to pursue that right now 
because implementing it in a performant way would greatly 
increase the complexity of `checkedint`, and make the template 
bloat problem much worse.

> Also, this suggests that other types should be considered, how 
> about Smart!bool and Smart!double?

Neither `bool` nor `double` has the kind of severe-but-fixable 
safety and correctness issues that the machine integer types do, 
which motivates the `checkedint` design. No doubt someone can 
make up some sort of meaning to attach to those symbols, but it 
will most likely have nothing to do with `SmartInt`.

> * Not sure why divPow2 is needed, why not some enforcedOp!"<<"

Because, although similar, a bit shift is semantically different 
from dividing or multiplying by a power of two: the bit shift 
implies different rules for rounding and overflow.

I realized in testing that even for `SmartInt`, the bit shift 
semantics are still useful sometimes, and decided that it was 
better not to confuse the two. A `smartOp` version of the bit 
shifts is necessary because the built-in bit shifts have some 
undefined behaviour that needs to be fixed.
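
A small sketch of the difference, using built-in `int` and druntime's `core.checkedint` rather than `divPow2` or `smartOp` themselves. The helper names are hypothetical and exist only for this example.

```d
import core.checkedint : muls;

// Hypothetical helpers that isolate the two operations being compared.
int halveByDivision(int x) { return x / 2; }   // rounds toward zero
int halveByShift(int x)    { return x >> 1; }  // rounds toward negative infinity

void main()
{
    // Same result for non-negative values...
    assert(halveByDivision(6) == 3 && halveByShift(6) == 3);

    // ...but different rounding for negative odd values.
    assert(halveByDivision(-3) == -1);
    assert(halveByShift(-3) == -2);

    // Overflow differs too: the left shift just drops the high bit,
    // while a checked multiply by 2 reports the overflow.
    int big = int.max;
    assert((big << 1) == -2);

    bool overflow = false;
    const doubled = muls(big, 2, overflow);
    assert(overflow);
}
```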

> etc. Same about pow, why not enforcedOp!"^^"?

`pow()` exists as a free function to satisfy (5), and because the 
`^^` and `^^=` operators both have language-level bugs that 
currently make their use incompatible with (6).

> I see little value in free functions such as e.g. abs() because 
> they are trivial one-liners. I understand the need for 
> completeness, but it seems a good aspiration for consistency is 
> being marred by a bunch of code pulp that really does nothing 
> interesting. Probably not worth it.

`checkedint`-aware versions of functions like `abs()` are 
necessary to satisfy (4) and (6) together.

> The hook may have state (e.g. hysteresis, NaN flag, error 
> state, etc) so that's why it may be embedded within Checked.

The early iterations of `checkedint` worked this way (although I 
had no plans to support user-defined hooks). I implemented and 
debugged it, and thought it was about ready to submit many months 
ago.

Then I actually tried *using* it, and hated it. The problem with 
using a NaN state is that in nothrow code you have to manually 
check it before *every* call to an external 
(non-`checkedint`-aware) function, or you may accidentally lose 
it.

As a result, safe NaN-based APIs violate (4), while concise APIs 
violate (6). The IEEE-inspired sticky flags feature is my 
solution to this problem, and it is far more pleasant to work 
with in practice - as well as faster and more memory efficient.

Meanwhile, in code that allows exceptions, there is no reason to 
pay the speed and memory penalty of carting around the NaN state 
to check later - just throw on the spot whenever something goes 
wrong.
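
A rough sketch of the two styles, written against druntime's `core.checkedint` (whose overflow flag is likewise sticky) rather than the proposed package. `mulAdd` and `addOrThrow` are hypothetical names for illustration only, and the real `checkedint` API may look quite different.

```d
import core.checkedint : adds, muls;

// Hypothetical helper for `nothrow @nogc` code: computes a*b + c.
// Each operation sets `overflow` on error and never clears it, so a
// single check after the whole computation is enough.
int mulAdd(int a, int b, int c, ref bool overflow) nothrow @nogc
{
    const product = muls(a, b, overflow);
    return adds(product, c, overflow);
}

// Hypothetical helper for code that allows exceptions: nothing is
// carried around, the error is raised on the spot.
int addOrThrow(int a, int b)
{
    bool overflow = false;
    const sum = adds(a, b, overflow);
    if (overflow)
        throw new Exception("integer overflow");
    return sum;
}

void main()
{
    bool overflow = false;
    assert(mulAdd(3, 4, 5, overflow) == 17 && !overflow);

    // The multiplication overflows and the flag is raised.
    const bad = mulAdd(int.max, 2, 0, overflow);
    assert(overflow);

    // A later, successful computation does not clear the flag.
    assert(mulAdd(1, 1, 1, overflow) == 2 && overflow);

    bool thrown = false;
    try
        addOrThrow(int.max, 1);
    catch (Exception e)
        thrown = true;
    assert(thrown);
}
```

Because the result values stay plain `int`, they can be handed to non-`checkedint`-aware functions without losing the error state, which lives in the flag instead.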