std.experimental.checkedint is ready for comments!

Wed Jun 15 09:40:19 PDT 2016

> PR: https://github.com/dlang/phobos/pull/4407
> DUB: http://code.dlang.org/packages/checkedint

Thanks for this work. Documentation can be seen here:

http://dtest.thecybershadow.net/artifact/website-7cc6e938154f90aa49fa6451a85b13e35ab2de99-1cfc1d833df5be7c570307da6f7bf5eb/web/phobos-prerelease/std_experimental_checkedint.html

I think there are a few considerable issues with the proposal, but also 
that all are fixable. Here are a few notes:

* The opening documentation states this proposal is concerned with 
adding checking capabilities to integral types. It is a full-blown 
package with 6 modules totaling 4690 lines as wc counts. (For 
comparison: std.experimental.allocator, which offers many facilities and 
implements a number of difficult structures and algorithms, has 11831 
lines.) That's a high budget and a high imposition on users for adding 
checks to integral types. Such extensive style of coding goes against 
the D style we want to promote. Also it is worrisome that one type 
wasn't enough (in spite of extensive parameterization with policies) and 
two are needed, with subtle differences between them that need to be 
summarized in a table. The budget I'd establish for this is one 
parameterized type in one module of manageable size. Several 
parameterized types would be okay if they characterize distinct 
abstractions and use the same backend. Anything else is an indication of 
a design run awry.

* std.experimental.checkedint says "Normally this module should not be 
imported directly." -> this doesn't bode well. I suspect that's also the 
case for std.experimental.checkedint.flags. Modules should be import units.

* Naming: "Safe" has a different meaning throughout D and Phobos. Using 
it here is liable to confuse.

* Naming: The name of a type ("Int") in a name is often a red flag, and 
indeed here it's redundant. One should be able to say Smart!int, not 
stutter SmartInt!int (or Checked!int or whatever). Also, this suggests 
that other types should be considered, how about Smart!bool and 
Smart!double?

* One of the first things I looked for was establishing bounds for 
numbers, like Smart!(int, 0, 100) for percentage. For all its might, 
this package does not offer this basic facility, and from what I can 
tell does not allow users to enforce it via policies.

* Naming: I have no idea what "N bscal;" means.

* Documentation suddenly mentions isFixedPoint without having discussed 
it. Does this package do something about fixed-point numbers?

* Not sure about the value of wrapping bsf, bsr, ilogb and probably 
others. Unlike overflow etc,. calling them with the appropriate values 
is trivial on the user side. A completeness argument could be made but 
all of these wrappers are weight of little value.

* Typos: "equiavlents"

* Documentation uses "you" in several places, colloquialism that should 
be avoided.

* The idea of a loose federation of smart operators that sanitize the 
result of usual arithmetic operators is valuable. The idea of using 
different imports for the same name depending on design is clever, but 
can be simplified. Define functions such as enforcedOp, assertedOp (you 
don't need "unary" and "binary", they can be overloads), and then:

import std.checkedint : smartOp = enforcedOp;
...
auto x = smartOp!"+"(a, b);

* Not sure why divPow2 is needed, why not some enforcedOp!"<<" etc. Same 
about pow, why not enforcedOp!"^^"?

* Getting to the design: the root of the problem is a byzantine design 
that is closed to extension. The current definition goes:

struct SafeInt(N, IntFlagPolicy _policy, Flag!"bitOps" bitOps = Yes.bitOps);

Looking at the IntFlagPolicy, it offers three canned behavior: throws, 
asserts, and noex. Users cannot customize behavior and there is no 
information passed into the policy (e.g. the operands in case of 
overflow, or the numerator in case of division by zero). Passing the 
appropriate information would make it possible to implement various 
policies, e.g. the policy cannot say "I'm actually okay with this" or 
"here I'll implement a saturation policy".

A better design would use the policy as a last-resort hook, allowing it 
to substitute its own answer for the result if it can't be computed 
safely. For example, consider addition:

     bool overflow;
     auto result = myAdd(payload, rhs.payload, overflow);
     if (!overflow) return result;
     return onOverflow(payload, rhs.payload);

The onOverflow function is a policy, which allows user code to exactly 
define what should happen (including all current policies). The same 
goes about all other hooks - multiplication, division, power etc. Each 
should be configurable.

* I see little value in free functions such as e.g. abs() because they 
are trivial one-liners. I understand the need for completeness, but it 
seems a good aspiration for consistency is being marred by a bunch of 
code pulp that really does nothing interesting. Probably not worth it.

===

This suggests a much simpler design that uses DbI to choose behaviors in 
a flexible manner. The definition would go:

struct Checked(T, Hook, T min = T.min, T max = T.max) if (...)
{
     // state {
     static if (stateSize!Hook > 0) private Hook hook;
     else private alias hook = Hook;
     private T payload;
     // }
     ...
}

The struct Hook may have no member at all, in which case Checked!T will 
be as close to plain T as possible. Things become interesting when Hook 
defines optional methods that can be detected by introspection, such as 
onOverflow, onDiv0, onOutOfRangeCmp, etc. - essentially all hooks that 
may be ever needed. Whenever an operation cannot complete with the right 
behavior, the appropriate hook is statically introspected; if 
nonexistent, fine, return with the builtin behavior. If the hook does 
exist, pass responsibility to the hook along with enough information to 
actually let the hook supplant the computation.

The hook may have state (e.g. hysteresis, NaN flag, error state, etc) so 
that's why it may be embedded within Checked.

The only remaining matter is to implement a few preexisting policies 
(Hook implementations) to implement typical choices (such as the ones 
present today), and the core algorithms for doing bounded operations. 
The most interesting algorithms are for computing the bounds of 
operations such as |, &, and ^. The D compiler needs those as well, and 
currently implements them incorrectly. I have these in my mind and I can 
help with those.

Thanks,

Andrei