Voting for std.experimental.checkedint

Sun Feb 26 13:15:13 PST 2017

On 2/26/17 4:53 AM, Seb wrote:
> On Sunday, 26 February 2017 at 09:41:46 UTC, rumbu wrote:
>> On Saturday, 25 February 2017 at 15:21:10 UTC, Andrei Alexandrescu wrote:
>>> On 02/25/2017 10:17 AM, rumbu wrote:
>>>> A lot of bloat code for something extremely basic.
>>>
>>> If you can do it with less code, I'm all ears. Thanks! -- Andrei
>>
>> This was not about coding skills, was about usability. The module
>> contains too many options and failure scenarios instead of a simple
>> default behavior.
>>
>> Considering that in most languages with integrated overflow checking,
>> the default behavior is throwing some kind of exception (Ada, C#,
>> Pascal, Rust, Swift)
>
> If you want a module with a lot less features, the low-level
> core.checkedint might be interesting for you:
>
> http://dlang.org/phobos/core_checkedint.html

Thanks for making this point. I agree with the sentiment "you mean I 
need a 1 KLOC library just to check a handful of operations?" This 
paradox is very interesting and worth looking into.

(BTW the number of lines as dscanner --sloc counts is 1261.)

Indeed, the routines in core.checkedint are everything needed (in 
addition to some inline code for comparisons) if the purpose is to check 
operations individually. However, if the intent is to check for errors 
systematically for certain values or program fragments, that doesn't 
scale; before long, the code becomes a bloatfest. Not to mention the 
difficulty in making sure that all operations of interest have been, in 
fact, covered.
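
To make the scaling problem concrete, here is a minimal sketch of the operation-by-operation style. It is written in C++ rather than D, and checked_add, checked_mul, and buffer_size are hypothetical helpers that only imitate the sticky-flag convention of core.checkedint; returning 0 on overflow is an arbitrary choice made for this sketch.

```cpp
#include <cassert>
#include <climits>

static_assert(sizeof(long long) > sizeof(int),
              "this sketch assumes long long is wider than int");

// Hypothetical helper: returns a + b, or sets the sticky flag on overflow.
inline int checked_add(int a, int b, bool& overflow) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        overflow = true;   // never cleared here, so it can be tested once at the end
        return 0;          // arbitrary placeholder result
    }
    return a + b;
}

// Hypothetical helper: returns a * b, or sets the sticky flag on overflow.
inline int checked_mul(int a, int b, bool& overflow) {
    long long exact = static_cast<long long>(a) * b;  // computed in a wider type
    if (exact > INT_MAX || exact < INT_MIN) {
        overflow = true;
        return 0;
    }
    return static_cast<int>(exact);
}

// Every arithmetic step has to be routed through a helper by hand;
// a single forgotten '*' or '+' silently goes unchecked.
inline int buffer_size(int width, int height, int header, bool& overflow) {
    int pixels = checked_mul(width, height, overflow);
    int bytes  = checked_mul(pixels, 4, overflow);
    return checked_add(bytes, header, overflow);
}

inline void example_individual_checks() {
    bool overflow = false;
    assert(buffer_size(640, 480, 54, overflow) == 1228854);
    assert(!overflow);
    buffer_size(100000, 100000, 54, overflow);  // 10^10 does not fit in int
    assert(overflow);
}
```

Three operations already need three helper calls and a flag threaded through each of them, which is the bulk the paragraph above refers to.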

So the next logical step is to encapsulate these checks in a type. This 
is where, one way or another, the code bulk must increase, and the key 
question is how much ability to customize you get per unit of code 
increase.

One issue with checked integers in general, and with a standard (i.e. 
highly reusable) library in particular, is that the requirements are 
quite project-specific: what to do upon violation, which operations to 
verify, and which to let run at full speed. The usability and efficiency 
margins are so narrow that as soon as a library does something even 
slightly different from what's necessary, you need to throw the library 
away and write your own. This is the opposite of, say, writing a sorting 
algorithm, wherein the API design is very narrow and the difficulty is 
in the algorithm itself. So if you want to write a highly reusable 
checkedint library, you must put the ability to customize front and 
center.
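
As a rough illustration of what "ability to customize" can look like, the sketch below wraps an int in a type whose reaction to overflow is supplied as a template parameter. It is C++, not D, and CheckedInt, ThrowOnOverflow, and Saturate are hypothetical names invented for this example; it is not the API of std.experimental.checkedint or of any library discussed here, and it covers only addition.

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>

static_assert(sizeof(long long) > sizeof(int),
              "this sketch assumes long long is wider than int");

// Hypothetical policy: report the violation by throwing.
struct ThrowOnOverflow {
    static int on_overflow(long long) {
        throw std::overflow_error("integer overflow");
    }
};

// Hypothetical policy: clamp to the nearest representable value.
struct Saturate {
    static int on_overflow(long long exact) {
        return exact > 0 ? INT_MAX : INT_MIN;
    }
};

// The wrapper performs the check once, inside the operator,
// and leaves the decision about what to do to the policy.
template <class Hook>
class CheckedInt {
    int value_;
public:
    explicit CheckedInt(int v) : value_(v) {}
    int get() const { return value_; }

    friend CheckedInt operator+(CheckedInt a, CheckedInt b) {
        long long exact = static_cast<long long>(a.value_) + b.value_;
        if (exact > INT_MAX || exact < INT_MIN)
            return CheckedInt(Hook::on_overflow(exact));
        return CheckedInt(static_cast<int>(exact));
    }
};

inline void example_policies() {
    // Same operation, two different project-specific reactions.
    CheckedInt<Saturate> s = CheckedInt<Saturate>(INT_MAX) + CheckedInt<Saturate>(1);
    assert(s.get() == INT_MAX);

    bool threw = false;
    try {
        CheckedInt<ThrowOnOverflow> t =
            CheckedInt<ThrowOnOverflow>(INT_MAX) + CheckedInt<ThrowOnOverflow>(1);
        (void)t;
    } catch (const std::overflow_error&) {
        threw = true;
    }
    assert(threw);
}
```

The point of the design is that the checking code is written once while the error handling varies per project. A real library would also have to decide, per operation, whether to check at all.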

I've started work on an article on DbI, and did a little research on 
other libraries. I found these:

* Mozilla's CheckedInt: 
https://hg.mozilla.org/mozilla-central/file/tip/mfbt/CheckedInt.h, 
clocking in at only 791 LoC (no docs and unittests). Though compact and 
ingenious, it makes two design decisions that I think are problematic: 
(a) it stores a "valid" bit (which costs an actual word) together with 
the integral value 
(https://hg.mozilla.org/mozilla-central/file/tip/mfbt/CheckedInt.h#l503), 
which leads to an inefficient layout and also puts all enforcement onus 
on the user; and (b) it separates overflow checks from the actual 
operations, which leads to bulky and inefficient overflow checks (see 
e.g. 
https://hg.mozilla.org/mozilla-central/file/tip/mfbt/CheckedInt.h#l256 
for addition).

* https://safeint.codeplex.com by Microsoft - a behemoth of a library 
clocking in at 7055 LoC including comments. Speed is an explicit goal. It 
makes a number of design decisions that might not work for everyone, for 
example:

- accepts (somewhat obliquely) implicit conversion back to the 
representation type, which rather defeats the purpose

- taking the address decays to a pointer to unchecked integral (what?)

- has a rigid error policy (either assert or throw)

- the checks and the error handling policies are awkwardly controlled 
via command line instead of template parameters

- binary operators don't work against two SafeInts

- signed/unsigned comparisons are not checked (this is a consequence of 
the implicit decay)

* https://github.com/robertramey/safe_numerics, meant as an addition to 
Boost. That's also a large library (4969 lines with light comments, going 
up to over 10K lines with unittests, and requiring 6 other Boost 
libraries: MPL, Integer, Config, Concept Checking, Tribool, and 
Enable_if). The author also wrote a recent article (Overload Feb 2017) 
that describes the library: 
http://www.rrsd.com/software_development/safe_numerics/Overload137.pdf. 
The article does a great job of motivating such libraries. The facility 
allows good error policy customization, and to some extent allows 
customizing the checks being done (only for promotions). It also has a 
mode that is at least theoretically interesting: it expands the result 
of operations whenever possible to preserve correctness, and refuses to 
compile code that might overflow. I speculate that this feature is of 
very limited use; in just a couple of steps everything goes to 64 bits, 
and we're done. The implementation has the usual genuflections one would 
expect, for example:

    template<class T, class U>
     using calculate_max_t =
         typename boost::mpl::if_c<
             // clause 1 - if both operands have the same sign
             std::numeric_limits<T>::is_signed
             == std::numeric_limits<U>::is_signed,
             // use that sign
             typename boost::mpl::if_c<
                 std::numeric_limits<T>::is_signed,
                 std::intmax_t,
                 std::uintmax_t
             >::type,
         // clause 2 - otherwise if the rank of the unsigned type exceeds
         // the rank of the maximum signed type
         typename boost::mpl::if_c<
             (rank< select_unsigned<T, U>>::value
             > rank< std::intmax_t >::value),
             // use unsigned type
             std::uintmax_t,
         // clause 3 - otherwise if the type of the signed integer type can
         // represent all the values of the unsigned type
         typename boost::mpl::if_c<
             std::numeric_limits< std::intmax_t >::digits >=
             std::numeric_limits< select_unsigned<T, U> >::digits,
             // use signed type
             std::intmax_t,
         // clause 4 - otherwise use unsigned version of the signed type
             std::uintmax_t
         >::type >::type >::type;

* https://code.dlang.org/packages/checkedint by Thomas Stuart Bockman. I 
might be biased, but it compares favorably against all of the above. A 
major goal of std.experimental.checkedint was to allow more 
customization in a smaller package.
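
Two observations from the survey above can be made concrete with a short C++ sketch: the layout cost of storing a validity flag next to the value, and what happens to a signed/unsigned comparison once the operands decay to raw integers. FlaggedInt, builtin_less, and proper_less are hypothetical names used only for illustration and are not taken from any of the libraries listed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a checked integer that carries a validity flag.
struct FlaggedInt {
    std::int32_t value;
    bool valid;   // the caller is responsible for testing this
};

// The flag cannot be free: the object must be larger than the bare integer
// (typically twice as large on mainstream ABIs because of alignment padding).
static_assert(sizeof(FlaggedInt) > sizeof(std::int32_t),
              "the validity flag enlarges the object");

// What the built-in '<' computes for int vs. unsigned: the signed operand
// is converted to unsigned first, spelled out explicitly here.
inline bool builtin_less(int a, unsigned b) {
    return static_cast<unsigned>(a) < b;
}

// A comparison that respects the mathematical values of both operands.
inline bool proper_less(int a, unsigned b) {
    return a < 0 || static_cast<unsigned>(a) < b;
}

inline void example_survey_points() {
    assert(!builtin_less(-1, 1u));  // -1 wraps to a huge unsigned value
    assert(proper_less(-1, 1u));    // mathematically, -1 < 1
    assert(proper_less(1, 2u));
    assert(!proper_less(2, 1u));
}
```

A wrapper that converts implicitly to its representation type ends up with the first comparison, which is why implicit decay and unchecked mixed-sign comparisons go together.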

Andrei