Checked integer type API design poll

tsbockman via Digitalmars-d digitalmars-d at puremagic.com
Fri Sep 18 05:40:12 PDT 2015


This is in reply to the following comment which someone left on 
PollJunkie:

> I only need signed checked types and unsigned unchecked types.
> Regarding conversion from signed to unsigned, NaN should be mapped
> to T.max (thus neither garbage nor throwing an exception). And
> bitwise operators aren't very useful on signed types - whatever you
> want to do with them can be done on unsigned values and then casted
> to signed if that makes any sense at all.

First off, thanks to you and everyone else who took the time to 
fill out the poll. The feedback is helpful, and I will likely 
modify the design of `CheckedInt` in response.

For bitwise operators, it is clear from this poll, and the 
previous discussion 
(http://forum.dlang.org/thread/mfbsfkbkczrvtaqssbip@forum.dlang.org), 
that there is a fairly even split between people who are strongly 
in favour and people who are strongly opposed. We'd better just 
make them optional, or perhaps hidden; I will give some thought 
to the best way to do this.

With respect to mapping NaN to T.max (or T.min for signed types), 
this general idea has been part of Robert Schadek's work from the 
beginning; my version makes some use of it as well.

T.max (unsigned) and T.min (signed) are both good candidates for 
a sentinel value; the question is, is a NaN sentinel value really 
the right solution for the public API?

Pros:

* As an internal storage format for a `CheckedInt` type, the use 
of a sentinel value to represent NaN is very memory efficient, 
which is good for arrays and such. This will almost certainly be 
a part of the final design.

* A value like T.max would tend to stand out to an experienced 
programmer during debugging.
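
As a rough illustration of the memory argument, here is a minimal sketch comparing a sentinel-based layout with one that carries a separate flag. Both structs (`SentinelInt`, `FlaggedInt`) are hypothetical and exist only for this comparison; neither is the actual `CheckedInt` layout.

```d
// Hypothetical layouts for illustration only; not the real CheckedInt.
struct SentinelInt
{
    int raw = int.min;      // int.min is reserved to mean NaN

    bool isNaN() const { return raw == int.min; }
}

struct FlaggedInt
{
    int raw;
    bool nan = true;        // separate flag; alignment padding follows it
}

// The sentinel version costs nothing extra per element.
static assert(SentinelInt.sizeof == int.sizeof);
// The flag plus padding doubles the footprint of a 32-bit payload.
static assert(FlaggedInt.sizeof == 2 * int.sizeof);

unittest
{
    auto a = new SentinelInt[](1000);
    auto b = new FlaggedInt[](1000);
    assert(a.length * SentinelInt.sizeof == 4000);
    assert(b.length * FlaggedInt.sizeof == 8000);
    assert(a[0].isNaN());   // default-initialised elements start out as NaN
}
```

The price of the compact layout is that one value of the underlying range (here `int.min`) is no longer available as ordinary data.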

Cons:

* All it takes is a single unchecked arithmetic operation - say, 
`++` - and suddenly the sentinel value is gone, turned into 
garbage which may be very hard to distinguish from legitimate 
data at a glance.

* Sentinel values are not type-safe. In general, there is no way 
for an algorithm which accepts an unchecked `uint` as an input to 
tell if `uint.max` means "NaN" or "4294967295". An algorithm 
which needs to be NaN-aware should use a checked type, instead.
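
Both problems can be seen in a few lines. This is only a sketch: `nanSentinel` and `nextId` are made-up names, and the convention that `uint.max` means NaN is an assumption of the example rather than anything the type system knows about.

```d
// Hypothetical convention for this sketch: uint.max stands in for NaN.
enum uint nanSentinel = uint.max;

// An unchecked routine. It has no way to know whether its input
// was a genuine value or the sentinel.
uint nextId(uint id)
{
    return id + 1;  // unsigned arithmetic wraps silently in D
}

unittest
{
    uint failed = nanSentinel;          // e.g. the result of a failed cast
    assert(failed == 4_294_967_295);    // same bits as a legitimate value

    uint after = nextId(failed);
    assert(after == 0);                 // the sentinel is gone
    assert(after != nanSentinel);       // and 0 looks like ordinary data
}
```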

Of course, if you always immediately manually check the result of 
casts against the sentinel value, you would be OK. But then, why 
not maintain type safety by just manually checking 
`CheckedInt.isNaN` before casting?
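
A sketch of what checking before the cast might look like. The `CheckedInt` below is a stripped-down stand-in, and `payload` and `toUintOr` are hypothetical helpers invented for this example; the real API may look quite different.

```d
// Hypothetical stand-in for the proposed type; not the real CheckedInt API.
struct CheckedInt
{
    private int raw = int.min;          // int.min reserved as the NaN marker

    this(int v) { raw = v; }

    bool isNaN() const { return raw == int.min; }

    // Raw access; only meaningful after isNaN() has been checked.
    int payload() const { return raw; }
}

// The caller decides what NaN (or a negative value) should become
// *before* leaving the checked type, so no sentinel leaks out.
uint toUintOr(CheckedInt c, uint fallback)
{
    if (c.isNaN() || c.payload() < 0)
        return fallback;
    return cast(uint) c.payload();
}

unittest
{
    assert(toUintOr(CheckedInt(42), 0) == 42);
    assert(toUintOr(CheckedInt.init, 0) == 0);  // NaN handled explicitly
    assert(toUintOr(CheckedInt(-1), 7) == 7);   // out of range for uint
}
```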

In light of all this, I believe that guaranteeing the return of a 
sentinel value from failed casts is not worth the trouble, versus 
just making it undefined behaviour (returning garbage). Either 
way, the difference in both safety and performance is trivial, so 
I won't argue about this point if many others disagree.

Throwing an exception, on the other hand, has a clear benefit in 
preventing silent bugs, albeit one that is perhaps not worth the 
moderate performance cost and mandatory GC use.

As I said in my first post, even if exceptions are a part of the 
final design, it will certainly still be possible to avoid them 
when desired. Moreover, exceptions only naturally come up when 
*mixing* checked and unchecked types; neither exceptions nor 
frequent explicit `isNaN` checks are required as long as code is 
*consistent* about using only checked integer types (or 
floating-point).
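
To make that concrete, here is a minimal sketch in which NaN propagates through checked arithmetic and an exception can only arise at the boundary to an unchecked `int`. Everything here is an assumption made for illustration: the struct layout, the overflow test, and the `toInt` name are not taken from the actual design.

```d
import std.exception : assertThrown;

// Hypothetical sketch of a checked type; names and behaviour are assumptions.
struct CheckedInt
{
    private int raw = int.min;          // int.min doubles as the NaN marker

    this(int v) { raw = v; }

    bool isNaN() const { return raw == int.min; }

    // Checked addition: a NaN operand or an overflow produces NaN.
    CheckedInt opBinary(string op)(CheckedInt rhs) const
        if (op == "+")
    {
        if (isNaN() || rhs.isNaN())
            return CheckedInt.init;
        immutable long wide = cast(long) raw + rhs.raw;
        if (wide > int.max || wide <= int.min)
            return CheckedInt.init;
        return CheckedInt(cast(int) wide);
    }

    // Leaving the checked world is the only place that throws.
    // Note that `new Exception` allocates on the GC heap.
    int toInt() const
    {
        if (isNaN())
            throw new Exception("CheckedInt is NaN");
        return raw;
    }
}

unittest
{
    // Purely checked code: no exceptions, no intermediate isNaN tests.
    auto sum = CheckedInt(int.max) + CheckedInt(1) + CheckedInt(-5);
    assert(sum.isNaN());                // the overflow was remembered

    assert((CheckedInt(2) + CheckedInt(3)).toInt() == 5);

    // Mixing with an unchecked int is where the failure surfaces.
    assertThrown!Exception(sum.toInt());
}
```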

On the other hand, if we *don't* include exceptions in the API, 
then it becomes impossible to use the checked integer types in a 
way that is verified to be safe by the compiler - particularly if 
the silently-failing cast is made implicit.

