std.experimental.checkedint is ready for comments!
tsbockman via Digitalmars-d
digitalmars-d at puremagic.com
Wed Jun 15 11:34:15 PDT 2016
On Wednesday, 15 June 2016 at 16:40:19 UTC, Andrei Alexandrescu
wrote:
> Thanks for this work.
> [...]
> I think there are a few considerable issues with the proposal,
> but also that all are fixable.
Your message was very long, so for the moment I'm going to filter
it down to just the high-level design criticism. (The rest is
unimportant unless/until we reach consensus on the design,
anyway.)
> * The opening documentation states this proposal is concerned
> with adding checking capabilities to integral types. It is a
> full-blown package with 6 modules totaling 4690 lines as wc
> counts. (For comparison: std.experimental.allocator, which
> offers many facilities and implements a number of difficult
> structures and algorithms, has 11831 lines.) That's a high
> budget and a high imposition on users for adding checks to
> integral types. Such extensive style of coding goes against the
> D style we want to promote.
> [...]
> The budget I'd establish for this is one parameterized type in
> one module of manageable size. Several parameterized types would
> be okay if they characterize distinct abstractions and use the
> same backend. Anything else is an indication of a design run
> awry.
This can be summarized as, "It's too big and complicated."
`checkedint` as it stands is, I believe, fairly close to the
smallest implementation possible in D today, within the
constraints of the features demanded by the community in past
discussions, coupled with my high-level design goals.
If you want something shorter or simpler, you will have to cut
features or compromise the design in other ways. (Or improve the
D language to facilitate a more elegant design.)
Some features and design goals that combine to motivate my design:
1) Checked types must signal an error (somehow) whenever their behaviour
   deviates from that of an ideal mathematical integer.
2) It should be possible to recover from errors - using `assert(false)` or
   a deliberate divide-by-zero to crash the program is bad design unless
   the condition that triggers it is never supposed to happen, ever.
3) Performance (with respect to both speed and memory use) should be as
   close as possible to that of the built-in machine integer types.
4) The API should minimize verbosity and ceremony, because otherwise hardly
   anyone will use it - people generally prefer convenience over safety.
5) Writing generic code that works correctly with both checked and unchecked
   types must be easy.
6) The API must make safe usage easy, and (accidental) unsafe usage hard,
   because people generally don't pay much attention to the docs (even if
   they're good). A false sense of security is worse than none at all.
7) The API must be usable in `nothrow @nogc` code.
8) The number of distinct template instantiations generated in natural use
   must be finite, to prevent excessive combinatorial explosion when
   checked types are used in public APIs. (Templates that are just aliases,
   and small functions that always inline don't count against this.)
> Also it is worrisome that one type wasn't enough (in spite of
> extensive parameterization with policies) and two are needed, with
> subtle differences between them that need to be summarized in a table.
The reason for the `SmartInt` versus `SafeInt` split is that with
D's current semantics (4) and (5) conflict. `SmartInt` prioritizes
(4); `SafeInt` and `DebugInt` prioritize (5).
> Getting to the design: the root of the problem is a byzantine
> design that is closed to extension.
The design was closed deliberately because of (8). Template bloat
is a major concern, even with the current finite design.
I want `checkedint` to be usable in public APIs, and that
requires some standardization of error handling and base types to
be enforced upon the users. Otherwise, everyone will choose
something different and all template instantiations involving
integer types will become practically single-use.
> Looking at the IntFlagPolicy, it offers three canned behavior:
> throws, asserts, and noex.
The choice of policies is motivated by the natural
incompatibility of (2), (4), (6), and (7). I built in enough
variety to allow people to choose their own priorities among
those goals, and no more because of (8).
> * One of the first things I looked for was establishing bounds
> for numbers, like Smart!(int, 0, 100) for percentage. For all
> its might, this package does not offer this basic facility, and
> from what I can tell does not allow users to enforce it via
> policies.
Here you are suggesting adding even more complexity to a design
that you have already de facto rejected as overly complex. As
discussed earlier in this very thread, I studied adding support
for arbitrary bounds and decided not to pursue that right now
because implementing it in a performant way would greatly
increase the complexity of `checkedint`, and make the template
bloat problem much worse.
> Also, this suggests that other types should be considered, how
> about Smart!bool and Smart!double?
Neither `bool` nor `double` has the kind of severe-but-fixable
safety and correctness issues that the machine integer types do,
which motivates the `checkedint` design. No doubt someone can
make up some sort of meaning to attach to those symbols, but it
will most likely have nothing to do with `SmartInt`.
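
As a reminder of the kind of integer issues meant here, the following is a minimal sketch using only the built-in types. The helper names `wrappingIncrement` and `mixedLess` are made up for illustration and are not part of `checkedint`.

```d
// Signed overflow wraps silently: no error is signalled.
int wrappingIncrement(int x) { return x + 1; }

// In a mixed signed/unsigned comparison the signed operand is
// converted to unsigned, so negative values compare as huge ones.
bool mixedLess(int a, uint b) { return a < b; }

void main()
{
    // int.max + 1 silently becomes int.min.
    assert(wrappingIncrement(int.max) == int.min);

    // -1 is converted to uint.max, so "-1 < 1" evaluates to false.
    assert(!mixedLess(-1, 1u));
}
```

Both results differ from what an ideal mathematical integer would give, and neither is reported in any way.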
> * Not sure why divPow2 is needed, why not some enforcedOp!"<<"
Because (although similar) a bit shift is actually semantically
different from dividing or multiplying by a power of two. The bit
shift implies different rules for rounding and overflow.
I realized in testing that even for `SmartInt`, the bit shift
semantics are still useful sometimes, and decided that it was
better not to confuse the two. A `smartOp` version of the bit
shifts is necessary because the built-in bit shifts have some
undefined behaviour that needs to be fixed.
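
The rounding difference can be seen with the built-in operators alone. This is a small sketch; the function names `shiftRightOne` and `divideByTwo` are hypothetical and are not the `checkedint` API.

```d
// Arithmetic right shift on a signed int rounds toward negative infinity.
int shiftRightOne(int x) { return x >> 1; }

// Integer division truncates toward zero.
int divideByTwo(int x) { return x / 2; }

void main()
{
    // For non-negative values the two agree.
    assert(shiftRightOne(7) == 3);
    assert(divideByTwo(7) == 3);

    // For negative odd values they do not.
    assert(shiftRightOne(-7) == -4);
    assert(divideByTwo(-7) == -3);
}
```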
> etc. Same about pow, why not enforcedOp!"^^"?
`pow()` exists as a free function to satisfy (5), and because the
`^^` and `^^=` operators both have language-level bugs that
currently make their use incompatible with (6).
> I see little value in free functions such as e.g. abs() because
> they are trivial one-liners. I understand the need for
> completeness, but it seems a good aspiration for consistency is
> being marred by a bunch of code pulp that really does nothing
> interesting. Probably not worth it.
`checkedint`-aware versions of functions like `abs()` are
necessary to satisfy (4) and (6) together.
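
One concrete case, as a sketch: `std.math.abs` applied to `int.min` returns `int.min`, because the negation overflows. The helper `checkedAbs` below is a hypothetical stand-in built on druntime's `core.checkedint.negs`; it is not the function from the proposal.

```d
import std.math : abs;
import core.checkedint : negs;

// Hypothetical helper: like abs(), but reports overflow through a flag.
int checkedAbs(int x, ref bool overflow)
{
    // negs sets `overflow` to true when x == int.min.
    return x < 0 ? negs(x, overflow) : x;
}

void main()
{
    // The plain version silently returns a negative "absolute value".
    assert(abs(int.min) == int.min);

    bool overflow = false;
    assert(checkedAbs(-5, overflow) == 5);
    assert(!overflow);

    checkedAbs(int.min, overflow);
    assert(overflow); // the problem is now visible to the caller
}
```

A trivial one-liner is therefore still a place where an unchecked result can escape unnoticed.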
> The hook may have state (e.g. hysteresis, NaN flag, error
> state, etc) so that's why it may be embedded within Checked.
The early iterations of `checkedint` worked this way (although I
had no plans to support user-defined hooks). I implemented and
debugged it, and thought it was about ready to submit many months
ago.
Then I actually tried *using* it, and hated it. The problem with
using a NaN state is that in nothrow code you have to manually
check it before *every* call to an external
(non-`checkedint`-aware) function, or you may accidentally lose
it.
As a result, safe NaN-based APIs violate (4), while concise APIs
violate (6). The IEEE-inspired sticky flags feature is my
solution to this problem, and it is far more pleasant to work
with in practice - as well as faster and more memory efficient.
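
The general idea of a sticky flag can be sketched with druntime's `core.checkedint`, whose `ref bool overflow` parameter is only ever set, never cleared. This is an analogy, not the `checkedint` flags API; `mulAdd` is a hypothetical helper.

```d
import core.checkedint : adds, muls;

// Hypothetical helper: computes a * b + c, accumulating any overflow
// from either step into a single caller-owned flag.
int mulAdd(int a, int b, int c, ref bool overflow) nothrow @nogc
{
    return adds(muls(a, b, overflow), c, overflow);
}

void main()
{
    bool overflow = false;

    // No overflow: the flag stays clear and the result is exact.
    assert(mulAdd(1000, 1000, 1, overflow) == 1_000_001);
    assert(!overflow);

    // The multiplication overflows; the later addition does not,
    // but the flag remains set because it is sticky.
    mulAdd(int.max, 2, 1, overflow);
    assert(overflow);
}
```

The values themselves stay plain machine integers, and the flag needs to be checked only once, at the end of the computation.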
Meanwhile, in code that allows exceptions, there is no reason to
pay the speed and memory penalty of carting around the NaN state
to check later - just throw on the spot whenever something goes
wrong.
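
A minimal sketch of the throwing style, again using `core.checkedint.adds` as a stand-in; `addOrThrow` is a hypothetical name and a plain `Exception` is used only for illustration.

```d
import core.checkedint : adds;
import std.exception : assertThrown;

// Hypothetical helper: the flag is a short-lived local and the error
// is raised immediately, so no state travels with the value.
int addOrThrow(int a, int b)
{
    bool overflow = false;
    immutable result = adds(a, b, overflow);
    if (overflow)
        throw new Exception("integer overflow in addition");
    return result;
}

void main()
{
    assert(addOrThrow(2, 3) == 5);
    assertThrown(addOrThrow(int.max, 1));
}
```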