between and among: worth Phobosization?

Tue Dec 17 17:27:35 PST 2013

On Tuesday, 17 December 2013 at 22:14:41 UTC, H. S. Teoh wrote:
> Ah, I see what you're getting at now. I think this idea has a 
> merit on
> its own; I'm just not sure if it is useful as an actual 
> intermediate
> data *type*.

The advantages of a type over a plain function would be:

1. Contain all of the complexity of working with intervals on 
(in this case) integers. It's been shown enough times that the 
straightforward way of dealing with this is error-prone.
2. Maintain performance characteristics as much as possible. 
Without an object, a function doing this sort of thing would have 
to revalidate the bounds on each call or, worse, NOT validate the 
bounds at all. (With "in" contracts we can validate each time, 
because release builds strip the contracts, but it's still 
potentially an issue.) With an object we can cache whatever 
validations and/or assertions are needed and, potentially, improve 
performance in some cases.
3. Allow existing functions to specialize when an interval is 
given, when appropriate.
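
To make points 1 and 2 concrete, here is a minimal sketch of what I mean by validating once. The names Interval and interval are placeholders of my own, not an existing Phobos API, and it only handles a single element type:

```d
import std.exception : enforce;

/// Hypothetical closed interval [lo, hi]; illustrative only.
struct Interval(T)
{
    private T lo, hi;

    this(T lo, T hi)
    {
        // Validate once, at construction, instead of on every query.
        enforce(lo <= hi, "lower bound must not exceed upper bound");
        this.lo = lo;
        this.hi = hi;
    }

    /// Supports `x in someInterval` (both ends inclusive).
    bool opBinaryRight(string op)(T x) const if (op == "in")
    {
        return lo <= x && x <= hi;
    }
}

/// Hypothetical helper so callers can write interval(1, 4).
auto interval(T)(T lo, T hi)
{
    return Interval!T(lo, hi);
}

unittest
{
    auto i = interval(1, 4);
    assert(2 in i);
    assert(5 !in i);
}
```

Every membership test after construction can then trust that lo <= hi without rechecking it.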

> But, putting that aside, I think the concept does serve its 
> purpose.
> It's a pity that the word 'range' already has an assigned 
> meaning in D,
> because otherwise that would be the best name in this case 
> (i.e., range
> in the mathematical sense of being a contiguous subset of, say, 
> the
> number line). So, for the lack of a better name, let's 
> tentatively call
> it "Bounds" (as in, the set of elements bounded by upper and 
> lower
> bounds), which may be defined, at least conceptually, as:

Just to step up your idea to something a bit closer to complete 
(still have not thrown it into a compiler or anything yet):
http://www.dpaste.dzfl.pl/19c80ff9

(And I really like the idea of a CtInterval, but I haven't done 
anything with it yet, so I've left it out of the above paste.)

>> It'd also be needed for it to have a simple way to get the 
>> smallest
>> acceptable type for the range of values the "between" object 
>> could
>> represent. So for a Between!(uint, int) that would be a 
>> uint, and a
>> Between!(int, uint) that would be a long, and so on. Obviously 
>> some
>> things _don't_ have acceptable types, such as a Between!(long, 
>> ulong)
>> (no integral type currently can actually hold all of those 
>> values).
>
> There's nothing wrong with Bounds!(long,ulong); it just won't 
> have an
> opApply method, that's all. :) It can be conveniently 
> static-if'd out in
> that case. It can still represent number ranges beyond the 
> current range
> of built-in types, like [long.min, ulong.max], and you can test 
> for
> membership with various types. This allows you to test 
> variables of
> different types, like ints and uints, so the ability to 
> represent such a
> range is still useful.

Well, I'm not suggesting that the interval not be allowed... but 
things that use that interval may need to produce some sort of 
output from it. If so, they'll need to know what data type that 
output should be. It'd be convenient if some standard function 
existed to work that type out in a standard way.

The example I have in mind: if std.random.uniform took an 
interval, what would its output type be? Obviously, it couldn't 
output something in [long.min, ulong.max], but it could return an 
answer in [byte.min, ubyte.max], since a short can hold all of 
those values.
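
A rough sketch of how such a "smallest acceptable type" helper might look. SmallestHolder and HolderOf are hypothetical names, and the candidate order (unsigned before signed at each width) is my own assumption, chosen so the results match the examples above:

```d
/// Hypothetical: first candidate type that can hold everything from
/// L.min up to U.max, or void if no candidate can.
template SmallestHolder(L, U, Candidates...)
{
    static if (Candidates.length == 0)
        alias SmallestHolder = void; // e.g. [long.min, ulong.max]
    else
    {
        alias C = Candidates[0];
        // .min is never positive and .max is never negative, so casting
        // to long / ulong respectively compares them without sign surprises.
        static if (cast(long) L.min >= cast(long) C.min
                && cast(ulong) U.max <= cast(ulong) C.max)
            alias SmallestHolder = C;
        else
            alias SmallestHolder = SmallestHolder!(L, U, Candidates[1 .. $]);
    }
}

/// Tries the built-in integral types from narrowest to widest.
alias HolderOf(L, U) =
    SmallestHolder!(L, U, ubyte, byte, ushort, short, uint, int, ulong, long);

static assert(is(HolderOf!(byte, ubyte) == short));
static assert(is(HolderOf!(uint, int) == uint));
static assert(is(HolderOf!(int, uint) == long));
static assert(is(HolderOf!(long, ulong) == void));
```

A function like uniform could then static-if on the void case and refuse intervals it has no return type for.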

>> Something like this, like I showed, could be used to pass to 
>> other
>> functions like std.random.uniform which request a range to 
>> generate.
>> Or you should be able to pass it to something like 
>> std.algorithm.find,
>> std.algorithm.count, etc (predicates that take one parameter).
>
> While you *could* implement the input range API for the Bounds 
> struct
> for this purpose, it's probably better to define special 
> overloads for
> find and count, since you really don't want to waste time 
> iterating over
> elements instead of just directly computing the narrowed Bounds 
> struct
> or subtracting the lower bound from the upper, respectively. For
> example:

Sorry, that's confusion from my using the word "range" again. When 
I said range, I meant bounds/interval in this case. Functions that 
request some sort of interval or bounds should take an interval 
instead of trying to do anything on their own (since "do your 
own thing" is increasingly being found to be error-prone).

So, something like this should work:

     unittest
     {
         import std.algorithm;
         assert(
             find!"a in b"([5, 6, 2, 9], interval(1, 4))
                 == [2, 9]);
         // uses std.algorithm.find

         assert(
             count!"a in b"([5, 6, 1, 3, 9, 7, 2], interval(1,3))
                 == 3);
         // uses std.algorithm.count

         import std.random;
         foreach(_; 0..10000)
             assert(uniform(interval(1,5)) in interval(1,5));
         // Nice assertion, right?
     }

It might also be useful in some circumstances to be able to know 
how many values are in the interval (sort of like a "length" or 
"size") but if you have an interval of [long.min, ulong.max] ... 
well, you know the problem.
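
To illustrate the problem with a simpler case than [long.min, ulong.max]: even with both bounds as long, the count of values can exceed what a ulong holds. A small sketch, where countValues is a made-up name and the function assumes lo <= hi has already been checked:

```d
/// Hypothetical: number of integers in [lo, hi], assuming lo <= hi.
/// Sets `overflow` when the true count (2^64) does not fit in a ulong;
/// in that case the returned value has wrapped around to 0.
ulong countValues(long lo, long hi, out bool overflow)
{
    // Unsigned subtraction wraps modulo 2^64, which yields the exact
    // distance hi - lo whenever lo <= hi.
    immutable ulong width = cast(ulong) hi - cast(ulong) lo;
    overflow = (width == ulong.max); // width + 1 would wrap
    return width + 1;
}

unittest
{
    bool o;
    assert(countValues(1, 5, o) == 5 && !o);
    assert(countValues(-2, 2, o) == 5 && !o);
    countValues(long.min, long.max, o);
    assert(o); // 2^64 values: one more than ulong.max
}
```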

Considering what Andrei said, we might be able to expand this 
concept to support interval arithmetic. We'd also need to be able 
to support intervals like (-oo, oo), (-oo, x], [x, oo) ... where 
the membership test for a value v returns true, v <= x, and 
v >= x respectively (while taking care of the issues that exist 
with signed/unsigned comparisons, obviously). That said, not all 
functions will want to handle those types of intervals 
(std.random.uniform, for instance).
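
A sketch of how the unbounded cases and the signed/unsigned issue might fit together. Both safeLE and Unbounded are hypothetical names, and using bool flags for the missing ends is just one possible representation:

```d
import std.traits : isIntegral, isSigned, isUnsigned;

/// Hypothetical sign-safe `a <= b` for built-in integral types. It avoids
/// the implicit signed-to-unsigned conversion of a plain mixed comparison.
bool safeLE(A, B)(A a, B b) if (isIntegral!A && isIntegral!B)
{
    static if (isSigned!A && isUnsigned!B)
        return a < 0 || cast(ulong) a <= b;
    else static if (isUnsigned!A && isSigned!B)
        return b >= 0 && a <= cast(ulong) b;
    else
        return a <= b;
}

/// Hypothetical interval whose ends may be missing:
/// hasLo == false means -oo, hasHi == false means +oo.
struct Unbounded(T)
{
    T lo, hi;
    bool hasLo, hasHi;

    /// Supports `v in someInterval` for any integral v.
    bool opBinaryRight(string op, X)(X v) const
        if (op == "in" && isIntegral!X)
    {
        return (!hasLo || safeLE(lo, v)) && (!hasHi || safeLE(v, hi));
    }
}

unittest
{
    auto everything = Unbounded!int(0, 0, false, false); // (-oo, oo)
    auto atMost5    = Unbounded!int(0, 5, false, true);  // (-oo, 5]
    auto atLeast5   = Unbounded!int(5, 0, true, false);  // [5, oo)

    assert(ulong.max in everything);
    assert(-100 in atMost5 && 6u !in atMost5);
    assert(uint.max in atLeast5 && -1 !in atLeast5);
}
```

A function like std.random.uniform could reject such a type at compile time, or check the flags at run time, since it has no sensible answer for an infinite end.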