between and among: worth Phobosization?

Tue Dec 17 14:13:11 PST 2013

On Tue, Dec 17, 2013 at 10:29:22PM +0100, Chris Cain wrote:
> On Tuesday, 17 December 2013 at 18:06:27 UTC, Andrei Alexandrescu
> wrote:
> >That's problematic. "Between" is a preposition. Naming a function
> >as a preposition is fine as long as a verb is implied (e.g. "a in
> >b" really means "a _is_ in b", or "a.between(b, c)" really means
> >"a _is_ between b and c" etc).
> 
> I see... interesting. But this doesn't suggest that the concept is
> bad, just the name.
> 
> >"x in between(2, 7)" is cute but just that - it's a lucky strike
> >that relies on word ordering in a particular phrase and is
> >unlikely to work in many other places.
> 
> I agree it's cute and just lucky. I'm not attached to the name, but
> I'd like some sort of name that evokes the purpose like that does
> (as an example of something I wouldn't like reading, `x in iota(2,
> 7)` ...)
> 
> >Reifying "between" to the status of object is weird. One
> >constructs a "between" object and then what are its primitives?
> >How can one even talk about it? "Yeah I have a between here and I
> >copy it to another between"...
> 
> To be honest, I'm just the kind of person to come up with very weird
> ideas, so it's not surprising people might find my idea weird. It
> doesn't necessarily have to be called "between" but some sort of
> object (being able to contain the state is important) could contain
> the concept of a range of values ("inclusive lowerbound", "exclusive
> upperbound", support things like "opIn" or an opCall to test a value
> for membership).

Ah, I see what you're getting at now. I think this idea has a merit on
its own; I'm just not sure if it is useful as an actual intermediate
data *type*.

But, putting that aside, I think the concept does serve its purpose.
It's a pity that the word 'range' already has an assigned meaning in D,
because otherwise that would be the best name in this case (i.e., range
in the mathematical sense of being a contiguous subset of, say, the
number line). So, for the lack of a better name, let's tentatively call
it "Bounds" (as in, the set of elements bounded by upper and lower
bounds), which may be defined, at least conceptually, as:

	struct Bounds(string shape="[]", T,U)
		if (is(T.init < U.init))
	{
		T lower;
		U upper;

		this(T lowerBound, U upperBound)
		{
			lower = lowerBound;
			upper = upperBound;
		}

		bool opBinaryRight(string op)(V val)
			if (op == "in" &&
				is(T.init < V.init) &&
				is(V.init < U.init))
		{
			static if (shape[0] == '[')
				static if (shape[1] == ']')
					return lower <= val <= upper;
				else static if (shape[1] == ')')
					return lower <= val < upper;
				else static assert(0);
			else static if (shape[0] == '(')
				static if (shape[1] == ']')
					return lower < val <= upper;
				else static if (shape[1] == ')')
					return lower < val < upper;
				else static assert(0);
			else static assert(0);
		}
	}

	// And we might as well throw in a convenience function for
	// constructing these things:
	auto bounds(string shape="[]", T,U)(T lower, U upper)
	{
		return Bounds!(shape,T,U)(lower, upper);
	}

	// So here's how you'd use it:
	unittest
	{
		int x = 123;
		assert(x in bounds(122, 124));

		// Mixed types are possible
		ulong y = ulong.max;
		float z = -z.max;
		assert(100 in bounds(y, z));
	}

	// If we want maximum efficiency, and the bounds are statically
	// known, we could also introduce a compile-time variant of
	// Bounds:
	struct CtBounds(T lower, U upper, string shape="[]", T, U)
		if (is(t < u))
	{
		// This struct cannot be instantiated at runtime.
		@disabled this();

		bool opBinaryRight(string op, V)(V val) if (op == "in")
		{
			... // same as above, except that now .lower and
			    // .upper are known at compile-time.
		}
	}

	unittest
	{
		// Now we can check bounds at compile time.
		static assert(1 in CtBounds!(0.0f, 2u));
	}

Of course, we can add opApply to these structs, then you'd be able to
write:

	foreach (x; bounds(0, 100)) { ... }

> It'd also be needed for it to have a simple way to get the smallest
> acceptable type for the range of values the "between" object could
> represent. So a for a Between!(uint, int) that would be a uint, and a
> Between!(int, uint) that would be a long, and so on. Obviously some
> things _don't_ have acceptable types, such as a Between!(long, ulong)
> (no integral type currently can actually hold all of those values).

There's nothing wrong with Bounds!(long,ulong); it just won't have an
opApply method, that's all. :) It can be conveniently static-if'd out in
that case. It can still represent number ranges beyond the current range
of built-in types, like [long.min, ulong.max], and you can test for
membership with various types. This allows you to test variables of
different types, like ints and uints, so the ability to represent such a
range is still useful.

> Something like this, like I showed, could be used to pass to other
> functions like std.random.uniform which request a range to generate.
> Or you should be able to pass it to something like std.algorithm.find,
> std.algorithm.count, etc (predicates that take one parameter).

While you *could* implement the input range API for the Bounds struct
for this purpose, it's probably better to define special overloads for
find and count, since you really don't want to waste time iterating over
elements instead of just directly computing the narrowed Bounds struct
or subtracting the lower bound from the upper, respectively. For
example:

	auto find(T, U, V)(Bounds!(T,U) bounds, V val)
	{
		if (val in bounds)
			return Bounds!(T,U)(val, bounds.upper);
		else
			return Bounds!(T,U)(0, -1); // empty bounds
	}

	size_t count(T, U)(Bounds!(T,U) bounds)
	{
		return bounds.upper - bounds.lower;
	}

I.e., O(1) complexity instead of O(upper-lower).

T

-- 
Question authority. Don't ask why, just do it.