[Issue 1323] Implement opIn_r for arrays

Mon Jul 9 12:04:46 PDT 2007

http://d.puremagic.com/issues/show_bug.cgi?id=1323

------- Comment #4 from wbaxter at gmail.com  2007-07-09 14:04 -------
(In reply to comment #1)
> It's been said before, no harm in repeating it here.
> 
> For associative arrays,
> 
>         X in A
> 
> means that there exists a Y for which
> 
>         A[X] == Y
> 
> According to the semantics that get proposed every time this is brought up, for
> linear arrays, it would mean the reverse: there exists a Y for which
> 
>         A[Y] == X
> 
> This inconsistency might be undesirable from a generic programming POV.

I can't see how.  Arrays and associative arrays are different beasts.  You
can't really generically interchange the two, regardless of what the behavior
of 'in' is, because of other semantic differences.  And note that currently
they certainly behave very differently regarding how 'in' works, and there
hasn't been a meltdown because of the lack of ability to generically
interchange arrays and associative arrays on account of that.  
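
To make the current difference concrete, here's a rough sketch of what each
container gives you today.  The helper names hasKey and isValidIndex are made
up for illustration; they aren't part of any library.

```d
// `in` on an associative array tests for a key; it yields a pointer to the
// stored value, or null when the key is absent.
bool hasKey(int[string] aa, string key) {
    return (key in aa) !is null;
}

// Linear arrays have no built-in `in`; the closest "key" test is a
// hand-written bounds check on the index.
bool isValidIndex(int[] a, size_t k) {
    return k < a.length;
}
```

Neither helper can stand in for the other, which is the point: generic code
already has to know which kind of container it was handed.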

Python, where everything is dynamic and uses 'duck typing', really has
more to lose from loss of genericity than D.  In Python you're effectively
*always* writing generic template code.  And yet in Python "foo in alist"
checks if foo is in the alist, and "foo in aa" checks if foo is a key in aa (a
'dict' in Python).  And there has been no outcry over this as far as I know. 
Probably because you usually know before you start coding whether you need a
dict or an array.  It's pretty obvious that if I want to look up people by
their employee IDs, the ID shouldn't be the index into a regular array; an
associative array would be better.

> Besides, if we're going to have an operator to check whether a specified
> _value_ is in an array, shouldn't it be available for AAs as well?

Yes, it would be spelled "foo in AA.values".

Ok, that wouldn't be so efficient since it builds a big temporary array, but
that's a problem with AA.values.  AAs need some keys() and values()-type
methods that return lightweight iterators.  But the iterators situation in D is
a whole 'nother mess.
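
In the meantime, a value search that skips the temporary can be sketched with
a plain foreach, since a one-variable foreach over an AA walks its values
directly.  The name hasValue is hypothetical, and this is only a sketch, not a
proposal for how a built-in should work:

```d
// Hypothetical helper: true if some key in `aa` maps to `needle`.
// A one-variable foreach over an associative array visits the values
// in place, so no temporary array is built.
bool hasValue(K, V)(V[K] aa, V needle) {
    foreach (v; aa) {
        if (v == needle) return true;
    }
    return false;
}
```

It is still a linear scan over every entry, so it only fixes the allocation,
not the cost of the search itself.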

Arguing from another direction, there's only one 'opIn_r', and that's not
likely to change, so which version of opIn_r is going to save you more work if
implemented for arrays: one that does (0<=k && k<A.length), or one that does
the following?

  bool found = false;
  foreach (tmp; A) { if (tmp==k) { found=true; break; } }

The former is already an expression, and so can be used in an if condition, for
example; the latter is a) more typing and b) not an expression.  If you want to
use it as an expression you need to do more typing to wrap it in a function.  

I'd say the former is already convenient enough (and you can even get away with
just k<A.length if k is unsigned).  And really the actual implementation of the
latter should probably be a little more complex to dance around the ticking
time-bomb behavior of == for classes.  Something like:

bool contains(T)(T[] A, T k) {
  bool found = false;
  foreach (v; A) {
    static if (is(T==class)) {
       if (v is k || (k !is null && k==v)) { found=true; break; }
    }
    else {
       if (v==k) { found=true; break; }
    }
  }
  return found;
}

And maybe I've missed something there too.  Like maybe the v should be 'ref'
for better efficiency when it's a big struct type?  Probably it shouldn't be
ref for class types though, so maybe I need another static if there.  Anyway,
that's part of why I want it taken care of for me in the language.  I just want
to type 'x in some_array' and let the system find it for me in the most
efficient way possible (which may include some sort of fancy parallel scanning
in the future).
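
For what it's worth, the 'ref' variant I'm wondering about might look roughly
like the sketch below.  The name containsRef is made up, and I'm only assuming
that avoiding the per-element copy helps for big structs; I haven't measured
it.

```d
// Hypothetical variant of contains: class references are compared by
// identity first, then by ==; everything else is iterated by reference
// so large structs are not copied into the loop variable.
bool containsRef(T)(T[] arr, T k) {
    static if (is(T == class)) {
        foreach (v; arr) {
            if (v is k || (k !is null && k == v)) return true;
        }
    } else {
        foreach (ref v; arr) {
            if (v == k) return true;
        }
    }
    return false;
}
```

Having to pick between those two loops by hand is exactly the kind of detail
I'd rather the language decided for me.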

--