boolean over multiple variables

Tue Jan 26 18:37:37 PST 2010

On Tue, Jan 26, 2010 at 6:16 PM, Nick Sabalausky <a at a.a> wrote:
>>"Bill Baxter" <wbaxter at gmail.com> wrote in message
>>news:mailman.34.1264542189.4461.digitalmars-d-learn at puremagic.com...
>>On Tue, Jan 26, 2010 at 1:21 PM, bearophile <bearophileHUGS at lycos.com>
>>wrote:
>>> Nick Sabalausky:
>>>> Aside from that being how Python does it, why do you see that as
>>>> preferable?
>>>
>>> Because:
>>> 1) linear searches in an array are damn common. I don't remember the
>>> results of my benchmarks, but until your integer arrays is quite longer
>>> than 30-50 items, performing a linear search is faster than a lookup in
>>> an AA, on DMD. On Tango this number is probably 70% higher
>>> 1b) In Python if you perform a "foo" in "barfoo" the language doesn't
>>> perform a linear search, it uses a much smarter search that has a
>>> complexity lower than the product of the two lengths, using a custom
>>> algorithm. So in D you can use the same syntax to search for
>>> substrings/subarrays. Where such smarter search is not possible, D can
>>> use a naive search.
>>> 2) It's really handy. I use isIn(item, items) to search on arrays in D,
>>> but having a item in items is nicer.
>>> 3) You can use the same syntax to search into anything that's lazily
>>> iterable too (a Range). This is very handy.
>>>
>>>
>>>> So having a single syntax work on the outputs for
>>>> regular arrays, but then on the inputs for AAs, seems highly
>>>> inconsistent
>>>> and error-prone to me.
>>>
>>> I have followed many Python newbies personally, I am following the Python
>>> newsgroups, and I have programmed for years in Python, and while I have
>>> seen many different kinds of bugs, I have not seen a significant amount
>>> of bugs in this. Python programmers just learn that dicts and lists are a
>>> little different in this regard. At the same way they learn that a set
>>> and a dict are different data structures, with different capabilities and
>>> usages.
>>
>>It's not even really  inconsistent if you just think about these data
>>structures in terms of function rather than form.
>>An array is often used as a simple set of things.  "O in Array" means
>>"is O in that set of things"
>>An AA is a set of things that also have some associated data.  "O in
>>AA" means "is O in that set of things" (not the ancillary data)
>>If you have an actual "set" data structure for containing a set of of
>>things, then "O in Set" means, again, "is O in that set of things".
>>(In fact the closest thing D has to a built-in set type is an AA with
>>"don't care" associated data, reinforcing the notion of AA as a set
>>plus extra data.)
>>
>>
>
> Even looking at function rather than form, I still think its innacurate to
> consider the keys to be the elements of an AA. In most uses of an AA, the
> key is primarily something convenient with which to look up data. They hold
> significance, but typically not as much as the data that is looked up with
> it. What you've described is very much like (and quite literally the same
> as, in the case of many dynamic languages) thinking of a variable's name as
> the actual data, and thinking of the value it holds merely as "ancillary
> data".
>
> Keep in mind too, even with a regular array, the index can still hold
> significance as data. For instace, I could have an array of Foo's, declare
> that any element with an odd index has property 'A' and any with an even
> index has property 'B', and treat them as such. May seem strange at a
> glance, but such things are common in low-level, low-resoruce and
> performance-oriented code. Bottom line, though, is that "Property 'A' or
> 'B'" is data that now been encoded in the array's index, but despite that,
> the indicies still aren't considered the array's elements. And the data they
> lookup still isn't considered "ancillary data".
>
> And yes, a Hashed Set may likely be *implemented* as an AA with just keys,
> but that's just form, it doesn't imply a similarity in regard to function.
> The *function* of a HashSet is that of an unordered array that's been
> optimized for "contains X / doesn't contain X" on large data sets.

> Containers and their functions:
> - AA: Store A's with label B, A is fairly important, B may or may not be.
> - Array: Store A's with label B, A is fairly important, B may or may not be,
> B has certain restrictions, container overall has different performance
> characteristacs from an AA.
> - Hashed Set: Store A's.

All I am trying to say is that there are multiple ways of looking at
the functionality offered by different containers.
And there exists a way of looking at it where the Python-style 'in'
operator can be seen as behaving consistently.

You asserted it is inconsistent.  I'm just saying it's only
inconsistent if you insist that one particular way of looking at the
containers is the "right" way.

--bb