Why Strings as Classes?

Thu Aug 28 12:19:20 PDT 2008

"Steven Schveighoffer" <schveiguy at yahoo.com> wrote in message 
news:g9650h$cp9$1 at digitalmars.com...
> "Nick Sabalausky" wrote
>> "Steven Schveighoffer" wrote in message 
>> news:g95b89$1jii$1 at digitalmars.com...
>>>
>>> When someone sees an index operator the first thought is that it is a 
>>> quick lookup.
>>
>> This seems to be a big part of the disagreement. Personally, I think it's 
>> insane to look at [] and just assume it's a cheap lookup. That's was true 
>> in pure C where, as someone else said, it was nothing more than a 
>> shorthand for a specific pointer arithmentic operation, but carrying that 
>> idea over to more modern languages (especially one that also uses [] for 
>> associative arrays) is a big mistake.
>
> Perhaps it is a mistake to assume it, but it is a common mistake. And the 
> expectation is intuitive.

Why in the world would any halfway competent programmer ever look at a 
*linked list* and assume that the linked lists's [] (if implemented) is 
O(1)?

As for the risk that could create of accidentially sending a linked list to 
a "search" (ie, a "search for an element which contains data X") that uses 
[] internally instead of iterators (but then, why wouldn't it just use 
iterators anyway?): I'll agree that in a case like this there should be some 
mechanism for automatic choosing of an algorithm, but that mechanism should 
be at a separate level of abstraction. There would be a function "search" 
that, through either RTTI or template constraints or something else, says 
"does collection 'c' implement ConstantTimeForewardDirectionIndexing?" or 
better yet IMO "does the collection have attribute 
ForewardDirectionIndexingComplexity that is set equal to 
Complexity.Constant?", and based on that passes control to either 
IndexingSearch or IteratorSearch.

>  You don't see people making light switches that look like outlets, even 
> though it's possible.  You might perhaps make a library where opIndex is a 
> linear search in your list, but I would expect that people would not use 
> that indexing feature correctly.  Just as if I plug my lamp into the light 
> switch that looks like an outlet, I'd expect it to get power, and be 
> confused when it doesn't.  Except the opIndex mistake is more subtle 
> because I *do* get what I actually want, but I just am not realizing the 
> cost of it.
>
>> The '+' operator means "add". Addition is typically O(1). But vectors can 
>> be added, and that's an O(n) operation. Should opAdd never be used for 
>> vectors?
>
> As long as it's addition, I have no problem with O(n) operation (and it 
> depends on your view of n).  Indexing implies speed, look at all other 
> cases where indexing is used.  For addition to be proportional to the size 
> of the element is natural and expected.
>

When people look at '+' they typically think "integer/float addition". Why 
would, for example, the risk of mistaking an O(n) "big int" addition for an 
O(1) integer/float addition be any worse than the risk of mistaking an O(n) 
linked list "get element at index" for an O(1) array "get element at index"?

>>> You can force yourself to think differently, but the reality is that 
>>> most people think that because of the universal usage of square brackets 
>>> (except for VB, and I feel pity for anyone who needs to use VB) to mean 
>>> 'lookup by key', and usually this is only useful on objects where the 
>>> lookup is quick ( < O(n) ).  Although there is no requirement, nor 
>>> enforcement, the 'quick' contract is expected by the user, no matter how 
>>> much docs you throw at them.
>>>
>>> Look, for instance, at Tango's now-deprecated LinkMap, which uses a 
>>> linked-list of key/value pairs (copied from Doug Lea's implementation). 
>>> Nobody in their right mind would use link map because lookups are O(n), 
>>> and it's just as easy to use a TreeMap or HashMap.  Would you ever use 
>>> it?
>>>
>>>> If you *really* need that sort of guarantee (and I can imagine it may 
>>>> be useful in some cases), then the implementation of the guarantee does 
>>>> *not* belong in the realm of "implements vs doesn't-implement a 
>>>> particular operator overload". Doing so is an abuse of operator 
>>>> overloading, since operator overloading is there for defining syntactic 
>>>> sugar, not for acting as a makeshift contract.
>>>
>>> I don't think anybody is asking for a guarantee from the compiler or any 
>>> specific tool.  I think what we are saying is that violating the 
>>> 'opIndex is fast' notion is bad design because you end up with users 
>>> thinking they are doing something that's quick.  You end up with people 
>>> posting benchmarks on your containers saying 'why does python beat the 
>>> pants off your list implementation?'.  You can say 'hey, it's not meant 
>>> to be used that way', but then why can the user use it that way?  A 
>>> better design is to nudge the user into using a correct container for 
>>> the job by only supporting operations that make sense on the 
>>> collections.
>>>
>>> And as far as operator semantic meaning, D's operators are purposely 
>>> named after what they are supposed to do.  Notice that the operator for 
>>> + is opAdd, not opPlus.  This is because opAdd is supposed to mean you 
>>> are performing an addition operation.  Assigning a different semantic 
>>> meaning is not disallowed by the compiler, but is considered bad design. 
>>> opIndex is supposed to be an index function, not a linear search.  It's 
>>> not called opSearch for a reason.  Sure you can redefine it however you 
>>> want semantically, but it's considered bad design.  That's all we're 
>>> saying.
>>>
>>
>> Nobody is suggesting using [] to invoke a search (Although we have talked 
>> about using [] *in the search function's implementation*). Search means 
>> you want to get the position of a given element, or in other words, 
>> "element" -> search -> "key/index". What we're talking about is the 
>> reverse: getting the element at a given position, ie, "key/index" -> 
>> [] -> "element". It doesn't matter if it's an array, a linked list, or a 
>> super-duper-collection-from-2092: that's still indexing, not searching.
>
> It is a search, you are searching for the element whose key matches.  When 
> the key can be used to lookup the element efficiently, we call that 
> indexing.  Indexing is a more constrained form of searching.
>

If you've got a linked list, and you want to get element N, are you *really* 
going to go reaching for a function named "search"? How often do you really 
see a generic function named "search" or "find" that takes a numeric index 
as a the "to be found" parameter instead of something to be matched against 
the element's value? I would argue that that would be confusing for most 
people. Like I said in a different post farther down, the implementation of 
a "getAtIndex()" is obviously going to work like a search, but from "outside 
the box", what you're asking for is not the same.