Why Strings as Classes?

Dee Girl deegirl at noreply.com
Thu Aug 28 12:40:51 PDT 2008


Nick Sabalausky Wrote:

> "Steven Schveighoffer" <schveiguy at yahoo.com> wrote in message 
> news:g9650h$cp9$1 at digitalmars.com...
> > "Nick Sabalausky" wrote
> >> "Steven Schveighoffer" wrote in message 
> >> news:g95b89$1jii$1 at digitalmars.com...
> >>>
> >>> When someone sees an index operator the first thought is that it is a 
> >>> quick lookup.
> >>
> >> This seems to be a big part of the disagreement. Personally, I think it's 
> >> insane to look at [] and just assume it's a cheap lookup. That's was true 
> >> in pure C where, as someone else said, it was nothing more than a 
> >> shorthand for a specific pointer arithmentic operation, but carrying that 
> >> idea over to more modern languages (especially one that also uses [] for 
> >> associative arrays) is a big mistake.
> >
> > Perhaps it is a mistake to assume it, but it is a common mistake. And the 
> > expectation is intuitive.
> 
> Why in the world would any halfway competent programmer ever look at a 
> *linked list* and assume that the linked lists's [] (if implemented) is 
> O(1)?

A programmer can look at generic code. Generic code does not know it is a linked list or something else. I think this really helps thinking in generic code. Because generic code that makes problem interesting.

> As for the risk that could create of accidentially sending a linked list to 
> a "search" (ie, a "search for an element which contains data X") that uses 
> [] internally instead of iterators (but then, why wouldn't it just use 
> iterators anyway?): I'll agree that in a case like this there should be some 
> mechanism for automatic choosing of an algorithm, but that mechanism should 
> be at a separate level of abstraction. There would be a function "search" 
> that, through either RTTI or template constraints or something else, says 
> "does collection 'c' implement ConstantTimeForewardDirectionIndexing?" or 
> better yet IMO "does the collection have attribute 
> ForewardDirectionIndexingComplexity that is set equal to 
> Complexity.Constant?", and based on that passes control to either 
> IndexingSearch or IteratorSearch.

I think this is extreme complicate design. What is advantage of this design from STL?

> >  You don't see people making light switches that look like outlets, even 
> > though it's possible.  You might perhaps make a library where opIndex is a 
> > linear search in your list, but I would expect that people would not use 
> > that indexing feature correctly.  Just as if I plug my lamp into the light 
> > switch that looks like an outlet, I'd expect it to get power, and be 
> > confused when it doesn't.  Except the opIndex mistake is more subtle 
> > because I *do* get what I actually want, but I just am not realizing the 
> > cost of it.
> >
> >> The '+' operator means "add". Addition is typically O(1). But vectors can 
> >> be added, and that's an O(n) operation. Should opAdd never be used for 
> >> vectors?
> >
> > As long as it's addition, I have no problem with O(n) operation (and it 
> > depends on your view of n).  Indexing implies speed, look at all other 
> > cases where indexing is used.  For addition to be proportional to the size 
> > of the element is natural and expected.
> >
> 
> When people look at '+' they typically think "integer/float addition". Why 
> would, for example, the risk of mistaking an O(n) "big int" addition for an 
> O(1) integer/float addition be any worse than the risk of mistaking an O(n) 
> linked list "get element at index" for an O(1) array "get element at index"?

This is again wrong for two reasons. I am sorry! One small thing is I think big int "+" is O(log n) not O(n). But real problem is people look at a[] = b[] + c[] and see operands. It is evident from operands that cost is proportional to input size. If it is any shorter it would be miracle! Because it means some elements are not even seen. You compare wrong situations. I mean different situation. And operator or function is not important. I said if array access is index(a, n) and everybody always thinks so then index(a, n) should not do linear search.

> >>> You can force yourself to think differently, but the reality is that 
> >>> most people think that because of the universal usage of square brackets 
> >>> (except for VB, and I feel pity for anyone who needs to use VB) to mean 
> >>> 'lookup by key', and usually this is only useful on objects where the 
> >>> lookup is quick ( < O(n) ).  Although there is no requirement, nor 
> >>> enforcement, the 'quick' contract is expected by the user, no matter how 
> >>> much docs you throw at them.
> >>>
> >>> Look, for instance, at Tango's now-deprecated LinkMap, which uses a 
> >>> linked-list of key/value pairs (copied from Doug Lea's implementation). 
> >>> Nobody in their right mind would use link map because lookups are O(n), 
> >>> and it's just as easy to use a TreeMap or HashMap.  Would you ever use 
> >>> it?
> >>>
> >>>> If you *really* need that sort of guarantee (and I can imagine it may 
> >>>> be useful in some cases), then the implementation of the guarantee does 
> >>>> *not* belong in the realm of "implements vs doesn't-implement a 
> >>>> particular operator overload". Doing so is an abuse of operator 
> >>>> overloading, since operator overloading is there for defining syntactic 
> >>>> sugar, not for acting as a makeshift contract.
> >>>
> >>> I don't think anybody is asking for a guarantee from the compiler or any 
> >>> specific tool.  I think what we are saying is that violating the 
> >>> 'opIndex is fast' notion is bad design because you end up with users 
> >>> thinking they are doing something that's quick.  You end up with people 
> >>> posting benchmarks on your containers saying 'why does python beat the 
> >>> pants off your list implementation?'.  You can say 'hey, it's not meant 
> >>> to be used that way', but then why can the user use it that way?  A 
> >>> better design is to nudge the user into using a correct container for 
> >>> the job by only supporting operations that make sense on the 
> >>> collections.
> >>>
> >>> And as far as operator semantic meaning, D's operators are purposely 
> >>> named after what they are supposed to do.  Notice that the operator for 
> >>> + is opAdd, not opPlus.  This is because opAdd is supposed to mean you 
> >>> are performing an addition operation.  Assigning a different semantic 
> >>> meaning is not disallowed by the compiler, but is considered bad design. 
> >>> opIndex is supposed to be an index function, not a linear search.  It's 
> >>> not called opSearch for a reason.  Sure you can redefine it however you 
> >>> want semantically, but it's considered bad design.  That's all we're 
> >>> saying.
> >>>
> >>
> >> Nobody is suggesting using [] to invoke a search (Although we have talked 
> >> about using [] *in the search function's implementation*). Search means 
> >> you want to get the position of a given element, or in other words, 
> >> "element" -> search -> "key/index". What we're talking about is the 
> >> reverse: getting the element at a given position, ie, "key/index" -> 
> >> [] -> "element". It doesn't matter if it's an array, a linked list, or a 
> >> super-duper-collection-from-2092: that's still indexing, not searching.
> >
> > It is a search, you are searching for the element whose key matches.  When 
> > the key can be used to lookup the element efficiently, we call that 
> > indexing.  Indexing is a more constrained form of searching.
> >
> 
> If you've got a linked list, and you want to get element N, are you *really* 
> going to go reaching for a function named "search"?

Yes. This is exactly STL is doing. And there is no wrong with it. Again I think STL book by Josuttis very helpful. Also Stepanov notes online very interesting! Thank you Don.

> How often do you really 
> see a generic function named "search" or "find" that takes a numeric index 
> as a the "to be found" parameter instead of something to be matched against 
> the element's value? I would argue that that would be confusing for most 
> people.

I think you lose argueing. There is experience in STL that it is not confusing. STL is most success library for C++ even if C++ is now old and has problems. 



More information about the Digitalmars-d mailing list