Why Strings as Classes?

Dee Girl deegirl at noreply.com
Wed Aug 27 09:15:21 PDT 2008


Michiel Helvensteijn Wrote:

> Dee Girl wrote:
> 
> >> What if you maintain a linked list of small arrays? Say each node in the
> >> list contains around log(n) of the elements in the entire list. Wouldn't
> >> that bring random access down to O(log n)? Of course, this would also
> >> bring insertions up to O(log n).
> >> 
> >> And what if you save an index/pointer pair after each access? Then with
> >> each new access request, you can choose from three locations to start
> >> walking:
> >> * The start of the list.
> >> * The end of the list.
> >> * The last access-point of the list.
> >> 
> >> In a lot of practical cases a new access is close to the last access. Of
> >> course, the general case would still be O(n).
> > 
> > Michiel-san, this is new data structure very different from list! If I
> > want list I never use this structure. It is like joking. Because when you
> > write this you agree that list is not vector.
> 
> Yes, the first 'trick' makes it a different data-structure. The second does
> not. Would you still be opposed to using opIndex if its time-complexity is
> O(log n)?

That is a different question. And tricks are not an answer to the problem. The problem is that a list has a different natural access method than an array.
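
To make the point concrete, here is a minimal D sketch. Node, List and nth are only illustrative names, not a real library; the point is the cost of each structure's natural access primitive:

struct Node(T) { T value; Node!T* next; }

struct List(T)
{
    Node!T* head;

    // The list's natural primitive: walk forward, one node at a time.
    // Reaching element i costs O(i) steps.
    T nth(size_t i)
    {
        auto p = head;
        while (i--) p = p.next;
        return p.value;
    }
}

void main()
{
    auto list = List!int(new Node!int(1, new Node!int(2, null)));
    assert(list.nth(1) == 2);
}

An array answers the same question with a[i], a single address computation in O(1). The two operations deserve different names because they make very different promises.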

> >> That's simple. a[i] looks much nicer than a.nth(i).
> > 
> > It is not nicer. It is more deceiving (correct spell?). If you look at
> > code it looks like array code.
> > 
> > foreach (i; 0 .. a.length)
> > {
> >     a[i] += 1;
> > }
> > 
> > For array works nice. But for list it is terrible! Many operations for
> > incrementing only small list.
> 
> With that second trick the loop would have the same complexity for lists.

Not for singly linked lists. I think the name "trick" is very fitting. It is a trick like a prank on a friend: it does not do the real thing, it only fools you in a few cases.
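
For example (a sketch, again with illustrative names): if the list dresses the same walk up as opIndex, the loop above looks exactly like array code but restarts the walk from the head on every access.

struct Node(T) { T value; Node!T* next; }

struct List(T)
{
    Node!T* head;
    size_t length;

    // The same O(i) walk as before, now spelled like array indexing.
    ref T opIndex(size_t i)
    {
        auto p = head;
        while (i--) p = p.next;
        return p.value;
    }
}

void main()
{
    // Build the list 0 -> 0 -> 0 and run the "array-looking" loop.
    auto a = List!int(
        new Node!int(0, new Node!int(0, new Node!int(0, null))), 3);

    foreach (i; 0 .. a.length)
    {
        a[i] += 1;    // each access walks from the head again
    }
    assert(a[2] == 1);
}

The loop makes about n^2/2 node steps in total, so O(n^2) for the list, while the identical source line is O(n) for an array. The syntax hides the difference; the cost does not go away.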

> But putting that aside for the moment, are you saying you would allow
> yourself to be deceived by a syntax detail? No, mentally attaching O(1) to
> the *subscripting operator* is simply a legacy from C, where it is
> syntactic sugar for pointer arithmetic.

I do not think so. I am sorry. If a[n] is not allowed, then some other array access primitive is. Take index(a, n) as an example. If the language says index(a, n) means array access, then it is a big mistake for a list to also define index(a, n). A list should perhaps define findAt(a, n) instead. Then an array can also define findAt(a, n), and that is not a mistake.
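
A sketch of the naming idea. SomeList, findAt and the template constraint are made up for illustration; the point is that only the operation the language treats as array access should carry the O(1) meaning:

struct SomeList(T)
{
    T[] storage;   // stand-in storage; a real list would hold linked nodes

    // The list's primitive: "walk to the n-th element", O(n) in spirit.
    T findAt(size_t n) { return storage[n]; }
}

// An array may also offer findAt; findAt never promised O(1), so this
// is not deceiving anyone.
T findAt(T)(T[] a, size_t n) { return a[n]; }

// An algorithm that genuinely needs O(1) indexing asks for opIndex,
// which only array-like types define under this convention.
void needsConstantTimeIndexing(A)(A a)
    if (is(typeof(a[0])))
{
    // binary search, in-place sort, ...
}

void main()
{
    int[] arr = [1, 2, 3];
    auto lst = SomeList!int([1, 2, 3]);

    assert(findAt(arr, 1) == 2);
    assert(lst.findAt(1) == 2);

    needsConstantTimeIndexing(arr);    // fine: arrays index in O(1)
    // needsConstantTimeIndexing(lst); // rejected: SomeList has no opIndex
}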

> >> By the way, I suspect that if opIndex is available only on arrays and
> >> nth() is available on all sequence types, algorithm writers will forget
> >> about opIndex and use nth(), to make their algorithm more widely
> >> compatible. And I wouldn't blame them, though I guess you would.
> > 
> > I do not agree with this. I am sorry! I think nobody should write find()
> > that uses nth().
> 
> Of course not. Find should be written with an iterator, which has optimal
> complexity for both data-structures. My point is that an algorithm should
> be generic first and foremost. Then you use the operations that have the
> lowest complexity over all targeted data-structures if possible.

Maybe I understand the word "generic" differently than you. For me, generic means the algorithm asks the minimum from the structure to do its work. For example, find asks for only one forward pass, and an input iterator provides exactly one forward pass. It is a mistake if find asks for indexing. It is also a mistake if a structure makes an algorithm believe it has indexing as a primitive operation.
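
For illustration, a find written against the minimum interface it needs, one forward pass over the range primitives empty/front/popFront (a sketch, not the library version):

import std.range;

// find asks only for one forward pass, so it works for arrays,
// lists and anything else that can be walked once.
R find(R, T)(R r, T needle)
{
    for (; !r.empty; r.popFront())
    {
        if (r.front == needle)
            break;
    }
    return r;   // empty if not found, otherwise starts at the match
}

void main()
{
    int[] a = [3, 1, 4, 1, 5];
    auto rest = find(a, 4);
    assert(rest.length == 3 && rest[0] == 4);
}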

> >> >> * A list has faster insertions and growth.
> >> > 
> >> > O(1) insertion if you have the insertion point. Both list and vector
> >> > have O(1) growth.
> >> 
> >> Yeah, but dynamic arrays have to re-allocate once in a while. Lists
> >> don't.
> > 
> > Lists allocate memory each insert. Array allocate memory some time. With
> > doubling cost of allocation+copy converge to zero.
> 
> Lists allocate memory for bare nodes, but never have to copy their elements.
> Arrays have to move their whole content to a larger memory location each
> time they are outgrown. For more complex data-types that means potentially
> very expensive copies.

I think this is a mistake. I think you should google "amortized complexity"; that may help a lot.
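
A small sketch of the arithmetic behind that, counting only the element copies caused by growth-by-doubling (the numbers, not the names, are the point here):

import std.stdio;

void main()
{
    enum n = 1_000_000;
    size_t capacity = 1;
    size_t copies = 0;

    foreach (size_t len; 1 .. n + 1)
    {
        if (len > capacity)
        {
            copies += capacity;   // move old contents into the new block
            capacity *= 2;
        }
    }

    // Total copies stay below 2n, so the amortized copy cost per append
    // is a small constant: O(1), even though a single append can be O(n).
    writefln("appends: %s, copies: %s, copies per append: %.3f",
             n, copies, cast(double) copies / n);
}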

> >> But if time-complexity is not taken into consideration, the
> >> sequential-access interface and the random-access interface are
> >> equivalent, no?
> > 
> > I think it is mistake to not taken into consideration time complexity. For
> > basic data structures specially. I do not think you can call them
> > equivalent.
> 
> I did say 'if'. You have to agree that if you disregard complexity issues
> (for the sake of argument), the two ARE equivalent.

But it is a useless comparison. A comparison cannot ignore the important aspect. If we ignore the fractional part, floating point is integer. If an organism is not alive, it is mostly water.

> >> I do think it's a good idea for algorithms to support the interface with
> >> the weakest constraints (sequential-access). As long as they specify
> >> their time-complexity in terms of the complexities of the interface
> >> operations, not in absolute terms.
> >> 
> >> Then, when a programmer writes 'somealgorithm(someinterfaceobject)', a
> hypothetical analysis tool could tell him the time-complexity of the
> >> resulting operation. The programmer might even assert a worst-case
> >> complexity at that point and the compiler could bail out if it doesn't
> >> match.
> > 
> > The specification I think is with types. If that works tool is the
> > compiler.
> 
> But don't you understand that if this tool did exist, and the language had a
> standard notation for time/space-complexity, I could simply write:
> 
> sequence<T> s;
> /* fill sequence */
> sort(s);
> 
> And the compiler (in cooperation with this 'tool') could automatically find
> the most effective combination of data-structure and algorithm. The code
> would be more readable and efficient.

Michiel-san, STL already does that. Or do I misunderstand you?
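
What I mean, in D terms (and the STL works the same way): the complexity requirement is part of the type, so the compiler already refuses a mismatch. A sketch; Forward is a made-up minimal range, and the rejection described for sort follows from how std.algorithm constrains it:

import std.algorithm : sort;

// A deliberately minimal one-pass range (illustrative only).
struct Forward
{
    int[] data;
    bool empty() { return data.length == 0; }
    int front() { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

void main()
{
    int[] a = [3, 1, 2];
    sort(a);                 // fine: dynamic arrays are random-access ranges
    assert(a == [1, 2, 3]);

    // sort(Forward([3, 1, 2]));
    // Rejected at compile time: sort's template constraint demands a
    // random-access range, which Forward does not (and cannot) promise.
}

So the data structure's type carries its complexity guarantees, and the algorithm's constraints select only the structures that can honor them.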



