Why Strings as Classes?
Don
nospam at nospam.com.au
Fri Aug 29 00:29:12 PDT 2008
Steven Schveighoffer wrote:
> "Nick Sabalausky" wrote
>> "Don" <nospam at nospam.com.au> wrote in message
>> news:g95td3$2tu0$1 at digitalmars.com...
>>> Nick Sabalausky wrote:
>>>> "Don" <nospam at nospam.com.au> wrote in message
>>>> news:g95ks5$2aon$1 at digitalmars.com...
>>>>> Nick Sabalausky wrote:
>>>>>> "Dee Girl" <deegirl at noreply.com> wrote in message
>>>>>> news:g94j7a$2875$1 at digitalmars.com...
>>>>>>> I appreciate your viewpoint. Please allow me to explain. That
>>>>>>> viewpoint is in opposition to STL's. In STL, each algorithm defines
>>>>>>> what kind of iterator it operates with, and it specifies what
>>>>>>> iterator complexity it requires.
>>>>>>>
>>>>>>> I agree that a different design could be made. But STL has that
>>>>>>> design, and in my opinion it is a large part of what makes STL so
>>>>>>> successful.
>>>>>>>
>>>>>>> I disagree that an algorithm which knows the complexity of its
>>>>>>> iterator is "concrete". I think exactly the contrary. It may be worth
>>>>>>> reading the book about STL by Josuttis. STL algorithms are the most
>>>>>>> generic I have ever found in any language. I hope std.algorithm in D
>>>>>>> will be even better, but right now std.algorithm works only with
>>>>>>> arrays.
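>>>>>>>
>>>>>>> For example, an algorithm can state its requirement right in its
>>>>>>> signature, much as STL states an iterator category. This is only a
>>>>>>> rough, untested sketch using today's std.range constraints (which
>>>>>>> post-date this thread), and containsSorted is a made-up name:
>>>>>>>
>>>>>>> import std.range : isRandomAccessRange, hasLength;
>>>>>>>
>>>>>>> // Binary search: the constraint demands O(1) random access, which
>>>>>>> // is what makes the O(log n) bound meaningful.
>>>>>>> bool containsSorted(R, T)(R r, T value)
>>>>>>>     if (isRandomAccessRange!R && hasLength!R)
>>>>>>> {
>>>>>>>     size_t lo = 0, hi = r.length;
>>>>>>>     while (lo < hi)
>>>>>>>     {
>>>>>>>         immutable mid = lo + (hi - lo) / 2;
>>>>>>>         if (r[mid] == value) return true;
>>>>>>>         if (r[mid] < value)  lo = mid + 1;
>>>>>>>         else                 hi = mid;
>>>>>>>     }
>>>>>>>     return false;
>>>>>>> }
>>>>>>>
>>>>>>> unittest
>>>>>>> {
>>>>>>>     assert(containsSorted([1, 3, 5, 7], 5));
>>>>>>>     assert(!containsSorted([1, 3, 5, 7], 4));
>>>>>>> }
>>>>>>>
>>>>>>> A singly linked list cannot satisfy the constraint, so it is rejected
>>>>>>> at compile time instead of silently degrading to something far worse
>>>>>>> than O(log n).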
>>>>>>>
>>>>>>>> If an algorithm uses [] and doesn't know the complexity of the
>>>>>>>> []...good! It shouldn't know, and it shouldn't care. It's the code
>>>>>>>> that sends the collection to the algorithm that knows and cares.
>>>>>>> I think this is a mistake. The algorithm should know. Otherwise
>>>>>>> "linear find" is not linear find at all; it is quadratic find. If you
>>>>>>> want to define something called linear find, then you must know the
>>>>>>> iterator complexity.
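>>>>>>>
>>>>>>> (To spell out the count: if access to element i costs about i steps,
>>>>>>> a full scan costs 0 + 1 + ... + (n-1) = n(n-1)/2 steps, which is
>>>>>>> O(n^2), not O(n).)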
>>>>>>>
>>>>>> If a generic algorithm describes itself as "linear find" then I know
>>>>>> damn well that it's referring to the behavior of *just* the function
>>>>>> itself, and is not a statement that the function *combined* with the
>>>>>> behavior of the collection and/or a custom comparison is always going
>>>>>> to be O(n).
>>>>>>
>>>>>> A question about STL: If I create a collection that, internally, is
>>>>>> like a linked list, but starts each indexing operation from the
>>>>>> position of the last indexing operation (so that a "find first" would
>>>>>> run in O(n) instead of O(n*n)), is it possible to send that collection
>>>>>> to STL's generic "linear find first"? I would argue that it should
>>>>>> somehow be possible *even* if the STL's generic "linear find first"
>>>>>> guarantees a *total* performance of O(n) (since, in this case, it
>>>>>> would still be O(n) anyway), because otherwise the STL wouldn't be
>>>>>> very extensible, which would be a bad thing for a library of "generic"
>>>>>> algorithms.
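>>>>>>
>>>>>> Roughly the kind of collection I mean (an untested D sketch; the
>>>>>> class and its members are made-up names):
>>>>>>
>>>>>> class CursorList(T)
>>>>>> {
>>>>>>     private static class Node { T value; Node next; }
>>>>>>     private Node head, tail;
>>>>>>     private Node lastNode;      // node returned by the last opIndex
>>>>>>     private int  lastIndex = -1;
>>>>>>
>>>>>>     void append(T value)
>>>>>>     {
>>>>>>         auto n = new Node;
>>>>>>         n.value = value;
>>>>>>         if (tail is null) head = tail = n;
>>>>>>         else { tail.next = n; tail = n; }
>>>>>>     }
>>>>>>
>>>>>>     // Sequential access (0, 1, 2, ...) is O(1) per call because it
>>>>>>     // resumes from the cached node; a backwards jump is O(n).
>>>>>>     T opIndex(int i)
>>>>>>     {
>>>>>>         Node cur = head;
>>>>>>         int  pos = 0;
>>>>>>         if (lastIndex >= 0 && i >= lastIndex)
>>>>>>         {
>>>>>>             cur = lastNode;
>>>>>>             pos = lastIndex;
>>>>>>         }
>>>>>>         while (pos < i) { cur = cur.next; pos++; }
>>>>>>         lastNode = cur;
>>>>>>         lastIndex = i;
>>>>>>         return cur.value;
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> unittest
>>>>>> {
>>>>>>     auto list = new CursorList!(int);
>>>>>>     foreach (v; [10, 20, 30]) list.append(v);
>>>>>>     assert(list[2] == 30);   // walks from the head once
>>>>>>     assert(list[2] == 30);   // O(1): resumes at the cached node
>>>>>> }
>>>>>>
>>>>>> Scanning it front to back as list[0], list[1], ... is O(1) per access
>>>>>> after the first, so a generic linear find over it stays O(n) overall.
>>>>>>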
>>>>> Yes, it will work.
>>>>>
>>>>>> Another STL question: Is it possible to use STL to do a "linear find"
>>>>>> using a custom comparison? If so, is it possible to make STL's "linear
>>>>>> find" function use a comparison that just happens to be O(n)? If so,
>>>>>> doesn't that violate the linear-time guarantee, too? If not, how does
>>>>>> it know that the custom comparison is O(n) instead of O(1) or O(log
>>>>>> n)?
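>>>>>>
>>>>>> Concretely, the kind of thing I mean (made-up D sketch; findIf is not
>>>>>> an actual Phobos function):
>>>>>>
>>>>>> // 'pred' is supplied by the caller, so findIf itself cannot know its
>>>>>> // cost.  If pred is O(p), the whole call is O(n * p), even though
>>>>>> // the loop itself is plainly linear.
>>>>>> int findIf(T)(T[] arr, bool delegate(T) pred)
>>>>>> {
>>>>>>     foreach (i, x; arr)
>>>>>>         if (pred(x))
>>>>>>             return cast(int) i;
>>>>>>     return -1;   // not found
>>>>>> }
>>>>>>
>>>>>> Nothing in that signature bounds the cost of pred, so any "linear"
>>>>>> guarantee can only be counting calls to pred, not total work.
>>>>>>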
>>>>> This will work too.
>>>>>
>>>>> IF you follow the conventions THEN the STL gives you the guarantees.
>>>> I'm not sure that's really a "guarantee" per se, but that's splitting
>>>> hairs.
>>>>
>>>> In any case, it sounds like we're all arguing more or less the same
>>>> point:
>>>>
>>>> Setting aside the issue of "should opIndex be used and when?", suppose I
>>>> have the following collection interface and find function (not
>>>> guaranteed to compile):
>>>>
>>>> interface ICollection(T)
>>>> {
>>>>     T getElement(int index);   // element at position 'index'
>>>>     int getSize();             // number of elements
>>>> }
>>>>
>>>> int find(T)(ICollection!(T) c, T elem)
>>>> {
>>>>     for(int i = 0; i < c.getSize(); i++)
>>>>     {
>>>>         if(c.getElement(i) == elem)
>>>>             return i;
>>>>     }
>>>>     return -1;   // not found
>>>> }
>>>>
>>>> It sounds like STL's approach is to do something roughly like that and
>>>> say:
>>>>
>>>> "find()'s parameter 'c' should be an ICollection for which getElement()
>>>> is O(1), in which case find() is guaranteed to be O(n)"
>>>>
>>>> What I've been advocating is, again, doing something like the code above
>>>> and saying:
>>>>
>>>> "find()'s complexity is dependant on the complexity of the ICollection's
>>>> getElement(). If getElement()'s complexity is O(m), then find()'s
>>>> complexity is guaranteed to be O(m * n). Of course, this means that the
>>>> only way to get ideal complexity from find() is to use an ICollection
>>>> for which getElement() is O(1)".
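>>>>
>>>> For instance, here is a hypothetical O(1)-indexing implementation of the
>>>> interface above (untested sketch; ArrayCollection is a made-up name):
>>>>
>>>> class ArrayCollection(T) : ICollection!(T)
>>>> {
>>>>     private T[] data;
>>>>     this(T[] data) { this.data = data; }
>>>>     T getElement(int index) { return data[index]; }      // O(1)
>>>>     int getSize() { return cast(int) data.length; }
>>>> }
>>>>
>>>> unittest
>>>> {
>>>>     auto c = new ArrayCollection!(int)([3, 1, 4, 1, 5]);
>>>>     assert(find!(int)(c, 4) == 2);   // O(n): n iterations, O(1) each
>>>> }
>>>>
>>>> A linked-list-backed implementation whose getElement() walks from the
>>>> head in O(index) would turn the very same find() into O(n^2), which is
>>>> exactly the O(m * n) composition described above.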
>>>>
>>>> But, you see, those two statements are effectively equivalent.
>>> They are. But...
>>> if you don't adhere to the conventions, your code gets really hard to
>>> reason about.
>>>
>>> "This class has an opIndex which is in O(n). Is that OK?" Well, that
>>> depends on what it's being used for. So you have to look at all of the
>>> places where it is used.
>>>
>>> It's much simpler to use the convention that opIndex _must_ be fast; this
>>> way the performance requirements for containers and algorithms are
>>> completely decoupled from each other. It's about good design.
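>>>
>>> One way to honour that convention (hypothetical sketch; SList and nth
>>> are made-up names) is for a linked list simply not to declare opIndex,
>>> so the O(n) walk cannot hide behind array syntax:
>>>
>>> class SList(T)
>>> {
>>>     private static class Node { T value; Node next; }
>>>     private Node head;
>>>
>>>     void prepend(T value)
>>>     {
>>>         auto n = new Node;
>>>         n.value = value;
>>>         n.next = head;
>>>         head = n;
>>>     }
>>>
>>>     // Positional access exists, but under a name that flags its cost.
>>>     // Deliberately no opIndex: 'list[i]' does not even compile.
>>>     T nth(int index)   // O(index)
>>>     {
>>>         auto cur = head;
>>>         foreach (i; 0 .. index) cur = cur.next;
>>>         return cur.value;
>>>     }
>>> }
>>>
>>> Callers have to write list.nth(i), so the cost is visible at the call
>>> site, and any algorithm written against opIndex simply will not accept
>>> this container.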
>>>
>> Taking a slight detour, let me ask you this... Which of the following
>> strategies do you consider to be better:
>>
>> //-- A --
>> value = 0;
>> for(int i = 1; i <= 10; i++)
>> {
>>     value += i*2;
>> }
>>
>> //-- B --
>> value = sum(map(1..10, {n * 2}));
>>
>> Both strategies compute the sum of the first 10 multiples of 2.
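>>
>> (Purely as an illustration: with today's std.algorithm and std.range,
>> which post-date the pseudocode above, strategy B could be written as
>> real D along these lines:)
>>
>> import std.algorithm : map, sum;
>> import std.range : iota;
>>
>> void main()
>> {
>>     // Reads like the statement of intent: sum of n*2 for n in 1..10.
>>     auto value = iota(1, 11).map!(n => n * 2).sum;
>>     assert(value == 110);
>> }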
>>
>> Strategy A makes the low-level implementation details very clear, but IMO,
>> it comes at the expense of high-level clarity. This is because the code
>> intermixes the high-level "what do I want to accomplish?" with the low-level
>> details.
>>
>> Strategy B much more closely resembles the high-level desired result, and
>> thus makes the high-level intent more clear. But this comes at the cost of
>> hiding the low-level details behind a layer of abstraction.
>>
>> I may very well be wrong on this, but from what you've said it sounds like
>> you (as well as the other people who prefer that [] never be O(n)) are the
>> type of coder who would prefer "Strategy A". In that case, I can
>> completely understand your viewpoint on opIndex, even though I don't agree
>> with it (I'm a "Strategy B" kind of person).
>>
>> Of course, if I'm wrong on that assumption, then we're back to square one
>> ;)
>
> For me at least, you are wrong :) In fact, I view it the other way: you
> shouldn't have to care about the underlying implementation, as long as the
> runtime is well defined. If you tell me strategy B may or may not take up
> to O(n^2) to compute, then you bet your ass I'm not going to even touch
> option B, 'cause I can always get O(n) time with option A :) Your solution
> FORCES me to care about the details; it's not so much that I want to care
> about them.
I agree. It's about _which_ details you want to abstract away. I don't
care about the internals, but I _do_ care about their complexity.