Why Strings as Classes?

Don nospam at nospam.com.au
Fri Aug 29 00:29:12 PDT 2008


Steven Schveighoffer wrote:
> "Nick Sabalausky" wrote
>> "Don" <nospam at nospam.com.au> wrote in message 
>> news:g95td3$2tu0$1 at digitalmars.com...
>>> Nick Sabalausky wrote:
>>>> "Don" <nospam at nospam.com.au> wrote in message 
>>>> news:g95ks5$2aon$1 at digitalmars.com...
>>>>> Nick Sabalausky wrote:
>>>>>> "Dee Girl" <deegirl at noreply.com> wrote in message 
>>>>>> news:g94j7a$2875$1 at digitalmars.com...
>>>>>>> I appreciate your view point. Please allow me explain. The view point 
>>>>>>> is in opposition with STL. In STL each algorithm defines what kind of 
>>>>>>> iterator it operates with. And it requires what iterator complexity.
>>>>>>>
>>>>>>> I agree that other design can be made. But STL has that design. In my 
>>>>>>> opinion is much part of what make STL so successful.
>>>>>>>
>>>>>>> I disagree that algorithm that knows complexity of iterator is 
>>>>>>> concrete. I think exactly contrary. Maybe it is good that you read 
>>>>>>> book about STL by Josuttis. STL algorithms are the most generic I 
>>>>>>> ever find in any language. I hope std.algorithm in D will be better. 
>>>>>>> But right now std.algorithm works only with array.
>>>>>>>
>>>>>>>> If an algoritm uses [] and doesn't know the
>>>>>>>> complexity of the []...good! It shouldn't know, and it shouldn't 
>>>>>>>> care. It's
>>>>>>>> the code that sends the collection to the algoritm that knows and 
>>>>>>>> cares.
>>>>>>> I think this is mistake. Algorithm should know. Otherwise "linear 
>>>>>>> find" is not "linear find"! It is "cuadratic find" (spell?). If you 
>>>>>>> want to define something called linear find then you must know 
>>>>>>> iterator complexity.
>>>>>>>
>>>>>> If a generic algorithm describes itself as "linear find" then I know 
>>>>>> damn well that it's referring to the behavior of *just* the function 
>>>>>> itself, and is not a statement that the function *combined* with the 
>>>>>> behavior of the collection and/or a custom comparison is always going 
>>>>>> to be O(n).
>>>>>>
>>>>>> A question about STL: If I create a collection that, internally, is 
>>>>>> like a linked list, but starts each indexing operation from the 
>>>>>> position of the last indexing operation (so that a "find first" would 
>>>>>> run in O(n) instead of O(n*n)), is it possible to send that collection 
>>>>>> to STL's generic "linear find first"? I would argue that it should 
>>>>>> somehow be possible *even* if the STL's generic "linear find first" 
>>>>>> guarantees a *total* performance of O(n) (Since, in this case, it 
>>>>>> would still be O(n) anyway). Because otherwise, the STL wouldn't be 
>>>>>> very extendable, which would be a bad thing for a library of "generic" 
>>>>>> algorithms.
>>>>> Yes, it will work.
>>>>>
>>>>>> Another STL question: It is possible to use STL to do a "linear find" 
>>>>>> using a custom comparison? If so, it is possible to make STL's "linear 
>>>>>> find" function use a comparison that just happens to be O(n)? If so, 
>>>>>> doesn't that violate the linear-time guarantee, too? If not, how does 
>>>>>> it know that the custom comparison is O(n) instead of O(1) or O(log 
>>>>>> n)?
>>>>> This will work too.
>>>>>
>>>>> IF you follow the conventions THEN the STL gives you the guarantees.
>>>> I'm not sure that's really a "guarantee" per se, but that's splitting 
>>>> hairs.
>>>>
>>>> In any case, it sounds like we're all arguing more or less the same 
>>>> point:
>>>>
>>>> Setting aside the issue of "should opIndex be used and when?", suppose I 
>>>> have the following collection interface and find function (not 
>>>> guaranteed to compile):
>>>>
>>>> interface ICollection(T)
>>>> {
>>>>     T getElement(index);
>>>>     int getSize();
>>>> }
>>>>
>>>> int find(T)(ICollection(T) c, T elem)
>>>> {
>>>>     for(int i=0; i<c.size(); i++)
>>>>     {
>>>>  if(c.getElement(i) == elem)
>>>>             return i;
>>>>     }
>>>> }
>>>>
>>>> It sounds like STL's approach is to do something roughly like that and 
>>>> say:
>>>>
>>>> "find()'s parameter 'c' should be an ICollection for which getElement() 
>>>> is O(1), in which case find() is guaranteed to be O(n)"
>>>>
>>>> What I've been advocating is, again, doing something like the code above 
>>>> and saying:
>>>>
>>>> "find()'s complexity is dependant on the complexity of the ICollection's 
>>>> getElement(). If getElement()'s complexity is O(m), then find()'s 
>>>> complexity is guaranteed to be O(m * n). Of course, this means that the 
>>>> only way to get ideal complexity from find() is to use an ICollection 
>>>> for which getElement() is O(1)".
>>>>
>>>> But, you see, those two statements are effectively equivilent.
>>> They are. But...
>>> if you don't adhere to the conventions, your code gets really hard to 
>>> reason about.
>>>
>>> "This class has an opIndex which is in O(n). Is that OK?" Well, that 
>>> depends on what it's being used for. So you have to look at all of the 
>>> places where it is used.
>>>
>>> It's much simpler to use the convention that opIndex _must_ be fast; this 
>>> way the performance requirements for containers and algorithms are 
>>> completely decoupled from each other. It's about good design.
>>>
>> Taking a slight detour, let me ask you this... Which of the following 
>> strategies do you consider to be better:
>>
>> //-- A --
>> value = 0;
>> for(int i=1; i<=10; i++)
>> {
>>    value += i*2;
>> }
>>
>> //-- B --
>> value = sum(map(1..10, {n * 2}));
>>
>> Both strategies compute the sum of the first 10 multiples of 2.
>>
>> Strategy A makes the low-level implementation details very clear, but IMO, 
>> it comes at the expense of high-level clarity. This is because the code 
>> intermixes the high-level "what I want to accomplish?" with the low-level 
>> details.
>>
>> Strategy B much more closely resembles the high-level desired result, and 
>> thus makes the high-level intent more clear. But this comes at the cost of 
>> hiding the low-level details behind a layer of abstraction.
>>
>> I may very well be wrong on this, but from what you've said it sounds like 
>> you (as well as the other people who prefer [] to never be O(n)) are the 
>> type of coder who would prefer "Strategy A". In that case, I can 
>> completely understand your viewpoint on opIndex, even though I don't agree 
>> with it (I'm a "Strategy B" kind of person).
>>
>> Of course, if I'm wrong on that assumption, then we're back to square one 
>> ;)
> 
> For me at least, you are wrong :)  In fact, I view it the other way, you 
> shouldn't have to care about the underlying implementation, as long as the 
> runtime is well defined.  If you tell me strategy B may or may not take up 
> to O(n^2) to compute, then you bet your ass I'm not going to even touch 
> option B, 'cause I can always get O(n) time with option A :)  Your solution 
> FORCES me to care about the details, it's not so much that I want to care 
> about them.

I agree.  It's about _which_ details do you want to abstract away. I 
don't care about the internals. But I _do_ care about the complexity of 
them.



More information about the Digitalmars-d mailing list