Why Strings as Classes?

Steven Schveighoffer schveiguy at yahoo.com
Thu Aug 28 13:15:17 PDT 2008


"Nick Sabalausky" wrote
> "Don" <nospam at nospam.com.au> wrote in message 
> news:g95td3$2tu0$1 at digitalmars.com...
>> Nick Sabalausky wrote:
>>> "Don" <nospam at nospam.com.au> wrote in message 
>>> news:g95ks5$2aon$1 at digitalmars.com...
>>>> Nick Sabalausky wrote:
>>>>> "Dee Girl" <deegirl at noreply.com> wrote in message 
>>>>> news:g94j7a$2875$1 at digitalmars.com...
>>>>>> I appreciate your view point. Please allow me explain. The view point 
>>>>>> is in opposition with STL. In STL each algorithm defines what kind of 
>>>>>> iterator it operates with. And it requires what iterator complexity.
>>>>>>
>>>>>> I agree that other design can be made. But STL has that design. In my 
>>>>>> opinion is much part of what make STL so successful.
>>>>>>
>>>>>> I disagree that algorithm that knows complexity of iterator is 
>>>>>> concrete. I think exactly contrary. Maybe it is good that you read 
>>>>>> book about STL by Josuttis. STL algorithms are the most generic I 
>>>>>> ever find in any language. I hope std.algorithm in D will be better. 
>>>>>> But right now std.algorithm works only with array.
>>>>>>
>>>>>>> If an algoritm uses [] and doesn't know the
>>>>>>> complexity of the []...good! It shouldn't know, and it shouldn't 
>>>>>>> care. It's
>>>>>>> the code that sends the collection to the algoritm that knows and 
>>>>>>> cares.
>>>>>> I think this is mistake. Algorithm should know. Otherwise "linear 
>>>>>> find" is not "linear find"! It is "cuadratic find" (spell?). If you 
>>>>>> want to define something called linear find then you must know 
>>>>>> iterator complexity.
>>>>>>
>>>>> If a generic algorithm describes itself as "linear find" then I know 
>>>>> damn well that it's referring to the behavior of *just* the function 
>>>>> itself, and is not a statement that the function *combined* with the 
>>>>> behavior of the collection and/or a custom comparison is always going 
>>>>> to be O(n).
>>>>>
>>>>> A question about STL: If I create a collection that, internally, is 
>>>>> like a linked list, but starts each indexing operation from the 
>>>>> position of the last indexing operation (so that a "find first" would 
>>>>> run in O(n) instead of O(n*n)), is it possible to send that collection 
>>>>> to STL's generic "linear find first"? I would argue that it should 
>>>>> somehow be possible *even* if the STL's generic "linear find first" 
>>>>> guarantees a *total* performance of O(n) (Since, in this case, it 
>>>>> would still be O(n) anyway). Because otherwise, the STL wouldn't be 
>>>>> very extendable, which would be a bad thing for a library of "generic" 
>>>>> algorithms.
>>>> Yes, it will work.
>>>>
>>>>> Another STL question: It is possible to use STL to do a "linear find" 
>>>>> using a custom comparison? If so, it is possible to make STL's "linear 
>>>>> find" function use a comparison that just happens to be O(n)? If so, 
>>>>> doesn't that violate the linear-time guarantee, too? If not, how does 
>>>>> it know that the custom comparison is O(n) instead of O(1) or O(log 
>>>>> n)?
>>>> This will work too.
>>>>
>>>> IF you follow the conventions THEN the STL gives you the guarantees.
>>>
>>> I'm not sure that's really a "guarantee" per se, but that's splitting 
>>> hairs.
>>>
>>> In any case, it sounds like we're all arguing more or less the same 
>>> point:
>>>
>>> Setting aside the issue of "should opIndex be used and when?", suppose I 
>>> have the following collection interface and find function (not 
>>> guaranteed to compile):
>>>
>>> interface ICollection(T)
>>> {
>>>     T getElement(index);
>>>     int getSize();
>>> }
>>>
>>> int find(T)(ICollection(T) c, T elem)
>>> {
>>>     for(int i=0; i<c.size(); i++)
>>>     {
>>>  if(c.getElement(i) == elem)
>>>             return i;
>>>     }
>>> }
>>>
>>> It sounds like STL's approach is to do something roughly like that and 
>>> say:
>>>
>>> "find()'s parameter 'c' should be an ICollection for which getElement() 
>>> is O(1), in which case find() is guaranteed to be O(n)"
>>>
>>> What I've been advocating is, again, doing something like the code above 
>>> and saying:
>>>
>>> "find()'s complexity is dependant on the complexity of the ICollection's 
>>> getElement(). If getElement()'s complexity is O(m), then find()'s 
>>> complexity is guaranteed to be O(m * n). Of course, this means that the 
>>> only way to get ideal complexity from find() is to use an ICollection 
>>> for which getElement() is O(1)".
>>>
>>> But, you see, those two statements are effectively equivilent.
>>
>> They are. But...
>> if you don't adhere to the conventions, your code gets really hard to 
>> reason about.
>>
>> "This class has an opIndex which is in O(n). Is that OK?" Well, that 
>> depends on what it's being used for. So you have to look at all of the 
>> places where it is used.
>>
>> It's much simpler to use the convention that opIndex _must_ be fast; this 
>> way the performance requirements for containers and algorithms are 
>> completely decoupled from each other. It's about good design.
>>
>
> Taking a slight detour, let me ask you this... Which of the following 
> strategies do you consider to be better:
>
> //-- A --
> value = 0;
> for(int i=1; i<=10; i++)
> {
>    value += i*2;
> }
>
> //-- B --
> value = sum(map(1..10, {n * 2}));
>
> Both strategies compute the sum of the first 10 multiples of 2.
>
> Strategy A makes the low-level implementation details very clear, but IMO, 
> it comes at the expense of high-level clarity. This is because the code 
> intermixes the high-level "what I want to accomplish?" with the low-level 
> details.
>
> Strategy B much more closely resembles the high-level desired result, and 
> thus makes the high-level intent more clear. But this comes at the cost of 
> hiding the low-level details behind a layer of abstraction.
>
> I may very well be wrong on this, but from what you've said it sounds like 
> you (as well as the other people who prefer [] to never be O(n)) are the 
> type of coder who would prefer "Strategy A". In that case, I can 
> completely understand your viewpoint on opIndex, even though I don't agree 
> with it (I'm a "Strategy B" kind of person).
>
> Of course, if I'm wrong on that assumption, then we're back to square one 
> ;)

For me at least, you are wrong :)  In fact, I view it the other way, you 
shouldn't have to care about the underlying implementation, as long as the 
runtime is well defined.  If you tell me strategy B may or may not take up 
to O(n^2) to compute, then you bet your ass I'm not going to even touch 
option B, 'cause I can always get O(n) time with option A :)  Your solution 
FORCES me to care about the details, it's not so much that I want to care 
about them.

-Steve 





More information about the Digitalmars-d mailing list