std.container & ranges

Steven Schveighoffer schveiguy at yahoo.com
Wed Nov 2 06:12:17 PDT 2011


On Wed, 02 Nov 2011 08:40:19 -0400, Ary Manzana <ary at esperanto.org.ar>  
wrote:

> On 11/2/11 8:48 AM, Steven Schveighoffer wrote:
>>
>> The basic response to this is, when dealing with containers generically
>> (that is, you know you have a container, but you don't know what type),
>> the "remove this element" operation is not necessarily a good primitive
>> to have.
>>
>> Simply because from the myriad of containers, only some can implement
>> this operation efficiently. Java embeds this operation in the interface,
>> which means any interface you have to a container could potentially use
>> O(n) time to remove that element. Such an innocuous piece of syntax
>> *should* have a cost if it's not efficient IMO.
>>
>> BTW, the original question doesn't provide enough information to say
>> "remove this element." Even in Java, if you aren't using the default
>> comparison, you must use a comparator method to determine which one to
>> remove. If cell.x == x && cell.y == y *is* the comparison operator for
>> the type, then the syntax gets much simpler, because you don't need to
>> pass a specialized comparison function.
>>
>> In dcollections, removing a specific element (using the default
>> comparison operator for that element) on a *fast lookup* container is as
>> simple as:
>>
>> container.remove(container.find(x));
>>
>> Which removes the element x if it's found. However, this is not defined
>> for containers which use O(n) time to search (such as linked list), you
>> must use std.algorithm.find for that:
>>
>> container.remove(find(container[], x).begin);
>>
>> Should work, and takes O(n) time.
>>
>> -Steve
>
> I don't really understand what's wrong with inefficient methods. You can  
> have inefficient methods that are convenient, like removing an element  
> by the default comparison, or giving it a delegate to match the  
> element(s) to remove.
>
> You profile your application. Is that method the bottle-neck? If so, you  
> change it to a more efficient one. If not, you are happy you had that  
> method there, performing in an inefficient way, but which doesn't matter  
> that much compared to, say, opening an SQL connection.
>
> Programmers want to program, fast. They have schedules, they need to  
> deliver. They don't need to always find the best solution. They can find  
> a compromise between "working" and "fast", move on, and later profile  
> and worry about what matters most.
>
> Programmers don't want to fight with the language or think "Oh, so to  
> remove an element I need to use this operation and combine it with that  
> one and with that other one"...

Or use the right container for the job?

Where it really comes into play is generic programming.

Let's say I write some algorithm that removes certain elements from a  
container:

removeElements(C, T)(C c, T t[]...)
{
    foreach(x; t)
      c.remove(t);
}

What's the complexity of this algorithm?  For a HashSet, for instance, it  
will be O(n) where n is the number of elements to remove.

But for an ArrayList, it will be O(n*m) where m is the number of elements  
in c.

So what we get is, an algorithm whose complexity depends on the complexity  
of an operation that varies *widely*.  What you end up with is things like  
sorting algorithms which are O(n^2) or even worse.

What omitting those methods from the containers that don't support it does  
is allow you to make *predictably efficient* generic algorithms.  For  
example, you know sorting is going to be at most O(nlgn).  I don't care  
how beautiful the syntax is, a quadratic sort is epic fail, and should be  
avoided at all costs.

std.container tries to allow having these inefficient methods by changing  
the name (so algorithms can still claim efficiency by not using those  
names).  My stance is this makes the API overly complex for very little  
gain.  If something is inefficient, either don't use it, or use the range  
interface via std.algorithm.  Why should I write a linear search algorithm  
when std.algorithm already has done this?

-Steve


More information about the Digitalmars-d-learn mailing list