Need help with Removing Duplicates from Array

Steven Schveighoffer schveiguy at gmail.com
Tue Jun 27 12:55:57 UTC 2023


On 6/27/23 8:45 AM, Sergey wrote:
> On Tuesday, 27 June 2023 at 12:31:27 UTC, Steven Schveighoffer wrote:
>> On 6/27/23 8:13 AM, sonal13 wrote:
>>> Hey everyone,
>>>
>> This also is not the learn forum, please use that if you have further 
>> questions about D.
> 
> To make this post more “general”, we could complain a bit, that D 
> doesn’t have “set” data structure :) we could emulate it with AA - but 
> it is meh ..

std.container.rbtree.RedBlackTree is a set.

> 
> And correct me if I’m wrong asymptotically usage of set should be more 
> effective than sorting and then removing duplicates :P

Set insertion is likely more expensive than sorting -- it uses 
allocations for every insertion. Sorting is done in-place. In fact, I 
could have avoided the extra allocation of a new array as well, but it's 
easier to just write `.array`.

Algorithmically, if you use a hashset, the complexity is supposed to be 
O(1) for insertion, but that is best-case. And quicksort is pretty fast 
too, and likely has better cache coherency.

I'd say it's something you should measure for the use cases you are 
thinking. Either one may be faster in certain cases.

-Steve


More information about the Digitalmars-d mailing list