topN using a heap

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Tue Jan 19 05:20:14 PST 2016


On 01/18/2016 09:42 PM, Xinok wrote:
> To be clear, sort is NOT attackable. Introsort is used for unstable
> sorting which begins with quick sort but falls back to heap sort after
> too many recursions to maintain O(n log n) running time. Timsort is used
> for stable sorting which is a variant of merge sort but still runs in
> O(n log n) time. In either case, you're guaranteed to have O(n log n)
> running time in the worst case.

Sorry, my bad. Could you please point to the code doing the 
introspection? I might do the same in topN.

> On the other hand, someone can improve upon quick sort by choosing
> better pivots but be careful not to add too much overhead. A couple
> years ago, I tried choosing the pivot from a median of five but the
> result was as much as 10% slower. One idea is to begin initially
> choosing better pivots, but once the sublists fall below a certain
> length, simply choose the pivot from a median of three to avoid the
> extra overhead.

That's what https://github.com/D-Programming-Language/phobos/pull/3922 
and in my measurements it's about as fast or faster than the current. 
Could you please re-run your measurements against that PR?

> Speaking of RNGs, they're technically pure as long as you always use the
> same seed. So what if we generated a random seed at the start of the
> process, but then reused that same seed over and over in pure functions
> for the duration of the process?

That's a great idea. The way D is defined, it already implies that 
"immutable" is process-immutable, not immutable across runs. Consider 
http://dpaste.dzfl.pl/2fa5baf0c149.

Very nice insights, Xinok. Thanks!


Andrei




More information about the Digitalmars-d mailing list