topN using a heap

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Sun Sep 25 04:02:35 PDT 2016


On 9/25/16 4:19 AM, Jon Degenhardt wrote:
> On Saturday, 24 September 2016 at 18:22:51 UTC, Andrei Alexandrescu wrote:
>> Got curious so I tried to patch my ldc installation (0.17.1 (DMD
>> v2.068.2, LLVM 3.8.0)), but sadly that didn't work out because
>> sorting.d uses the new syntax in various places.
>>
>> Probably the best baseline is the equivalent C++ code using the mature
>> implementation of nth_element, see
>> http://paste.ofcode.org/ieZPdchjzTXbDcG3LvsYBP (also pasted at the end
>> of this message). Here's what I got:
>> http://paste.ofcode.org/6feyBLRiJtieHdZ3bmGaUW, see also text below.
>>
>> The topN code produced with dmd is faster in virtually all cases (has
>> parity for -f 4, which I suspect is a degenerate case with most
>> elements equal, which exercises only a small part of the algorithm).
>> For C++ I used -O3 -DNDEBUG, any other flags I should use?
>>
>> [snip]
>>
> I added the topN pull request to a local LDC build (current LDC master).
> Also built the C++ nth_element and sort program you listed. (GCC 4.9.3,
> compiler line: g++-mp-4.9 -O3 -static-libgcc -static-libstdc++ -std=c++11).
>
> Did 7 runs for each variant on the three fields in ngram file. Median
> times are below.
>
> Median sort times (ms):
>
>      Field 2   Field 3   Field 4
> DMD      351       348       326
> LDC      273       261       245
> C++      281       260       246
>
> Median topN / nth_element times (ms):
>
>      Field 2   Field 3   Field 4
> LDC       21        44        57
> C++       41        60        71
>
> Looking very good indeed!

Thanks for this work! -- Andrei


More information about the Digitalmars-d mailing list