[GSoC] 'Independency of D from the C Standard Library' progress and update thread

Piotrek dummy at dummy.gov
Sat Jul 6 15:33:44 UTC 2019


On Saturday, 6 July 2019 at 11:07:41 UTC, Stefanos Baziotis wrote:
>> As for the benchmarks.
>> I think you can post your results somewhere. Or you did. 
>> Unfortunately I cannot find them.
>
> You're right, my mistake, there are no recent benchmarks. I'll try to
> post them today. They're similar to yours.


>> I tested Dmemset with dmd (ldc and gdc didn't compile) on an
>> i3-3220 at 3.30GHz (Ubuntu).
>
> That's weird. Could you give some more info on how you compiled?

I was using the old repo for Dmemset. With Dmemutils it works now. I
removed the static foreach from benchmark.d so that gdc can compile it.
Text results:
https://github.com/PiotrekDlang/Dmemutils/tree/master/Dmemset/output

>
>> The strange thing is I get different results when I change the 
>> following line in benchmarks.d
>> So the D version becomes better. Maybe this is related to 
>> a different binary file after compilation.
>
> That is indeed strange but not too unexpected. A compiler (more likely
> the DMD back-end) might decide to do strange things for reasons I don't
> know. I'll try to re-create similar behavior in mine.

It seems it wasn't related to this change after all. It looks like a
heisen-optimization.

>
> Just some more info for anyone interested:
> Regarding sizes 1-16: with GDC / LDC, in my benchmarks (and, judging by
> the ASM, I assume in all the benchmarks), it reaches parity with libc
> (note that for sizes 1-16 the naive version is used, meaning a simple
> for loop). Now, for such small sizes, the standard way to go is a
> fall-through switch (I can give more info on that if someone is
> interested). The problem with that is that it's difficult for the
> compiler to optimize it along with the rest of the code. Or at least, I
> didn't find a way to do it. And so I use the naive version, which is
> only slightly slower but doesn't affect bigger sizes.
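
For anyone who hasn't seen the two approaches mentioned in the quote
above, here is a minimal sketch in C (not the actual Dmemset code, which
is written in D). The names small_memset and naive_memset are made up for
illustration. Note that D does not allow implicit fall-through, so a D
version would need an explicit `goto case;` after each case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill n <= 16 bytes with a fall-through switch.
   Execution jumps to the case matching n and then runs every case
   below it, so exactly n single-byte stores are executed, no loop. */
static void small_memset(void *dst, uint8_t value, size_t n)
{
    uint8_t *p = (uint8_t *)dst;
    switch (n) {
    case 16: p[15] = value; /* fall through */
    case 15: p[14] = value; /* fall through */
    case 14: p[13] = value; /* fall through */
    case 13: p[12] = value; /* fall through */
    case 12: p[11] = value; /* fall through */
    case 11: p[10] = value; /* fall through */
    case 10: p[9]  = value; /* fall through */
    case 9:  p[8]  = value; /* fall through */
    case 8:  p[7]  = value; /* fall through */
    case 7:  p[6]  = value; /* fall through */
    case 6:  p[5]  = value; /* fall through */
    case 5:  p[4]  = value; /* fall through */
    case 4:  p[3]  = value; /* fall through */
    case 3:  p[2]  = value; /* fall through */
    case 2:  p[1]  = value; /* fall through */
    case 1:  p[0]  = value; /* fall through */
    case 0:  break;
    default: break; /* n > 16 is out of scope for this sketch */
    }
}

/* The "naive version": a simple byte-by-byte for loop. */
static void naive_memset(void *dst, uint8_t value, size_t n)
{
    uint8_t *p = (uint8_t *)dst;
    for (size_t i = 0; i < n; ++i)
        p[i] = value;
}
```

Both functions produce the same bytes for n <= 16; they differ only in
the code the compiler can generate for them.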

Funnily enough, DMD (with Dmemset) holds the speed record, over
50 GB/s, on some of the big block sizes.
However, aren't smaller sizes more important?
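
To make clear what a GB/s figure means here, this is a rough sketch in C
of how such a throughput number can be computed. It is not the code from
benchmark.d. The function names are made up, I assume 1 GB = 10^9 bytes,
and clock() measures CPU time, which is only a reasonable stand-in for
wall time in a single-threaded run.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Convert a byte count and a duration into GB/s (1 GB = 1e9 bytes). */
static double gb_per_second(double bytes, double seconds)
{
    return bytes / seconds / 1e9;
}

/* Hypothetical benchmark: fill a block of block_size bytes
   `repetitions` times with libc memset and report GB/s.
   Returns -1.0 if allocation fails or the run was too short to time. */
static double memset_throughput(size_t block_size, size_t repetitions)
{
    unsigned char *buf = malloc(block_size);
    if (buf == NULL)
        return -1.0;

    /* Calling through a volatile function pointer keeps the compiler
       from merging or dropping the repeated fills. */
    void *(*volatile fill)(void *, int, size_t) = memset;

    fill(buf, 0, block_size); /* warm-up so the pages are mapped */

    clock_t start = clock();
    for (size_t i = 0; i < repetitions; ++i)
        fill(buf, (int)(i & 0xFF), block_size);
    clock_t stop = clock();

    free(buf);

    double seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return -1.0;
    return gb_per_second((double)block_size * (double)repetitions, seconds);
}
```

For example, memset_throughput(1u << 20, 2000) fills a 1 MiB block 2000
times. Small blocks that fit in cache will report much higher numbers
than blocks that have to go to main memory, so a single record figure
says little without the block size next to it.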


> My guess is that, especially with GDC / LDC (and DMD, but I'm not yet
> sure for DMD across different hardware), Dmemset can actually replace
> libc memset().

One issue is that it should be tested on all variations of HW and OS.
At least it could be placed in an experimental module.
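
As a starting point for that kind of testing, here is a small sketch in
C of a checker that compares a candidate fill routine against libc
memset over a range of sizes and alignment offsets, with guard bytes on
both sides to catch overruns. All names (fill_fn, check_against_libc,
candidate_memset, off_by_one_memset) are hypothetical and not part of
Dmemutils.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void (*fill_fn)(void *dst, uint8_t value, size_t n);

/* Stand-in for the routine under test: a plain byte loop. */
static void candidate_memset(void *dst, uint8_t value, size_t n)
{
    uint8_t *p = (uint8_t *)dst;
    for (size_t i = 0; i < n; ++i)
        p[i] = value;
}

/* Deliberately broken routine (writes one byte too many), included
   only to show that the checker notices overruns. */
static void off_by_one_memset(void *dst, uint8_t value, size_t n)
{
    uint8_t *p = (uint8_t *)dst;
    for (size_t i = 0; i <= n; ++i)
        p[i] = value;
}

/* Returns the number of (size, offset) combinations for which the
   candidate's result differs from libc memset, or (size_t)-1 if the
   buffers cannot be allocated. */
static size_t check_against_libc(fill_fn candidate,
                                 size_t max_size, size_t max_offset)
{
    enum { GUARD = 16 };
    size_t total = max_size + max_offset + 2 * GUARD;
    uint8_t *got = malloc(total);
    uint8_t *want = malloc(total);
    size_t failures = 0;

    if (got == NULL || want == NULL) {
        free(got);
        free(want);
        return (size_t)-1;
    }

    for (size_t size = 0; size <= max_size; ++size) {
        for (size_t offset = 0; offset <= max_offset; ++offset) {
            /* Reset both buffers, including the guard areas. */
            memset(got, 0xAA, total);
            memset(want, 0xAA, total);

            candidate(got + GUARD + offset, 0x5C, size);
            memset(want + GUARD + offset, 0x5C, size);

            /* Comparing whole buffers also catches stray writes. */
            if (memcmp(got, want, total) != 0)
                ++failures;
        }
    }

    free(got);
    free(want);
    return failures;
}
```

Calling check_against_libc(candidate_memset, 64, 15) covers sizes 0-64 at
16 different starting offsets. This only checks correctness on the
machine it runs on; performance across hardware would still need
separate benchmark runs.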


Cheers,
Piotrek


