DMD 1.034 and 2.018 releases - Let the games begin!

Dave Dave_member at pathlink.com
Mon Aug 11 21:17:42 PDT 2008


"Dave" <Dave_member at pathlink.com> wrote in message 
news:g7qr3h$2l6$1 at digitalmars.com...
>
> "Walter Bright" <newshound1 at digitalmars.com> wrote in message 
> news:g7na5s$qg0$1 at digitalmars.com...
>> bearophile wrote:
>>> Walter Bright:
>>>> If this happens, then it's worth verifying that the asm code is
>>>> actually being run by inserting a printf in it.
>>>
>>> I presume I'll have to recompile Phobos for that.
>>
>> Not really, it's easier to just copy that particular function out of the
>> library and paste it into your test module, that way it's easier to
>> experiment with.
>>
>>>>> And I haven't yet seen SSE2 asm in my compiled programs :-)
>>>> The dmd compiler doesn't generate SSE2 instructions. But the
>>>> routines in internal\array*.d do.
>>>
>>> I know. I was talking about the parts of the code that, for example,
>>> add the arrays; according to the Phobos source code they use SSE2,
>>> but they are absent from the final code produced.
>>
>> I don't know what you mean. The SSE2 instructions are in 
>> internal/arrayint.d, and they do get compiled in.
>
> The SSE2 is being used, but what would be nice would be the same code that 
> Burton used for his benchmarks. Is that available?
>
> Thanks,
>
> - Dave
>

Before:

>
> C:\Zz>top 4000 100000
> Array Size = 4000, Iterations = 100000
> intaops: 0.204 secs, sum = 2000000
> intloop: 0.515 secs, sum = 2000000
> dfpaops: 0.625 secs, sum = 2e+06
> dfploop: 0.563 secs, sum = 2e+06
>

After adding an aligned case for _arraySliceSliceAddSliceAssign_d
(dfpaops drops from 0.625 to 0.438 secs; the other timings are roughly
unchanged):

C:\Zz>top 4000 100000
Array Size = 4000, Iterations = 100000
intaops: 0.212 secs, sum = 2000000
intloop: 0.525 secs, sum = 2000000
dfpaops: 0.438 secs, sum = 2e+06
dfploop: 0.557 secs, sum = 2e+06
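
For anyone reading without the attachment, the idea behind an "aligned
case" can be sketched roughly as below. This is not the code from
aligned.d or from Phobos; it is a minimal illustration in C, assuming SSE2
intrinsics from emmintrin.h are available (with a plain scalar loop
otherwise). The name add_slices_d is hypothetical. When all three pointers
sit on a 16-byte boundary, the aligned load/store forms can be used;
otherwise the unaligned forms are needed.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: a[i] = b[i] + c[i] for n doubles. */
static void add_slices_d(double *a, const double *b, const double *c, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Aligned case: every pointer is a multiple of 16 bytes. */
    if ((((uintptr_t)a | (uintptr_t)b | (uintptr_t)c) & 15) == 0) {
        for (; i + 2 <= n; i += 2) {
            __m128d x = _mm_load_pd(b + i);   /* aligned load */
            __m128d y = _mm_load_pd(c + i);
            _mm_store_pd(a + i, _mm_add_pd(x, y));
        }
    } else {
        /* General case: same arithmetic, unaligned loads and stores. */
        for (; i + 2 <= n; i += 2) {
            __m128d x = _mm_loadu_pd(b + i);
            __m128d y = _mm_loadu_pd(c + i);
            _mm_storeu_pd(a + i, _mm_add_pd(x, y));
        }
    }
#endif
    /* Scalar tail (or the whole array when SSE2 is not available). */
    for (; i < n; i++)
        a[i] = b[i] + c[i];
}
```

Both branches compute the same result; only the load/store instructions
differ, which is where a difference like the dfpaops one above would come
from.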

;---

SiSoftware Sandra

Processor
Model : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz

Processor Cache(s)
Internal Data Cache : 32kB, Synchronous, Write-Thru, 8-way set, 64 byte line size
Internal Instruction Cache : 32kB, Synchronous, Write-Back, 8-way set, 64 byte line size
L2 On-board Cache : 4MB, ECC, Synchronous, ATC, 16-way set, 64 byte line size, 2 threads sharing
L2 Cache Multiplier : 1/1x  (2667MHz)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: aligned.d
Type: application/octet-stream
Size: 3000 bytes
Desc: not available
URL: <http://lists.puremagic.com/pipermail/digitalmars-d-announce/attachments/20080811/436d214a/attachment.obj>