New malloc() for win32 that should produce faster DMD's and faster D code that uses malloc()

Tue Aug 6 11:39:43 PDT 2013

On Tuesday, 6 August 2013 at 18:38:43 UTC, Kiith-Sa wrote:
> On Tuesday, 6 August 2013 at 17:48:57 UTC, Walter Bright wrote:
>> On 8/6/2013 5:13 AM, Richard Webb wrote:
>>> It's possible that other library routines are causing some of the
>>> remaining difference from the MSVC build (e.g. the profiler suggests
>>> that the DMC build spends somewhat more time inside memcpy than the
>>> MSVC build).
>>>
>>> Not sure if it's down to implementation or optimization though - might
>>> be down to intrinsics/inlining and such? (The profile for the DMC build
>>> says it's using ~1% of its time inside strlen and the profile for the
>>> MSVC build doesn't mention it at all, which I guess is because it's
>>> using an intrinsic version.)
>>
>>
>> If it's inlined then it won't show up in the profile. And yes, 
>> it's possible MSVC has a faster memcpy(). After all, enormous 
>> effort has been poured into memcpy().
>
> If you use a profiler with line or instruction granularity
> (like perf on Linux), it will show up. On Windows, that would
> probably be VTune and CodeAnalyst.

(Obviously it shows up as part of the function it was inlined into,
but you still get the time consumed at the lines/instructions that
came from the inlined function.)
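
A minimal sketch of the situation, in C. The names my_strlen and
total_length are made up for illustration and are not taken from DMD,
DMC or MSVC. It only shows the shape of the problem: a small helper
that an optimizing compiler is free to expand inside its caller.

```c
#include <stddef.h>

/* Hypothetical stand-in for strlen(). Declared `static inline` so an
   optimizing compiler may expand it at each call site instead of
   emitting a call. */
static inline size_t my_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')      /* these instructions are where the time goes */
        ++p;
    return (size_t)(p - s);
}

/* Hypothetical caller. If my_strlen() is inlined, a function-level
   profile charges the scanning loop to total_length(), and my_strlen()
   may not be listed as a separate entry at all. */
size_t total_length(const char *const *strings, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; ++i)
        total += my_strlen(strings[i]);
    return total;
}
```

Assuming a build with both optimization and debug info, a function-level
report would be expected to list only total_length, while a line- or
instruction-level view (perf annotate on Linux, for example) should still
attribute samples to the loop inside my_strlen. Whether the helper really
gets inlined depends on the compiler and its flags.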