Array append performance
Robert Jacques
sandford at jhu.edu
Sun Aug 24 19:00:30 PDT 2008
On Sat, 23 Aug 2008 13:16:26 -0400, Steven Schveighoffer
<schveiguy at yahoo.com> wrote:
> The problem is that many times I don't append to an array or slice, why
> should I have to accept the cost of an extra 4-8 byte field for every slice
> and array that isn't going to change size (which I would argue is used more
> often)?
I've written some microbenchmarks to measure the difference between an
8-byte struct (i.e. the current arrays) and a 16-byte struct (ptr, length,
capacity, stride) with respect to function call performance. I wrote two
benchmarks: 1) an actual function call (which calculated the exponential)
and 2) an optimized struct copy (i.e. simulating pass by value).
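For readers without the attachment, the two layouts being compared can be sketched roughly as follows. This is C rather than D and is not the benchmark from hello.d; the names Slice8, Slice16, sum8 and sum16 are made up for illustration, and every field is fixed at 32 bits on the assumption of a 32-bit target, where the current array is a length plus a pointer.

```c
#include <stdint.h>

/* Assumed 8-byte layout of a current D array on a 32-bit target. */
typedef struct {
    uint32_t length;
    uint32_t ptr;       /* pointer held as a 32-bit integer to pin the size */
} Slice8;

/* Hypothetical 16-byte layout with the fields listed in the post. */
typedef struct {
    uint32_t ptr;
    uint32_t length;
    uint32_t capacity;
    uint32_t stride;
} Slice16;

/* Both callees take the struct by value, so the caller has to copy
   8 or 16 bytes for each argument unless the call is inlined. */
uint32_t sum8(Slice8 s)
{
    return s.length + s.ptr;
}

uint32_t sum16(Slice16 s)
{
    return s.ptr + s.length + s.capacity + s.stride;
}
```

Calling sum8 and sum16 in a timed loop, with one, two or three struct arguments, would give a comparison of the same general kind as benchmark (1).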
In -release -O or debug mode, (1) had performance decreases of -5.1%, 9.8%
and 1.8% for passing 1, 2 and 3 structs respectively. When -inline is
enabled this changes to 0.5%, 2.5% and 5% respectively. For the (2)
benchmark, I looked at using MMX and SSE2 instructions to accelerate the
struct copy. Using SSE2 instructions provided a 15% performance gain using
unaligned structs and an 84% gain using aligned structs. 16-byte SSE2
copies beat single 8-byte MMX copies by 9% (both aligned), though dual MMX
(i.e. a 16-byte copy) still beat the built-in copy by 61% (again aligned).
(All in -release -O -inline, on a Core2 CPU)
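To make the aligned/unaligned distinction concrete, here is a minimal sketch of a 16-byte struct copy done with SSE2 intrinsics. It is C, not the D code from the attachment, and the names Slice16, copy16_aligned and copy16_unaligned are hypothetical. The intrinsics come from <emmintrin.h>; where SSE2 is not available the sketch simply falls back to memcpy.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical 16-byte array struct (ptr, length, capacity, stride). */
typedef struct {
    uint32_t ptr;
    uint32_t length;
    uint32_t capacity;
    uint32_t stride;
} Slice16;

/* Single 16-byte load and store. Both pointers MUST be 16-byte
   aligned, otherwise the aligned instructions fault. */
void copy16_aligned(Slice16 *dst, const Slice16 *src)
{
#if defined(__SSE2__)
    _mm_store_si128((__m128i *)dst, _mm_load_si128((const __m128i *)src));
#else
    memcpy(dst, src, sizeof *dst);
#endif
}

/* Same copy with the unaligned forms, which accept any address. */
void copy16_unaligned(Slice16 *dst, const Slice16 *src)
{
#if defined(__SSE2__)
    _mm_storeu_si128((__m128i *)dst, _mm_loadu_si128((const __m128i *)src));
#else
    memcpy(dst, src, sizeof *dst);
#endif
}
```

The aligned version is only usable if the compiler or allocator guarantees 16-byte alignment for the struct, which is presumably why the aligned and unaligned cases were measured separately.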
So the performance cost, even for very small workloads, looks to be pretty
minor, and with some work performance could actually increase. (I'm
assuming the use of SSE/MMX/AltiVec for memory copy acceleration would be
enabled by a compiler flag on 32-bit)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hello.d
Type: application/octet-stream
Size: 5966 bytes
Desc: not available
URL: <http://lists.puremagic.com/pipermail/digitalmars-d/attachments/20080824/e032da3e/attachment.obj>