[phobos] byte alignment for arrays
Jason Spencer
spencer8 at sbcglobal.net
Mon Jun 28 14:03:04 PDT 2010
Hmmm. The natural thing would be to have some type to describe these 128-bit values (akin to __m128 in gcc, Intel and MS compilers) and use sizeof on that. I don't see that D has any MMX/SSE intrinsics, so I don't know if there is a standard type. If you don't have such a thing defined by the compiler, I'd be tempted to define it, based on which version of the compiler will compile this code (i.e. 32- or 64-bit dmd). Then you can use that in your sizeof. Maybe you'll get lucky, and that will become standard :)
Jason
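
A minimal sketch of that idea, written in C rather than D because the comparison above is to `__m128` in C compilers. The type name `vec128` and both helper functions are hypothetical, not something dmd or Phobos defines. The only point is that a size taken from a 128-bit type stays at 16 on both 32- and 64-bit builds, while a multiple of `size_t` does not.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Hypothetical stand-in for a compiler-provided 128-bit vector type. */
typedef struct vec128 {
    _Alignas(16) unsigned char bytes[16];
} vec128;

static_assert(sizeof(vec128) == 16, "vec128 must be exactly 128 bits");

/* Padding derived from the vector type: 16 on 32- and 64-bit targets. */
static size_t pad_from_vector_type(void) { return sizeof(vec128); }

/* Padding derived from size_t: 16 where size_t is 4 bytes, 32 where it is 8. */
static size_t pad_from_size_t(void) { return sizeof(size_t) * 4; }
```

With a type like this, the array padding could be written as `sizeof(vec128)` instead of a literal 16 or a multiple of `size_t`.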
----- Original Message ----
> From: Steve Schveighoffer <schveiguy at yahoo.com>
> To: Discuss the phobos library for D <phobos at puremagic.com>
> Sent: Mon, June 28, 2010 1:35:59 PM
> Subject: Re: [phobos] byte alignment for arrays
>
> Thanks, this information helps a lot!
>
> I will make the change to 16-byte aligned. I'm already using 8 bytes for a
> 4-byte length. Using 16 bytes isn't much different, especially when the block
> size is 4096+ bytes.
>
> One final question -- I currently use sizeof(size_t) * 2, which could now be
> sizeof(size_t) * 4, but of course, this changes to 32 bytes on 64-bit dmd.
> Would it make sense to just use 16 instead of some multiple of size_t?
>
> -Steve
>
> ----- Original Message ----
> > From: Jason Spencer <spencer8 at sbcglobal.net>
> > To: Discuss the phobos library for D <phobos at puremagic.com>
> > Sent: Mon, June 28, 2010 4:09:01 PM
> > Subject: Re: [phobos] byte alignment for arrays
> >
> > Sorry, I forgot to address the every-other-one concern.
> >
> > The MMX registers are 64 bits wide, so you can only do 1 double at a time.
> > Those instructions only require 8-byte-aligned memory. The SSE instructions
> > use 128-bit registers, so they take 2 doubles at a time. As long as the
> > first one is 16-byte aligned, you can iterate through in 16-byte (128-bit)
> > chunks, and you'll be good. That's why element 0 should be 128-bit aligned.
> >
> > If it's not, the processor will either raise an alignment fault (if the
> > instruction requires alignment) or do a bunch of split loads across cache
> > lines, which kill performance.
> >
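
The pairing described above can be sketched in C with SSE2 intrinsics. This is an illustration under stated assumptions, not code from the D runtime: `is_aligned16` and `sum_pairs` are hypothetical helpers, the array is assumed to start on a 16-byte boundary and to hold an even number of doubles, and a scalar fallback is used where SSE2 is not available. Each aligned load takes elements `i` and `i + 1` together, so only the even-indexed addresses ever need to be 16-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2: __m128d, _mm_load_pd, _mm_add_pd, ... */
#endif

/* Returns nonzero if p sits on a 16-byte boundary. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: sums n doubles, two per step.
   Assumes a is 16-byte aligned and n is even. */
static double sum_pairs(const double *a, size_t n) {
    assert(is_aligned16(a) && n % 2 == 0);
#if defined(__SSE2__)
    __m128d acc = _mm_setzero_pd();
    for (size_t i = 0; i < n; i += 2) {
        /* &a[i] is 16-byte aligned for every even i, so the
           alignment-requiring load is legal here. */
        acc = _mm_add_pd(acc, _mm_load_pd(&a[i]));
    }
    double lanes[2];
    _mm_storeu_pd(lanes, acc);   /* unaligned store: lanes has no alignment guarantee */
    return lanes[0] + lanes[1];
#else
    /* Scalar fallback for targets without SSE2. */
    double s = 0.0;
    for (size_t i = 0; i < n; i++) s += a[i];
    return s;
#endif
}
```

The odd-indexed doubles do sit at addresses that are only 8-byte aligned, but they are never the start of a load, which is why aligning element 0 is sufficient.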
> > One other thought: If you wanted to be tricky, you could do a general,
> > 4-byte allocation and, based on the address you get, assign your storage
> > pointer to the next 128-bit-aligned address. But you're offloading lots of
> > housekeeping to run time. Again, maybe tolerable for just these large
> > arrays. But it starts to add a lot of corner cases. Walter might have some
> > good suggestions here.
> >
> > Jason
> >
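
The over-allocate-and-round-up trick mentioned above might look roughly like this in C. It is a sketch on top of plain `malloc`, not how the D garbage collector allocates; `align_up16` and `alloc_aligned16` are hypothetical names. It also shows the housekeeping cost: the caller has to keep the raw pointer around, because that is the one that must be freed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round addr up to the next multiple of 16 (unchanged if already aligned). */
static uintptr_t align_up16(uintptr_t addr) {
    return (addr + 15u) & ~(uintptr_t)15u;
}

/* Hypothetical helper: returns nbytes of 16-byte-aligned storage.
   *raw_out receives the pointer that must later be passed to free().
   Overflow of nbytes + 15 is not checked in this sketch. */
static void *alloc_aligned16(size_t nbytes, void **raw_out) {
    assert(raw_out != NULL);
    void *raw = malloc(nbytes + 15);   /* worst case wastes 15 bytes of slack */
    *raw_out = raw;
    if (raw == NULL) return NULL;
    return (void *)align_up16((uintptr_t)raw);
}
```

Usage would be along the lines of `void *raw; double *d = alloc_aligned16(n * sizeof(double), &raw);` followed eventually by `free(raw)`, never `free(d)`.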
> > ----- Original Message ----
> > > From: Steve Schveighoffer
> > >
> > > A question then -- let's say you have an array of doubles, which are
> > > 8 bytes wide, and you want to use these SSE instructions. Even if the
> > > first one is aligned on a 16-byte boundary, wouldn't every other double
> > > be misaligned?
> > > _______________________________________________
> > > phobos mailing list
> > > phobos at puremagic.com
> > > http://lists.puremagic.com/mailman/listinfo/phobos