[Issue 7413] Vector literals don't work

d-bugmail at puremagic.com
Thu May 3 03:24:34 PDT 2012


http://d.puremagic.com/issues/show_bug.cgi?id=7413



--- Comment #16 from Manu <turkeyman at gmail.com> 2012-05-03 03:25:43 PDT ---
(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > (In reply to comment #12)
> > > > (In reply to comment #11)
> > > > > Haven't done the special case optimizations for constant loading.
> > > > 
> > > > No problem, I'm using GDC anyway which might detect those in the back end.
> > > > 
> > > > An efficient implementation would certainly use at least an xor for 0
> > > > initialisation, and the other tricks will get different mileage depending on
> > > > the length of the pipeline surrounding. Not accessing memory is always better
> > > > if there are pipeline cycles to soak up the latency.
> > > 
> > > The -1 trick is always worth doing, I think. Agner Fog has a nice list in his
> > > optimisation manuals, but the only ones _always_ worth doing are the 0 and -1
> > > integer cases, and the 0.0 floating point case (also using xor).
> > 
> > If the compiler knows anything about the pipeline around the code, it should be
> > able to make the best choice about the others.
> 
> My guess is that it's pretty rare that the alternative sequences are favoured
> just on the basis of the pipeline, since MOVDQA only uses a load port, and
> nothing else. Especially on Sandy Bridge or AMD, where there are two load
> ports.
> So I doubt there's much benefit to be had.
> 
> By contrast, if there's _any_ chance of a cache miss, they'd be a huge win, but
> unfortunately that's far beyond the compiler's capabilities.

And that's precisely my reasoning.

If the compiler knows the state of the pipeline around the load and there are
no conflicts, i.e. it can slip the instructions in for free during other
pipeline stalls, then generating an immediate is always better than touching
memory. Schedulers usually do have this information while performing code
generation, so it may be possible.
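
To make the trade-off concrete, here is a minimal sketch in C with SSE2 intrinsics of the constant "tricks" discussed in this thread: zero via xor, -1 via a self-compare, and one derived constant built from -1 with a shift. The helper names (vec_zero, vec_all_ones, vec_ones_epi32, store4, check_constants) are made up for illustration and are not part of DMD or GDC. Which instructions actually get emitted is the compiler's choice; it may well fold any of these back into a memory load.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics (x86 / x86-64 only) */
#include <stdint.h>

/* All-zero vector. Compilers typically emit PXOR xmm, xmm for this,
   so no memory access is needed. */
static __m128i vec_zero(void) {
    return _mm_setzero_si128();
}

/* All bits set (-1 in every integer lane). Comparing a register with
   itself for equality sets every bit; typically PCMPEQD xmm, xmm. */
static __m128i vec_all_ones(void) {
    __m128i z = _mm_setzero_si128();
    return _mm_cmpeq_epi32(z, z);
}

/* 1 in every 32-bit lane, derived from all-ones with a logical right
   shift by 31 (PSRLD). This is one of the "other tricks" whose benefit
   depends on the surrounding code, since it costs two instructions. */
static __m128i vec_ones_epi32(void) {
    return _mm_srli_epi32(vec_all_ones(), 31);
}

/* Hypothetical helper: copy the four 32-bit lanes out for inspection. */
static void store4(__m128i v, int32_t out[4]) {
    _mm_storeu_si128((__m128i *)out, v);
}

/* Sanity check that each generated constant has the expected lanes. */
static void check_constants(void) {
    int32_t out[4];

    store4(vec_zero(), out);
    for (int i = 0; i < 4; i++) assert(out[i] == 0);

    store4(vec_all_ones(), out);
    for (int i = 0; i < 4; i++) assert(out[i] == -1);

    store4(vec_ones_epi32(), out);
    for (int i = 0; i < 4; i++) assert(out[i] == 1);
}
```

The alternative to each helper is a 16-byte constant in memory loaded with MOVDQA, which is a single instruction but can miss the cache; the sequences above never touch memory, at the cost of one or two ALU operations.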

These sorts of considerations are obviously much more critical on non-x86
architectures though, as with basically all optimisations ;)
