narrowed down the problem area
downs
default_357-line at yahoo.de
Fri Feb 15 08:34:19 PST 2008
downs wrote:
> Here's the disassembly for ray_sphere for both cases:
>
> slow (opSub)
>
> http://paste.dprogramming.com/dpcds3p3
>
> fast
>
> http://paste.dprogramming.com/dpd6pi8n
>
> So it comes down to a GDC FP "bug". I think changing to 4.2 or 4.3 might help. Does anybody have an up-to-date version of the 4.2.x patch?
>
> --downs
Especially interesting to note (slow case):
fstpl -24(%ebp)
[...]
movl -24(%ebp), %eax
movl %eax, -48(%ebp)
movl -20(%ebp), %eax
movl %eax, -44(%ebp)
Translation:
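Store the floating-point number to ebp[-24]. No, wait, copy it to ebp[-48] instead, one 32-bit half at a time through eax.
This points to a pretty serious optimization problem, since the copy is basically redundant: the value could have been stored where it was needed in the first place, or kept on the FPU stack.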
The "fast" version doesn't have any memory writes at all during the computation.
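For anyone who doesn't read AT&T syntax, here is a rough C model of what the slow sequence above does. It is only a sketch: the function name spill_and_copy and the byte-array "frame" are made up for illustration, it assumes an 8-byte double, and it skips whatever happens in the elided [...] part.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the slow path. The byte array stands in for the
   stack frame, and (frame + 48) plays the role of %ebp. */
double spill_and_copy(double value)
{
    unsigned char frame[48];
    unsigned char *ebp = frame + 48;
    uint32_t eax;
    double copy;

    memcpy(ebp - 24, &value, sizeof value); /* fstpl -24(%ebp): 64-bit store */

    memcpy(&eax, ebp - 24, 4);              /* movl -24(%ebp), %eax */
    memcpy(ebp - 48, &eax, 4);              /* movl %eax, -48(%ebp) */
    memcpy(&eax, ebp - 20, 4);              /* movl -20(%ebp), %eax */
    memcpy(ebp - 44, &eax, 4);              /* movl %eax, -44(%ebp) */

    /* The slot at -48(%ebp) now holds the same 8 bytes as -24(%ebp). */
    memcpy(&copy, ebp - 48, sizeof copy);
    return copy;
}
```

The function just hands back its argument, which is the point: one store plus four moves end up with the same bits sitting in a second stack slot.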
--downs