narrowed down the problem area

downs default_357-line at yahoo.de
Fri Feb 15 08:34:19 PST 2008


downs wrote:
> Here's the disassembly for ray_sphere for both cases:
> 
> slow (opSub)
> 
> http://paste.dprogramming.com/dpcds3p3
> 
> fast
> 
> http://paste.dprogramming.com/dpd6pi8n
> 
> So it comes down to a GDC FP "bug". I think changing to 4.2 or 4.3 might help. Does anybody have an up-to-date version of the 4.2.x patch?
> 
>  --downs

Especially interesting to note (slow case):

    fstpl    -24(%ebp)
[...]
    movl    -24(%ebp), %eax
    movl    %eax, -48(%ebp)
    movl    -20(%ebp), %eax
    movl    %eax, -44(%ebp)

Translation:
	Store floating-point number to ebp[-24]. No, wait, move it to ebp[-48].

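This indicates a pretty serious problem with optimization: the value is stored to -24(%ebp) and then copied, as two 32-bit words through %eax, to -48(%ebp), so the whole store-then-copy sequence is basically redundant.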
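
To make the redundancy concrete, here is a rough C model of the two instruction sequences. It is only a sketch: the names slot_a, slot_b, copy_via_two_words and store_directly are hypothetical stand-ins for the stack slots and are not taken from the ray_sphere source, and it assumes an 8-byte double, which is what fstpl writes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Slow sequence: spill the double to one slot, then copy it to a
   second slot in two 32-bit halves through a scratch register. */
double copy_via_two_words(double value) {
    unsigned char slot_a[8];  /* hypothetical stand-in for -24(%ebp) */
    unsigned char slot_b[8];  /* hypothetical stand-in for -48(%ebp) */
    uint32_t eax;             /* stand-in for the %eax register */
    double result;

    assert(sizeof(double) == 8);   /* assumption: 64-bit double */

    memcpy(slot_a, &value, 8);     /* fstpl -24(%ebp)       */
    memcpy(&eax, slot_a, 4);       /* movl  -24(%ebp), %eax */
    memcpy(slot_b, &eax, 4);       /* movl  %eax, -48(%ebp) */
    memcpy(&eax, slot_a + 4, 4);   /* movl  -20(%ebp), %eax */
    memcpy(slot_b + 4, &eax, 4);   /* movl  %eax, -44(%ebp) */

    memcpy(&result, slot_b, 8);    /* read back the destination slot */
    return result;
}

/* What a single store to the final slot would look like instead. */
double store_directly(double value) {
    unsigned char slot_b[8];  /* hypothetical stand-in for -48(%ebp) */
    double result;

    memcpy(slot_b, &value, 8);     /* one fstpl -48(%ebp) would do */
    memcpy(&result, slot_b, 8);
    return result;
}
```

Both functions leave the same eight bytes in the destination slot; the first one just needs one extra store and two extra loads to get there, which is the overhead visible in the slow disassembly.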

The "fast" version doesn't have any memory writes at all during the computation.

 --downs


