Modulo Bug?

Thiez thiezz at gmail.com
Sun Aug 12 06:57:27 PDT 2012


On Sunday, 12 August 2012 at 08:43:53 UTC, Alex Rønne Petersen 
wrote:
> On 11-08-2012 20:54, Thiez wrote:
>> On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec 
>> wrote:
>>> On 11.08.2012 18:13, bearophile wrote:
>>>> David:
>>>>
>>>>> Thanks! I thought modulo should alawys yield the same ... 
>>>>> seems like I
>>>>> was wrong ;)
>>>>
>>>> It's C design that's wrong.
>>>
>>> And it's the processor design that makes it inefficient to 
>>> correct it
>>> nowadays.
>>>
>>> Python's definition of modulo is far more useful than C's. 
>>> Implemented
>>> in machine code, however, it takes several additional 
>>> commands because
>>> the integer division is hardwired to the C definition. I 
>>> guess that
>>> hardware integer division in processors became popular only 
>>> when C was
>>> already widely in use.
>>
>> A few extra instructions (a CMOV followed by ADD should 
>> suffice, yes?)
>> seems like a small price to pay if it can prevent bugs. Why 
>> hasn't the
>> Python-modulo been made the default back when D was designed? 
>> The
>> ever-so-slightly more efficient C-modulo could be provided in 
>> a library.
>> Of course it's way too late to change it now...
>
> Keep in mind that not all CPUs have cmov, so you end up having 
> to branch based on whether the CPU has it or not.

I think the following trick would work even on the 80386 (386+ 
should have the SETcc instructions):
assume divident in AEX, divisor in EBX or ECX (not EDX, 
obviously). I'll assume it's in EBX.
  CDQ // sign extend EAX to EDX:EAX
  IDIV EBX // signed division
  XOR EAX,EAX // clear EAX
  BT EDX,31 // check if sign-bit of remainder is set
  SETNC AL // AL = (EDX negative ? 0 : 1)
  DEC AEX // EAX = (EDX negative ? 0xFFFF : 0)
  AND EAX,EBX // EAX = (EDX negative ? EBX : 0)
  ADD EDX,EAX // EAX = the correct modulo result

Obviously it's less appealing than a CMOV, but it should work (in 
theory, I didn't test it...) and perhaps someone smarter could 
make it even shorter. Apart form the IDIV, all instructions 
should take only a single clock cycle on most x86s as far as I'm 
aware.

As for other CPU architectures, perhaps those actually have a 
'sane' modulo operator, which would make the hypothetical 
presence or absence of a CMOV irrelevant :)


More information about the Digitalmars-d mailing list