[OT] The Usual Arithmetic Confusions

Fri Feb 4 23:55:14 UTC 2022

On Friday, 4 February 2022 at 23:43:28 UTC, Walter Bright wrote:
> The penalty for byte arithmetic is the shortage of registers.

On x86-64, there are as many byte registers as word registers.  
(More, technically, but the high-half registers AH, BH, CH and DH 
should be avoided at all costs.)

> Implying I have nefarious motives here is not called for.

Yes.  My bad.

>> these sorts of changes are marginal and should not get in the 
>> way of correct semantics.
>
> That's fine unless you're using a systems programming language, 
> where the customers expect performance.

If a customer wants int ops to be generated, they can use ints.  
There is nothing preventing them from doing this, as has been 
pointed out else-thread.

>> 3. Your code example actually does exactly what you 
>> suggest--using short arithmetic for storage.
>
> The load instructions still use the extra operand size override 
> bytes.

I do not follow.  Your post said:

> Generally speaking, int should be used for most calculations, 
> short and byte for storage.

How am I to store shorts without an operand-size override prefix?

>> It just happens that in this case using short calculations 
>> rather than int calculations yields the same result and 
>> smaller code.
>
> It's not "just happens". Every short load will incur an extra 
> byte. I compiled it with gcc -O, too, just so nobody will 
> accuse me of sabotaging the result with dmd.

In this case I was referring to the multiply.  It was possible to 
load the second register, perform a 32-bit multiply, and then 
store the truncated result.  In a different context, this might 
have been worthwhile.

>> 4. (continued from 3) in a larger, more interesting 
>> expression, regardless of language semantics, the compiler 
>> will generally be free to use ints for intermediates.
>
> If it does, then you'll have other truncation problems 
> depending on how the optimization of the expression plays out. 
> Unless you went the x87 route and slowed everything down by 
> truncating every subexpression to short.

Example: ubyte x,y,z,w; w = x + y + z.

(((x + y) mod 2^32 mod 2^8) + z) mod 2^32 mod 2^8 is the same as 
(((x + y) mod 2^32) + z) mod 2^32 mod 2^8.  The mod 2^32 are 
implicit in the use of 32-bit registers; the mod 2^8 are explicit 
truncations.  Addition commutes with reduction mod 2^8, i.e. 
((a mod 2^8) + z) mod 2^8 = (a + z) mod 2^8, and since 2^8 
divides 2^32 the 32-bit wraparound never disturbs the low 8 bits.  
So the former form, with two explicit truncations, can be 
rewritten as the latter form, getting rid of the intermediate 
truncation and giving the exact same result as with promotion.
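
For anyone who would rather check than trust the algebra, here is 
a small sketch that compares the two forms over every possible 
input.  It is written in C rather than D, on the assumption that 
unsigned 8-bit and 32-bit arithmetic behave the same way in both; 
the function names are made up for illustration and do not come 
from any compiler or library.

```c
#include <stdint.h>

/* Form 1: truncate to 8 bits after every addition. */
uint8_t sum_truncate_each(uint8_t x, uint8_t y, uint8_t z)
{
    uint32_t xy = (uint32_t)x + y;   /* 32-bit add: implicit mod 2^32 */
    uint8_t  t  = (uint8_t)xy;       /* explicit mod 2^8 */
    uint32_t s  = (uint32_t)t + z;   /* 32-bit add again */
    return (uint8_t)s;               /* explicit mod 2^8 */
}

/* Form 2: keep the intermediate in 32 bits, truncate once at the end. */
uint8_t sum_truncate_once(uint8_t x, uint8_t y, uint8_t z)
{
    uint32_t s = (uint32_t)x + y + z; /* both adds in 32 bits */
    return (uint8_t)s;                /* single explicit mod 2^8 */
}

/* Returns 1 if the two forms agree on all 2^24 inputs, 0 otherwise. */
int forms_agree(void)
{
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++)
            for (unsigned z = 0; z < 256; z++)
                if (sum_truncate_each((uint8_t)x, (uint8_t)y, (uint8_t)z) !=
                    sum_truncate_once((uint8_t)x, (uint8_t)y, (uint8_t)z))
                    return 0;
    return 1;
}
```

Calling forms_agree() from a main() and printing the result is 
enough to run the check.  It only covers this three-operand 
addition; the same reasoning carries over to subtraction and 
multiplication, but not to division or right shifts, where the 
high bits of an intermediate can affect the low bits of the result.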