Integer conversions too pedantic in 64-bit

Russel Winder russel at russel.org.uk
Thu Feb 17 02:01:33 PST 2011


<minor-rant>

On Thu, 2011-02-17 at 10:13 +0100, Don wrote:
[ . . . ]
> Me too. A word is two bytes. Any other definition seems to be pretty 
> useless.

Sounds like people have been living with 8- and 16-bit processors for
too long.

A word is the natural length of an integer item in the processor.  It is
necessarily machine specific.  For example, the DEC-10 had a 36-bit word
(with a selectable byte size, 9-bit being one option), and the IBM 370
has an 8-bit byte and a 32-bit word, though addresses were 24-bit.  ix86
follows IBM with an 8-bit byte and a 32-bit word.

The really interesting question is whether on x86_64 the word is 32-bit
or 64-bit.

> The whole concept of "machine word" seems very archaic and incorrect to 
> me anyway. It assumes that the data registers and address registers are 
> the same size, which is very often not true.

Machine words are far from archaic.  Even on the JVM, if you don't know
the length of the word on the machine you are executing on, how do you
know the set of values that can be represented?  With floating point
numbers, if you don't know the length of the word, how do you know the
accuracy of the computation?
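
To make that concrete, here is a minimal sketch in C of asking the
implementation for its widths rather than guessing them.  The names
BITS_OF, target_widths and query_target_widths are invented for this
illustration; only the standard headers and their macros are real.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stddef.h>

/* Storage width of a type in bits.  On exotic targets this may
   include padding bits, so treat it as an upper bound on range. */
#define BITS_OF(T) ((int)(sizeof(T) * CHAR_BIT))

/* Hypothetical record of what this particular target reports. */
struct target_widths {
    int int_bits;               /* width of int */
    int long_bits;              /* width of long */
    int size_t_bits;            /* width of size_t */
    int pointer_bits;           /* width of void * */
    int double_mantissa_digits; /* digits in base FLT_RADIX */
};

struct target_widths query_target_widths(void)
{
    struct target_widths w;

    w.int_bits = BITS_OF(int);
    w.long_bits = BITS_OF(long);
    w.size_t_bits = BITS_OF(size_t);
    w.pointer_bits = BITS_OF(void *);
    w.double_mantissa_digits = DBL_MANT_DIG;

    /* The C standard only promises these minimums. */
    assert(w.int_bits >= 16 && w.long_bits >= 32);
    return w;
}
```

An LP64 x86_64 Linux build would report 32, 64, 64, 64 and 53 here,
whereas a 64-bit Windows build reports 32 for long.  Same processor,
different answers, which is rather the point.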

Clearly data registers and address registers can be different lengths;
it is not the job of a programming language that compiles to native code
to ignore this and attempt to homogenize things beyond what is
reasonable.

If you are working in native code then word length is a crucial property
since it can change depending on which processor you compile for.

> For example, on an 8-bit machine (eg, 6502 or Z80), the accumulator was 
> only 8 bits, yet size_t was definitely 16 bits.

The 8051 was only surpassed a couple of years ago by ARMs as the most
numerous processor on the planet.  8-bit processors may only have had
8-bit ALUs -- leading to a hypothesis that the word was 8 bits -- but
the word length was effectively 16-bit due to the hardware support for
multi-byte integer operations.

> It's quite plausible that at some time in the future we'll get a machine 
> with 128-bit registers and data bus, but retaining the 64 bit address 
> bus. So we could get a size_t which is smaller than the machine word.
> 
> In summary: size_t is not the machine word.

Agreed!

As long as the address bus is narrower than an integer, there are no
apparent problems using integers as addresses.  The problem comes when
addresses are wider than integers.  A good statically-typed programming
language should manage this by having integers and addresses as distinct
sets.  C and C++ have led people astray.  There should be an appropriate
set of integer types and an appropriate set of address types, and using
one as the other without explicit conversion is always going to lead to
problems.
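
As a rough sketch of what "explicit conversion" can look like even in
C, assuming the target provides the optional uintptr_t type from
<stdint.h> (mainstream ones do).  The function names are invented for
this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Address to integer: always through an explicit cast, and always
   to uintptr_t, the integer type intended to hold a pointer value. */
uintptr_t address_to_integer(const void *p)
{
    return (uintptr_t)p;
}

/* Integer to address: the reverse, again explicit. */
void *integer_to_address(uintptr_t n)
{
    return (void *)n;
}

/* Returns 1 if the address survives a trip through uintptr_t. */
int round_trips(void *p)
{
    return integer_to_address(address_to_integer(p)) == p;
}

/* Returns 1 if this address would also fit in a plain unsigned int.
   Where addresses are wider than int this can be 0, which is
   exactly the case where treating them as ints goes wrong. */
int fits_in_unsigned_int(const void *p)
{
    return address_to_integer(p) <= UINT_MAX;
}
```

The casts are the point: the conversion is visible in the source, so
the compiler and the reader can both see where an address stops being
an address.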

Do not be afraid of the word.  Fear leads to anger.  Anger leads to
hate.  Hate leads to suffering. (*)

</minor-rant>

(*) With apologies to Master Yoda (**) for any misquote.

(**) Or more likely whoever his script writer was.
-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder at ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel at russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder