How ptr arithmitic works??? It doesn't make any sense....

H. S. Teoh hsteoh at qfbox.info
Sun Dec 4 19:00:15 UTC 2022


On Sun, Dec 04, 2022 at 04:33:35PM +0000, rempas via Digitalmars-d-learn wrote:
> First a little bit of theory. A pointer just points to a memory
> address which is a number. So when I add "10" to this pointer, it will
> point ten bytes after the place it was pointing to, right?

This is true only if you're talking about pointers in the sense of
pointers in assembly language.  Languages like C and D add another layer
of abstraction over this.


> Another thing with pointers is that it doesn't have "types".

This is where you went wrong.  In assembly language, yes, a pointer
value is just a number, and there's no type associated with it.
However, experience has shown that manipulating pointers at this raw,
untyped level is extremely error-prone.  Therefore, in languages like C
or D, a pointer *does* have a type.  It's a way of preventing the
programmer from making silly mistakes, by associating a type (at
compile-time only, of course) with the pointer value.  It's a way of
keeping track that address 1234 points to a short, and not to a float,
for example.  At the assembly level, of course, this type information is
erased, and the pointers are just integer addresses.  However, at
compile-time, this type exists to stop the programmer from treating the
value at the pointed-to address as the wrong type, or at least to warn
about it.  This is not only a matter of data sizes, but also of the
interpretation of data.  A 32-bit value interpreted as an int is
completely different from a 32-bit value interpreted as a float, for
example.  You wouldn't want to perform integer arithmetic on something
that's supposed to be a float; the result would be garbage.
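
To make the int-versus-float point concrete, here is a minimal sketch.
The helper name bitsOf is made up for illustration, and it assumes the
usual IEEE 754 representation of float.  It reads the same 32 bits
through a differently typed pointer:

```d
import std.stdio;

// Hypothetical helper: read the 32 bits of a float as if they were a uint.
// The cast deliberately bypasses the pointer's type; illustration only.
uint bitsOf(float f)
{
    return *cast(uint*) &f;
}

void main()
{
    float f = 1.0f;
    writefln("as float: %s", f);
    writefln("as uint:  0x%08X", bitsOf(f)); // 0x3F800000 for IEEE 754 1.0f
}
```

The bits are identical in both cases; only the type attached to the
pointer decides how they are interpreted.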

In addition, although in theory memory is byte-addressable, many
architectures impose alignment restrictions on values larger than a
byte. For example, the CPU may require that 32-bit values (ints or
floats) must be aligned to an address that's a multiple of 4 bytes.  If
you add 1 to an int* address and try to access the result, it may cause
performance issues (the CPU may have to load 2 32-bit values and
reassemble parts of them to form the misaligned 32-bit value) or a fault
(the CPU may refuse to load a non-aligned address), which could be a
silent failure or may cause your program to be forcefully terminated.
Therefore, typed pointers like short* and int* may not be entirely an
artifact that only exists in the compiler; it may not actually be legal
to make an int* hold a non-aligned address, depending on the hardware
you're running on.
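
A rough way to check alignment in D is to compare an address against
T.alignof.  The helper isAlignedFor below is a made-up name, not
something from Phobos, and the sketch only tests the address; it never
dereferences a misaligned pointer:

```d
import std.stdio;

// Hypothetical helper: true if addr is a multiple of T's alignment.
bool isAlignedFor(T)(size_t addr)
{
    return addr % T.alignof == 0;
}

void main()
{
    int[2] x;
    auto addr = cast(size_t) &x[0];

    writeln(isAlignedFor!int(addr));     // true: the compiler aligns int storage
    writeln(isAlignedFor!int(addr + 1)); // false whenever int.alignof > 1
}
```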

Because of this, C and D implement pointer arithmetic in terms of the
underlying value type.  That is, adding 1 to a char* will add 1 to the
underlying address, but adding 1 to an int* will add int.sizeof to the
underlying address instead of 1.  For example:

	int[2] x;
	int* p = &x[0];	// let's say this is address 1234
	p++;		// p is now 1238, *not* 1235 (int.sizeof == 4)
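
If you want to see this for yourself, the snippet above can be turned
into something runnable along these lines.  The helper strideOf is a
made-up name for illustration; it measures how many bytes p + 1 moves
for a given pointee type:

```d
import std.stdio;

// Hypothetical helper: number of bytes that (p + 1) advances a T*.
size_t strideOf(T)()
{
    T[2] buf;
    T* p = &buf[0];
    T* q = p + 1; // one element further, not one byte further
    return cast(size_t) q - cast(size_t) p;
}

void main()
{
    writeln(strideOf!char());   // 1
    writeln(strideOf!short());  // 2
    writeln(strideOf!int());    // 4
    writeln(strideOf!double()); // 8
}
```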

As a consequence, when you cast a raw pointer value to a typed pointer,
you are responsible for respecting any underlying alignment requirements
that the machine may have.  Casting a non-aligned address like 1235 to a
pointer type with alignment requirements, like int*, may cause problems
if you're not careful.  Also, the value type of the pointer *does*
matter; you will get different results depending on the size of the type
and any alignment requirements it may have.  Pointer arithmetic
involving T* operates in units of T.sizeof, *not* in terms of the raw
pointer value.
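
If you really do want to move by a raw number of bytes, one common
approach is to go through ubyte*, whose element size is 1, and cast
back.  The helper addBytes below is a made-up name, and this is only a
sketch: it is up to the caller to make sure the resulting address is
suitably aligned for T before dereferencing it.

```d
import std.stdio;

// Hypothetical helper: advance a typed pointer by a raw byte count.
// The caller must keep the result aligned for T before dereferencing.
T* addBytes(T)(T* p, size_t nbytes)
{
    return cast(T*) (cast(ubyte*) p + nbytes);
}

void main()
{
    int[4] x = [10, 20, 30, 40];
    int* p = &x[0];

    int* a = p + 2;                       // moves 2 * int.sizeof bytes
    int* b = addBytes(p, 2 * int.sizeof); // same address, computed in bytes

    writeln(a is b); // true
    writeln(*b);     // 30
}
```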


T

-- 
Change is inevitable, except from a vending machine.

