How ptr arithmetic works??? It doesn't make any sense....

Mon Dec 5 14:17:53 UTC 2022

On Sunday, 4 December 2022 at 19:00:15 UTC, H. S. Teoh wrote:
> This is true only if you're talking about pointers in the sense 
> of pointers in assembly language.  Languages like C and D add 
> another layer of abstraction over this.
>
>
>> Another thing with pointers is that it doesn't have "types".
>
> This is where you went wrong.  In assembly language, yes, a 
> pointer value is just a number, and there's no type associated 
> with it. However, experience has shown that manipulating 
> pointers at this raw, untyped level is extremely error-prone.  
> Therefore, in languages like C or D, a pointer *does* have a 
> type.  It's a way of preventing the programmer from making 
> silly mistakes, by associating a type (at compile-time only, of 
> course) to the pointer value.  It's a way of keeping track that 
> address 1234 points to a short, and not to a float, for 
> example.  At the assembly level, of course, this type 
> information is erased, and the pointers are just integer 
> addresses.  However, at compile-time, this type exists to 
> prevent, or at least warn, the programmer from treating the 
> value at the pointed-to address as the wrong type.  This is not 
> only because of data sizes, but the interpretation of data.  A 
> 32-bit value interpreted as an int is completely different from 
> a 32-bit value interpreted as a float, for example.  You 
> wouldn't want to perform integer arithmetic on something that's 
> supposed to be a float; the result would be garbage.
>
> In addition, although in theory memory is byte-addressable, 
> many architectures impose alignment restrictions on values 
> larger than a byte. For example, the CPU may require that 
> 32-bit values (ints or floats) must be aligned to an address 
> that's a multiple of 4 bytes.  If you add 1 to an int* address 
> and try to access the result, it may cause performance issues 
> (the CPU may have to load 2 32-bit values and reassemble parts 
> of them to form the misaligned 32-bit value) or a fault (the 
> CPU may refuse to load a non-aligned address), which could be a 
> silent failure or may cause your program to be forcefully 
> terminated. Therefore, typed pointers like short* and int* may 
> not be entirely an artifact that only exists in the compiler; 
> it may not actually be legal for an int* to hold a non-aligned 
> address, depending on the hardware you're running on.
>
> Because of this, C and D implement pointer arithmetic in terms 
> of the underlying value type. I.e., adding 1 to a char* will 
> add 1 to the underlying address, but adding 1 to an int* will 
> add int.sizeof to the underlying address instead of 1. I.e.:
>
> 	int[2] x;
> 	int* p = &x[0];	// let's say this is address 1234
> 	p++;		// p is now 1238, *not* 1235 (int.sizeof == 4)
>
> As a consequence, when you cast a raw pointer value to a typed 
> pointer, you are responsible for respecting any underlying 
> alignment requirements that the machine may have. Casting a 
> non-aligned address like 1235 to a possibly-aligned pointer 
> like int* may cause problems if you're not careful.  Also, the 
> value type of the pointer *does* matter; you will get different 
> results depending on the size of the type and any alignment 
> requirements it may have.  Pointer arithmetic involving T* 
> operates in units of T.sizeof, *not* in terms of the raw pointer 
> value.
>
>
> T

Wow! Seriously, thanks a lot for this detailed explanation! I 
want to write a compiler, and explanations like this, which not 
only give me the answer but also explain in detail why something 
happens, are a gift to me! I wish I could meet you in person and 
buy you a coffee. Maybe one day, you never know! Thanks a lot and 
have an amazing day!
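
For anyone following along, here is a minimal sketch in D of two points from the quoted explanation: arithmetic on a T* moves in steps of T.sizeof, and the same 32 bits mean different things depending on the type they are read as. The helper `byteDistance` is a hypothetical name made up for this illustration; none of this code comes from the original post.

```d
import std.stdio;

// Hypothetical helper (illustration only): distance in raw bytes
// between two addresses, ignoring whatever type they point to.
ptrdiff_t byteDistance(const(void)* start, const(void)* end)
{
    return cast(const(ubyte)*) end - cast(const(ubyte)*) start;
}

void main()
{
    int[2] x = [10, 20];
    int* p = &x[0];
    int* q = p + 1;                  // moves by int.sizeof bytes, not by 1

    writeln(byteDistance(p, q));     // 4: int is always 32 bits in D
    writeln(q - p);                  // 1: the difference is counted in ints
    writeln(*q);                     // 20: q points at x[1]

    ubyte* b = cast(ubyte*) p;       // same address, viewed as bytes
    writeln(byteDistance(b, b + 1)); // 1: ubyte.sizeof == 1

    // Same 32 bits, different interpretation: 1.0f read back as a uint.
    float f = 1.0f;
    uint bits = *cast(uint*) &f;
    writefln("0x%08X", bits);        // 0x3F800000, the IEEE 754 encoding of 1.0
}
```

The raw address only ever moves in bytes; the pointer's type decides how many bytes one step of `+ 1` covers and how the bytes at that address are interpreted.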