byte and short data types use cases

Fri Jun 9 15:58:49 UTC 2023

On Fri, Jun 09, 2023 at 11:24:38AM +0000, Murloc via Digitalmars-d-learn wrote:
[...]
> Which raised another question: since objects of types smaller than
> `int` are promoted to `int` to use integer arithmetic on them anyway,
> is there any point in using anything of integer type less than `int`
> other than to limit the range of values that can be assigned to a
> variable at compile time?

Not just at compile time. At runtime they are also fixed to that width
(mapped to a hardware register or memory slot of that size) and cannot
hold a larger value.
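
A minimal sketch of what that fixed width means at runtime. The helper
name `bump` is made up for illustration; the wrap-around it shows is
ordinary D integer behavior:

```d
// Hypothetical helper: increments a ubyte without ever widening the storage.
ubyte bump(ubyte u)
{
    u += 1;     // compound assignment narrows implicitly; 255 wraps to 0
    return u;
}

void main()
{
    // The width is part of the type, not just a compile-time range hint.
    static assert(byte.sizeof == 1 && short.sizeof == 2 && int.sizeof == 4);

    assert(bump(41) == 42);
    assert(bump(255) == 0);     // the value cannot grow past 8 bits

    byte b = byte.max;          // 127
    b++;                        // wraps around at runtime
    assert(b == byte.min);      // -128
}
```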

[...]
> People say that there is no advantage for using `byte`/`short` type
> for integer objects over an int for a single variable, however, as
> they say, this is not true for arrays, where you can save some memory
> space by using `byte`/`short` instead of `int`.

That's correct.
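
To put rough numbers on it, here is a small sketch. `payloadBytes` is a
hypothetical helper, not a library function, and it counts only the
element storage, not the slice's pointer/length pair:

```d
// Hypothetical helper: bytes occupied by the elements of a slice.
size_t payloadBytes(T)(const T[] arr)
{
    return arr.length * T.sizeof;
}

void main()
{
    auto narrow = new short[](1_000_000);
    auto wide   = new int[](1_000_000);

    assert(payloadBytes(narrow) == 2_000_000);  // about 2 MB
    assert(payloadBytes(wide)   == 4_000_000);  // about 4 MB for the same count

    // The same holds for fixed-size arrays.
    static assert((byte[16]).sizeof == 16);
    static assert((int[16]).sizeof  == 64);
}
```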

> But isn't any further manipulations with these array objects will
> produce results of type `int` anyway? Don't you have to cast these
> objects over and over again after manipulating them to write them back
> into that array or for some other manipulations with these smaller
> types objects?

Yes, you will have to cast them back.  Casting often translates to a
no-op or just a single instruction in the machine code: you just write
the low part of a 32-bit register back to memory instead of the whole
thing, and this automatically truncates the value to the narrow type.
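
A short sketch of the promotion and of the truncating cast, as I
understand the rules. `lowByte` is a name invented for this example:

```d
// Hypothetical helper: keeps only the low 8 bits of an int.
ubyte lowByte(int x)
{
    return cast(ubyte) x;   // the upper 24 bits are simply dropped
}

void main()
{
    short a = 1000, b = 2000;

    // Arithmetic on narrow integers is carried out in int.
    static assert(is(typeof(a + b) == int));

    // short c = a + b;             // rejected: int does not implicitly narrow
    short c = cast(short)(a + b);   // explicit cast when storing the result
    assert(c == 3000);

    assert(lowByte(0x12345) == 0x45);

    // An out-of-range value is truncated silently, not rejected.
    short wrapped = cast(short) 40_000;
    assert(wrapped == 40_000 - 65_536);
}
```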

The general advice is: perform computations with int or wider, then
truncate when writing back to storage. So generally you wouldn't cast
the value to short/byte until the very end, when you're about to store
the final result back to the array.  At that point you'd probably also
want to do a range check to catch any potential overflows.
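
One way this could look, as a sketch rather than the only approach: do
the math in int and let `std.conv.to` perform the range check at the
point of storage. `scaleInPlace` and its parameters are made up for
illustration, and the sketch assumes `s * num` itself fits in an int:

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

// Hypothetical example: scale 16-bit samples by num/den.
void scaleInPlace(short[] samples, int num, int den)
{
    foreach (ref s; samples)
    {
        int wide = s * num / den;   // s is promoted to int for the arithmetic
        s = wide.to!short;          // checked narrowing; throws if out of range
    }
}

void main()
{
    short[] data = [100, -200, 300];
    short[] expected = [150, -300, 450];
    scaleInPlace(data, 3, 2);
    assert(data == expected);

    // 30_000 * 2 does not fit in a short, so the final check fires.
    short[] loud = [30_000];
    assertThrown!ConvOverflowException(scaleInPlace(loud, 2, 1));
}
```

A plain `cast(short)` in place of `to!short` would skip the check and
silently truncate instead.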

> Some people say that these promoting and casting operations in summary
> may have an even slower overall effect than simply using int, so I'm
> kind of confused about the use cases of these data types... (I think
> that my misunderstanding comes from not knowing how things happen at a
> slightly lower level of abstractions, like which operations require
> memory allocation, which do not, etc. Maybe some resource
> recommendations on that?) Thanks!

I highly recommend taking an introductory course to assembly language,
or finding a book / online tutorial on the subject.  Understanding how
the machine actually works under the hood will help answer a lot of
these questions, even if you'll never actually write a single line of
assembly code.

But in a nutshell: integer data types do not allocate, unless you
explicitly ask for it (e.g. `int* p = new int;` -- but you almost never
want to do this). They are held in machine registers or stored on the
runtime stack, and always occupy a fixed size, so almost no memory
management is needed for them. (Which is also why they're preferred when
you don't need anything more fancy, because they're also super-fast.)
Promoting a narrow integer to int takes at most 1 machine instruction
(a sign- or zero-extension), and for unsigned values sometimes zero
instructions. Casting back to a narrow type is often a no-op (the
subsequent code just ignores the upper bits). The performance
difference is negligible unless you're doing expensive things like
range checking after every operation. Generally you don't need to:
it's usually sufficient to check the range at the end of a computation
rather than at every intermediate step, unless you have reason to
believe that an intermediate step is liable to overflow or wrap around.
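
As a rough illustration of the "no allocation" point, the sketch below
marks a function `@nogc`, which makes the compiler reject any GC
allocation inside it (a `new int` there would not compile). The function
name `sum` is just an example:

```d
// Hypothetical helper: sums narrow values in a wide accumulator.
long sum(const(byte)[] xs) @nogc nothrow pure @safe
{
    long total = 0;     // plain local: a register or a stack slot
    foreach (x; xs)
        total += x;     // x is widened on the fly; nothing is allocated
    return total;
}

void main()
{
    byte[4] buf = [100, 100, 100, 100];   // fixed-size array on the stack
    assert(sum(buf[]) == 400);            // the total would not fit in a byte
}
```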

T

-- 
People who are more than casually interested in computers should have at
least some idea of what the underlying hardware is like. Otherwise the
programs they write will be pretty weird. -- D. Knuth