byte and short data types use cases

Sat Jun 10 21:58:12 UTC 2023

On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
> On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:
>> On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:
>>
>> If you have four ubyte variables in a struct and then
>> an array of them, then you are getting optimal memory usage.
>
> Is this some kind of property? Where can I read more about this?
>
> So you can optimize memory usage by using arrays of things 
> smaller than `int` if these are enough for your purposes, but 
> what about using these instead of single variables, for example 
> as an iterator in a loop, if range of such a data type is 
> enough for me? Is there any advantages on doing that?

A couple of other important use cases came to me. The first one 
is Unicode, which has three main encodings: UTF-8, a stream of 
bytes in which each character can take one to four bytes; UTF-16, 
where a character is one or, rarely, two 16-bit words; and 
UTF-32, a stream of 32-bit words, one per character. The 
simplicity of the latter is a big win for processing speed, but 
UTF-32 takes up almost four times as much memory as UTF-8 for 
western European languages like English or French. The 
four-to-one ratio means that the processor has to pull in four 
times as much memory, so that’s a slowdown. On the other hand, it 
is processing the same number of characters whichever way you 
look at it, and with UTF-8 the CPU has to parse more bytes than 
characters unless the text is entirely ASCII.
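
To make the size difference concrete, here is a minimal sketch in D 
that stores the same word in all three encodings. The `byteSize` 
helper is a name made up for this illustration, not a library 
function.

```d
import std.range : walkLength;
import std.stdio : writefln;

// Hypothetical helper: bytes occupied by the code units of a string.
size_t byteSize(S)(S text)
{
    return text.length * typeof(text[0]).sizeof;
}

void main()
{
    // The same four-character word in each encoding; \u00E9 is 'é'.
    string  utf8  = "caf\u00E9";   // 8-bit char code units
    wstring utf16 = "caf\u00E9"w;  // 16-bit wchar code units
    dstring utf32 = "caf\u00E9"d;  // 32-bit dchar code units

    // .length counts code units, not characters.
    writefln("UTF-8:  %s code units, %s bytes", utf8.length, byteSize(utf8));    // 5, 5
    writefln("UTF-16: %s code units, %s bytes", utf16.length, byteSize(utf16));  // 4, 8
    writefln("UTF-32: %s code units, %s bytes", utf32.length, byteSize(utf32));  // 4, 16

    // Counting characters in UTF-8 means decoding the bytes as you go.
    writefln("characters: %s", utf8.walkLength);  // 4
}
```

The one accented letter is why the ratio here is 16 to 5 rather 
than a full four to one; for pure ASCII text it would be exactly 
four to one.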

The second use case is SIMD. Intel and AMD x86 machines have 
vector arithmetic units that are 16, 32 or 64 bytes wide, 
depending on how recent the model is. Take, for example, an Intel 
Haswell CPU (2013), which has 32-byte wide units. If you choose 
narrower data types you can fit more values into the vector unit, 
and fitting in integers or floating point numbers of half the 
width means that you can process twice as many in one 
instruction. On our Haswell that means four doubles or four 
64-bit quad words, eight 32-bit floats or uint32_ts, sixteen 
uint16_ts, or thirty-two bytes. So here economy of width can 
translate directly into double the speed.
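
The arithmetic behind those counts can be sketched in D as below. 
This only computes how many values fit in one register; it does 
not issue any SIMD instructions. The names `vectorBytes` and 
`lanes` are made up for the illustration, and the 32-byte width 
is the Haswell assumption from above.

```d
import std.meta : AliasSeq;
import std.stdio : writefln;

// Assumed register width of a Haswell-class vector unit, in bytes.
enum size_t vectorBytes = 32;

// Hypothetical helper: how many values of type T fit in one register.
enum size_t lanes(T) = vectorBytes / T.sizeof;

// Halving the element width doubles the number of lanes.
static assert(lanes!double == 4);
static assert(lanes!ulong  == 4);
static assert(lanes!float  == 8);
static assert(lanes!uint   == 8);
static assert(lanes!ushort == 16);
static assert(lanes!ubyte  == 32);

void main()
{
    // Print the lane count for each element type.
    foreach (T; AliasSeq!(double, ulong, float, uint, ushort, ubyte))
        writefln("%-6s %2s per %s-byte register", T.stringof, lanes!T, vectorBytes);
}
```

Whether a given loop actually gets that speedup depends on the 
compiler vectorizing it and on memory bandwidth, so treat the 
lane count as an upper bound.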