Treating the abusive unsigned syndrome
Denis Koroskin
2korden at gmail.com
Wed Nov 26 18:20:50 PST 2008
On 27.11.08 at 03:46, Sean Kelly wrote:
> Andrei Alexandrescu wrote:
>> Sean Kelly wrote:
>>> Don wrote:
>>>>
>>>> Although it would be nice to have a type which was range-limited,
>>>> 'uint' doesn't do it. Instead, it guarantees the number is between 0
>>>> and int.max*2+1 inclusive. Allowing mixed operations encourages
>>>> programmers to focus on the benefit of 'the lower bound is zero!' while
>>>> forgetting that there is an enormous downside ('I'm saying that this
>>>> could be larger than int.max!')
>>>
>>> This inspired me to think about where I use uint and I realized that I
>>> don't. I use size_t for size/length representations (largely because
>>> sizes can theoretically be >2GB on a 32-bit system), and ubyte for
>>> bit-level stuff, but that's it.
>> For the record, I use unsigned types wherever there's a non-negative
>> number involved (e.g. a count). So I'd be helped by better unsigned
>> operations.
>
> To be fair, I generally use unsigned numbers for values that are
> logically always positive. These just tend to be sizes and counts in my
> code.
>
>> I wonder how often these super-large arrays do occur on 32-bit systems.
>> I do have programs that try to allocate as large a contiguous matrix as
>> possible, but never sat down and tested whether a >2GB chunk was
>> allocated on the Linux cluster I work on. I'm quite annoyed by this
>> >2GB issue because it's a very practical and very rare issue in a weird
>> contrast with a very principled issue (modeling natural numbers).
>
> Yeah, I have no idea how common they are, though my guess would be that
> they are rather uncommon. As a library programmer, I simply must assume
> that they are in use, which is why I use size_t as a matter of course.
>
>
> Sean
If they can be more than 2GB, why can't they be more than 4GB? It is
dangerous to assume that they won't be, and that's why uint is dangerous.
You trade safety for one additional bit of information, and that is the
wrong trade. Soon enough we won't use uints, the same way we don't use
ushorts (I should have asked first whether anyone uses ushort these days,
but there is so little gain in using ushort as opposed to short or int
that I consider it impractical). The 64-bit era will give us 64-bit
pointers and 64-bit counters. Do you think you will prefer ulong over
long for one additional bit? You really shouldn't.
My proposal
Short summary:
- Disallow bitwise operations on both signed and unsigned types, but
allow arithmetic operations on them
- Discourage usage of unsigned types. Introduce bits8, bits16, bits32 and
bits64 as a replacement
- Disallow arithmetic operations on the bits* types, but allow bitwise
operations on them
- Disallow mixed-type operations (compare, add, sub, mul and div)
- Disallow implicit casts between all types
- Use int and long (or ranged types) for lengths and indices, with
runtime checks (a.length-- is always dangerous no matter what
compile-time checks you make)
- Add type constructors for int/uint/etc.: "auto x = int(int.max + 1);"
throws at run time (see the sketch below)
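
As a rough illustration of the last point only, a hypothetical checked
helper (the name and signature are mine, not part of the proposal) can
approximate the run-time-checked constructor in today's D by doing the
arithmetic in a wider type and narrowing through std.conv.to, which
throws ConvOverflowException when the value does not fit:

import std.conv : to; // to!T range-checks and throws ConvOverflowException

// Hypothetical helper: widen the arithmetic, then narrow with a run-time check.
T checked(T)(long value)
{
    return value.to!T;
}

void main()
{
    auto ok  = checked!int(int.max);      // fits, no exception
    auto bad = checked!int(int.max + 1L); // out of range, throws at run time
}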
The two most common uses of uints are:
0) Bitfields or masks, packed values and hexadecimal constants (more on
bitfields below)
1) Numbers that can't be negative (counters, sizes/lengths, etc.)
Bitfields
Bitfields are handy, and using an unsigned type over a signed one is
surely preferable. The most common operations on bitfields are bitwise
AND, OR, XOR and left/right shifts. You shouldn't subtract from or add to
them; it is an error in most cases. This is what the new bits8, bits16,
bits32 and bits64 types should be used for:
bits32 argbColor;
int alphaShift = 24; // any type here, actually

// shift
bits32 alphaMask = (0xFF << alphaShift); // 0xFF is of type bits8
auto value2 = value1 & mask; // all 3 are of type bits*

// you can only shift bits, and the result is in bits, too,
// i.e. the following is incorrect:
int i = -42;
int x = (i << 8); // an error:
// 1) can't shift a value of type int
// 2) can't assign a value of type bits32 to a variable of type int

// ubyte is still handy sometimes (a color channel belongs to the [0..255] range)
auto alpha = (argbColor & alphaMask) >> alphaShift; // result is in bits32
// use an explicit cast to convert it to the target data type:
ubyte alpha = cast(ubyte)((argbColor & alphaMask) >> alphaShift);
// Alternatively:
ubyte alpha = ubyte((argbColor & alphaMask) >> alphaShift);
The type constructor throws an error if the source value (which is of
type bits32 in this example) can't be stored in a ubyte. This might be a
replacement for the signed/unsigned methods.
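
For comparison, today's D already offers this kind of run-time-checked
narrowing through std.conv.to; a small sketch of mine, using a plain uint
in place of the proposed bits32:

import std.conv : to;

void main()
{
    uint argbColor  = 0xAABBCCDD;
    uint alphaMask  = 0xFF000000;
    int  alphaShift = 24;

    // to!ubyte range-checks at run time and throws ConvOverflowException
    // if the shifted value does not fit into a ubyte
    ubyte alpha = ((argbColor & alphaMask) >> alphaShift).to!ubyte;
}

The proposal goes further and applies checks to hexadecimal literals and
mixed signed/unsigned expressions as well: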
int i = 0xFFFFFFFF; // an error, can't convert a value of type bits32 to a variable of type int
int i = int.max + 1; // ok
int i = int(int.max + 1); // an exception is raised at runtime
int i = 0xABCD - 0xDCBA; // not allowed, add explicit casts:
auto u = cast(uint)0xABCD - cast(uint)0xDCBA; // result type is uint, no checks for overflow
auto i = cast(int)0xABCD - cast(int)0xDCBA; // result type is int, no checks for overflow
auto e = cast(uint)0xABCD - cast(int)0xDCBA; // an error, can't subtract int from uint

// type ctors in action:
auto i = int(cast(int)0xABCD - cast(int)0xDCBA); // result type is int, an exception on overflow
auto u = uint(cast(uint)0xABCD - cast(uint)0xDCBA); // same here for uint
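
Although the proposal asks for built-in bits* types, a rough library
approximation is possible in today's D. The Bits32 sketch below is mine
and purely illustrative: it wraps a uint and defines only bitwise and
shift operators, so arithmetic on it simply does not compile:

struct Bits32
{
    uint payload;

    // only bitwise operators are defined; +, -, * and friends won't compile
    Bits32 opBinary(string op)(Bits32 rhs)
        if (op == "&" || op == "|" || op == "^")
    {
        return Bits32(mixin("payload " ~ op ~ " rhs.payload"));
    }

    // shifting bits yields bits
    Bits32 opBinary(string op)(int amount)
        if (op == "<<" || op == ">>")
    {
        return Bits32(mixin("payload " ~ op ~ " amount"));
    }
}

void main()
{
    auto color = Bits32(0xAABBCCDD);
    auto mask  = Bits32(0xFF) << 24;
    auto alpha = (color & mask) >> 24; // still Bits32
    // auto bad = color + 1;           // error: no '+' defined for Bits32
}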
Non-negative values
Just use int/long. Or some ranged type ([0..short.max], [0..int.max],
[0..long.max]) could be used as well; a library type, perhaps. Let's call
it nshort/nint/nlong. It should have the same set of operations as
short/int/long, but make additional checks and throw on underflow and
overflow.
int x = 42;
nint nx = x; // ok
nx = -x; // throws
nx = int.max; // ok
++nx; // throws
nx = 0;
--nx; // throws
nx = 0;
nint ny = 42;
nx = ny; // no checking is done
int y = ny; // no checking is done, either
short s = ny; // error, cast needed
short s = cast(short)ny; // never throws
short s = short(ny); // might throw
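
A minimal library sketch of such an nint in today's D (mine, illustrative
only; it checks construction, assignment and increment/decrement rather
than the full set of operations):

struct nint
{
    private int value;

    this(long v) { opAssign(v); }

    // assignment from any integer does a run-time range check
    void opAssign(long v)
    {
        if (v < 0 || v > int.max)
            throw new Exception("nint out of range");
        value = cast(int)v;
    }

    // ++ and -- reuse the same check
    void opUnary(string op)() if (op == "++" || op == "--")
    {
        opAssign(op == "++" ? value + 1L : value - 1L);
    }

    // reading back as int needs no check
    int get() const { return value; }
    alias get this;
}

void main()
{
    nint nx = 42;   // ok
    nx = int.max;   // ok
    // ++nx;        // would throw: int.max + 1 is out of range
    nx = 0;
    // --nx;        // would throw: -1 is negative
    int y = nx;     // fine, no check needed
}

With this sketch, "short s = ny;" is rejected at compile time and
cast(short)ny never throws, matching the example above; the checked
short(ny) constructor itself is what the proposal would add to the
language.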