Signed word lengths and indexes
Justin Spahr-Summers
Justin.SpahrSummers at gmail.com
Thu Jun 17 00:38:36 PDT 2010
On Thu, 17 Jun 2010 03:27:59 -0400, Kagamin <spam at here.lot> wrote:
>
> Justin Spahr-Summers Wrote:
>
> > This sounds more like an issue with file offsets being longs,
> > ironically. Using longs to represent zero-based locations in a file is
> > extremely unsafe. Such usages should really be restricted to short-range
> > offsets from the current file position, and fpos_t used for everything
> > else (which is presumably available in std.c.stdio).
>
> 1. Ironically the issue is not in file offset's signedness. You still hit the bug with ulong offset.
How so? Subtracting a size_t from a ulong offset will only cause
problems if the size_t value is larger than the offset. If that's the
case, then the issue remains even with a signed offset.
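
To make that concrete, here is a minimal sketch in C. The helper names rewind_unsigned and rewind_signed are hypothetical and exist only for illustration. Both subtract a size_t from an offset without checking, and both produce an unusable position when the count exceeds the offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: back up `count` bytes from an unsigned offset, unchecked. */
uint64_t rewind_unsigned(uint64_t offset, size_t count)
{
    /* Wraps modulo 2^64 when count > offset, yielding a huge position. */
    return offset - count;
}

/* Hypothetical helper: the same operation with a signed offset, unchecked. */
int64_t rewind_signed(int64_t offset, size_t count)
{
    /* Goes negative when count > offset; still not a valid file position. */
    return offset - (int64_t)count;
}
```

With an offset of 4 and a count of 10, the unsigned version wraps to a value near 2^64 and the signed version returns -6. Neither is a valid location in the file.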
> 2. Signed offset is two times safer than unsigned as you can detect
> underflow bug (and, maybe, overflow).
The solution with unsigned values is to make sure that they won't
underflow *before* performing the arithmetic - and that's really the
proper solution anyway.
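
A sketch of what checking before the arithmetic might look like in C. checked_rewind is a hypothetical name, not an existing library function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute offset - count only when it cannot underflow.
 * Returns false and leaves *result untouched if count is larger than offset. */
bool checked_rewind(uint64_t offset, size_t count, uint64_t *result)
{
    if (count > offset) {
        return false;           /* would underflow: report it, do not wrap */
    }
    *result = offset - count;   /* safe: count <= offset */
    return true;
}
```

The caller gets an explicit failure instead of a wrapped or negative value, so the offset type can stay unsigned.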
> With unsigned offset you get exception if the filesystem doesn't
> support sparse files, so the linux will keep silence.
I'm not sure what this means. Can you explain?
> 3. Signed offset is consistent/type-safe in the case of the seek function as it doesn't arbitrarily mutate between signed and unsigned.
My point was about signed values being used to represent zero-based
indices. Obviously there are applications for a signed offset *from the
current position*. It's seeking to a signed offset *from the start of
the file* that's unsafe.
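
As a rough illustration of that distinction using C stdio: a signed long offset is applied relative to the current position, while an absolute position is saved and restored through the opaque fpos_t. The function name seek_demo and the file contents are made up for this sketch.

```c
#include <stdio.h>

/* Returns 0 on success and stores two characters read back from a
 * temporary file; returns -1 if any stdio call fails. */
int seek_demo(int *after_relative, int *after_absolute)
{
    FILE *f = tmpfile();
    if (f == NULL) {
        return -1;
    }

    int status = -1;
    fpos_t mark;
    if (fputs("abc", f) != EOF
        && fgetpos(f, &mark) == 0           /* absolute position just after "abc" */
        && fputs("0123456789", f) != EOF
        && fseek(f, -4L, SEEK_CUR) == 0) {  /* signed offset from the current position */
        *after_relative = fgetc(f);
        if (fsetpos(f, &mark) == 0) {       /* absolute position, no signed arithmetic */
            *after_absolute = fgetc(f);
            status = 0;
        }
    }

    fclose(f);
    return status;
}
```

The relative seek lands four bytes before the end of the data, and fsetpos returns to the saved mark without the caller ever handling a signed absolute offset.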
> 4. Choosing unsigned for file offset is not dictated by safety, but by stupidity: "hey, I lose my bit!"
You referred to 32-bit systems, correct? I'm sure there are 32-bit
systems out there that need to be able to access files larger than two
gigabytes.
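
The arithmetic behind that limit, as a small sketch: a signed 32-bit offset tops out at 2^31 - 1 bytes (just under 2 GiB), while an unsigned one reaches 2^32 - 1 (just under 4 GiB). The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Can every byte position up to file_size be expressed in a signed 32-bit offset? */
bool fits_signed32(uint64_t file_size)
{
    return file_size <= (uint64_t)INT32_MAX;   /* 2147483647 */
}

/* The same question for an unsigned 32-bit offset. */
bool fits_unsigned32(uint64_t file_size)
{
    return file_size <= (uint64_t)UINT32_MAX;  /* 4294967295 */
}
```

A 3 GiB file fits the unsigned range but not the signed one. Past 4 GiB, neither 32-bit type is enough.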
> I AM an optimization zealot, but unsigned offsets are plain dead
> freaking stupid.
It's not an optimization. Unsigned values logically correspond to disk
and memory locations.