The Trouble with MonoTimeImpl (including at least one bug)
Forest
forest at example.com
Tue Apr 9 19:22:47 UTC 2024
On Wednesday, 3 April 2024 at 00:09:25 UTC, Jonathan M Davis
wrote:
> Well, it would appear that the core problem here is that you're
> trying to use MonoTime in a way that it was not designed for, or
> even thought of, when it was written.
I believe you. I think what I've been trying to do is reasonable,
though, given that the docs and API are unclear and the source
code calls a platform API that suggests I've been doing it right.
Maybe this conversation can lead to improvements for future users.
> In principle, ticks is supposed to be a number from the system
> clock representing the number of ticks of the system clock,
> with the exact number and its meaning being system-dependent.
> How often the number that you can get from the system is
> updated was not considered at all relevant to the design, and
> it never occurred to me to call the number anything other than
> ticks, because in principle, it represents the current tick of
> the system clock when the time was queried. Either way, it's
> purely a monotonic timestamp. It's not intended to tell you
> anything about how often it's updated by the system.
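Just so we're talking about the same thing, this is the
take-two-timestamps-and-subtract pattern I understand MonoTime was
designed for (a minimal sketch using the documented API):

    import core.time : Duration, MonoTime;
    import std.stdio : writeln;

    void main()
    {
        MonoTime before = MonoTime.currTime;
        // ... the work being timed ...
        MonoTime after = MonoTime.currTime;
        Duration elapsed = after - before; // tick units don't matter here
        writeln("elapsed: ", elapsed);
    }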
The use of "ticks" has been throwing me. I am familiar with at
least two common senses of the word:
1. A clock's basic step forward, as happens when a mechanical
clock makes a tick sound. It might take a fraction of a second,
or a whole second, or even multiple seconds. This determines
clock resolution.
2. One unit in a timestamp, which determines timestamp
resolution. On some clocks, this is the same as the first sense,
but not on others.
From what you wrote above, I *think* you've generally been using
"ticks" in the second sense. Is that right? [Spoiler: Yes, as
stated toward the end of your response.]
If so, and if the API's use of "ticks" is intended to be that as
well, then I don't see why ticksPerSecond() calls clock_getres(),
which measures "ticks" in the first sense of the word. (That is my
reading of the glibc man page, and it is confirmed by the test
program I wrote for issue #24446.)
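For reference, the probe I have in mind is along these lines (a
trimmed-down sketch, not the exact program attached to the issue):

    import std.stdio : writefln;
    import core.sys.posix.time : CLOCK_MONOTONIC, clock_getres,
        clock_gettime, timespec;

    void main()
    {
        timespec res, now;

        // Sense 1: the clock's step size (its resolution).
        clock_getres(CLOCK_MONOTONIC, &res);

        // Sense 2: a timestamp whose units are nanoseconds,
        // however coarse the steps reported above may be.
        clock_gettime(CLOCK_MONOTONIC, &now);

        writefln("resolution: %d s %d ns", res.tv_sec, res.tv_nsec);
        writefln("timestamp:  %d s %d ns", now.tv_sec, now.tv_nsec);
    }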
> Unfortunately, other POSIX systems don't have functions that
> work that way. [...] So, instead of getting the ticks of the
> system clock, you get the duration in nanoseconds. So, to make
> that fit the model of ticks and ticks-per-second, we have to
> convert that to ticks. For the purposes of MonoTime and how it
> was designed to be used, we could have just always made it
> nanoseconds and assumed nanosecond resolution for the system
> clock, but instead, it attempts to get the actual resolution of
> the system's monotonic clock and convert the nanoseconds back
> to that.
Ah, so it turns out MonoTime is trying to represent "ticks" in
the first sense (clock steps / clock resolution). That explains
the use of clock_getres(), but it's another source of confusion,
both because the API doesn't include anything to make that
conversion useful, and because ticksPerSecond() has that
hard-coded value that sometimes renders the conversion incorrect
(issue #24446).
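To spell out the conversion I mean (the names and arithmetic here
are mine, simplified from my reading of the code; the real
implementation also has to avoid overflow):

    // Nanoseconds from clock_gettime() are scaled to "ticks" using
    // ticksPerSecond, and scaled back on the way out. If
    // ticksPerSecond is wrong (e.g. the hard-coded fallback), the
    // round trip snaps timestamps to a grid that isn't the clock's.
    long nsecsToClockTicks(long nsecs, long ticksPerSecond)
    {
        return nsecs * ticksPerSecond / 1_000_000_000L;
    }

    long clockTicksToNSecs(long ticks, long ticksPerSecond)
    {
        return ticks * 1_000_000_000L / ticksPerSecond;
    }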
> As for the oddities with how ticksPerSecond is set when a weird
> value is detected, I did some digging, and it comes from two
> separate issues.
>
> The first - https://issues.dlang.org/show_bug.cgi?id=16797 -
> has to do with how apparently on some Linux systems (CentOS
> 6.4/7 were mentioned as specific cases), clock_getres will
> report 0 for some reason, which was leading to a division by
> zero.
Curious. The clock flagged in that bug report is
CLOCK_MONOTONIC_RAW, which I have never used. I wonder: could
clock_getres() have been reporting 0 because that clock's
resolution was finer than the result type can represent? Or could
the platform have been determining the result by sampling the
clock at two points in time so close together that they landed
within the same clock step, thereby yielding a difference of 0?
In either case, perhaps the platform code has been updated since
that 2013 CentOS release; it reported a resolution of 1 ns when I
tried it on my Debian system today.
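For anyone who wants to check their own system, a probe along
these lines works; since CLOCK_MONOTONIC_RAW is Linux-specific, it
uses the core.sys.linux.time binding:

    import std.stdio : writefln;
    import core.sys.posix.time : clock_getres, timespec;
    import core.sys.linux.time : CLOCK_MONOTONIC_RAW;

    void main()
    {
        timespec res;
        // Reports 1 ns on my Debian system; the CentOS systems in
        // issue 16797 apparently reported 0, hence the division by
        // zero.
        clock_getres(CLOCK_MONOTONIC_RAW, &res);
        writefln("CLOCK_MONOTONIC_RAW resolution: %d s %d ns",
            res.tv_sec, res.tv_nsec);
    }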
> As for the other case, it looks like that was an issue that
> predates MonoTime and was originally fixed in TickDuration. The
> original PR was
>
> https://github.com/dlang/druntime/pull/88
>
> The developer who created that PR reported that with
> clock_getres, some Linux kernels were giving a bogus value that
> was close to 1 millisecond when he had determined that (on his
> system at least) the actual resolution was 1 nanosecond.
I disagree with that developer's reasoning. Why should our
standard library override a value reported by the system, even if
the value was surprising? If the system was reporting 1
millisecond for a good reason, I would want my code to use that
value. If it was a system bug, I would want it confirmed by the
system maintainers before meddling with it, and even then, I
would want any workaround to be in my application, not in library
code where the fake value would persist long after the system bug
was fixed.
> Honestly, at this point, I'm inclined to just make it so that
> ticksPerSecond is always nanoseconds on POSIX systems other
> than Mac OS X. That way, we're not doing some math that is
> pointless if all we're trying to do is get monotonic timestamps
> and subtract them. It should also improve performance slightly
> if we're not doing that math.
I think that makes sense. The POSIX clock_gettime() timestamps are
defined in nanoseconds, after all, so treating them as such would
make the code both correct and easier to follow.
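Concretely, I read the proposal as roughly this on those systems
(an illustrative sketch, not a patch):

    import core.sys.posix.time : CLOCK_MONOTONIC, clock_gettime,
        timespec;

    // Ticks simply are nanoseconds, so no clock_getres() call and
    // no rescaling of what clock_gettime() returns.
    enum long ticksPerSecond = 1_000_000_000;

    long currTicks()
    {
        timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1_000_000_000L + ts.tv_nsec;
    }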
If we eventually discover APIs or docs on the other platforms that
report clock resolution (in either timestamp units or fractions of
a second), as clock_getres() does on POSIX, then that resolution
could be exposed through a separate MonoTime method.
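Something like this, purely hypothetical (the name, signature, and
placement are made up for illustration):

    import core.time : Duration, dur;
    import core.sys.posix.time : CLOCK_MONOTONIC, clock_getres,
        timespec;

    // Hypothetical helper: expose the system clock's step size
    // (the first sense of "ticks") without conflating it with
    // timestamp units.
    Duration clockResolution()
    {
        timespec res;
        clock_getres(CLOCK_MONOTONIC, &res);
        return dur!"seconds"(res.tv_sec) + dur!"nsecs"(res.tv_nsec);
    }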
> If we can then add functions of some kind that give you the
> additional information that you're looking for, then we can
> look at adding them.
Yes, we're thinking along the same lines. :)
Thanks for the thoughtful response.
Forest