Is RDTSC pure?
Jonathan M Davis
newsgroup.d at jmdavisprog.com
Sun Sep 14 02:50:42 UTC 2025
On Friday, September 12, 2025 3:48:49 AM Mountain Daylight Time IchorDev via Digitalmars-d-learn wrote:
> Recently I wanted to write a pure function that returns an
> unpredictable number, so I decided to use RDTSC (and any
> equivalent instruction for other CPU architectures) to do this,
> since the compiler allows RDTSC to be marked as `pure`.
> However, in the end I discarded this idea because I figured that
> a `pure` function should never return a different value with the
> same input; and doing so would surely break any applicable
> memoisation. Inline assembly isn't checked by the compiler, so I
> was essentially doing the same thing as misusing `@trusted`…
>
> Or so I thought. Today I remembered that `pureMalloc` exists,
> which surely doesn't follow these rules and would definitely not
> work when memoised. So how come it's still allowed to be `pure`
> just by resetting ERRNO? It can return a different value with the
> same input, so does that mean that using RDTSC is also `pure`?
pureMalloc is one of those functions which was probably a mistake precisely
because it's so easy to not take into account an assumption that the
compiler is going to make based on pure. As it is, we've had bugs (and still
have bugs) where some new behavior was added based on pure where it didn't
take into account other assumptions already made by the compiler, resulting
in the compiler doing the wrong thing in some cases (e.g. implicitly
converting a return type to immutable, because the compiler incorrectly
decided that it was not possible for the return value to have come from one
of the function's arguments, thereby resulting in immutable values being
mutated by mutable references to the data).
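To make that parenthetical concrete, here is a minimal sketch of the feature when it works as intended. The names `makeSquares` and `passThrough` are made up for illustration; this is not the code from any of the bug reports.

```d
// Hypothetical example: the only parameter has no indirections, so the
// returned array cannot alias anything the caller already holds.
int[] makeSquares(size_t n) pure
{
    auto result = new int[](n);
    foreach (i, ref e; result)
        e = cast(int)(i * i);
    return result;
}

// Here the return value can come straight from the argument, so the
// compiler must not treat it as unique.
int[] passThrough(int[] arr) pure
{
    return arr;
}

void main()
{
    // Accepted: the result is known to be unique, so it converts implicitly.
    immutable int[] squares = makeSquares(4);
    assert(squares == [0, 1, 4, 9]);

    // Rejected: otherwise `source` would be a mutable reference to
    // immutable data.
    int[] source = [1, 2, 3];
    static assert(!__traits(compiles, { immutable int[] bad = passThrough(source); }));
}
```

The bugs in question are cases where the compiler wrongly put a function like `passThrough` in the first category.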
Honestly, I would strongly argue that unless you absolutely _need_ something
to be pure, you should not mess with pure - _especially_ if it involves any
situation where you have to decide whether an extern(C) function can be
treated as pure instead of the compiler figuring it out. There are just too
many factors to take into account with regard to what the compiler will or
won't do in order to correctly determine whether it's going to work to treat
the function as pure.
And even when you're completely relying on the compiler to determine pure,
it's too easy for there to be subtle bugs. The main reason that there aren't
more of them is likely that the compiler rarely actually does any
optimizations based on pure - which also means that you rarely actually get
any benefit from pure anyway. It's really only of value when you absolutely
must be certain that no mutable, global state is being accessed (e.g.
because you're doing something threading-related where it would matter). In
practice, such concerns are really a non-issue for the vast majority of
code, and using pure just limits the ability to refactor the code later
(especially if it's part of a public API) - and of course, if you're adding
pure extern(C) functions into the mix, then that's adding an additional
layer of risk.
Either way, aside from issues related to memory, if you have a function
which isn't guaranteed to return exactly the same value every single time
that it's called with the same arguments, you shouldn't mark it as pure.
Obviously, for extern(D) functions, that's supposed to be caught for you,
but with extern(C), you shouldn't be using pure if you can't guarantee the
same result every time that the same arguments are passed.
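A rough sketch of the difference, assuming a module-level variable `counter` invented for this example. The `rand` declaration is deliberately wrong and is only there to show that the compiler takes the attribute on faith when there is no body to check.

```d
int counter; // mutable (thread-local) global state

// extern(D): the body is checked, so touching mutable global state
// inside a pure function literal does not compile.
static assert(!__traits(compiles, () pure { return ++counter; }));

// extern(C): only a declaration is visible, so nothing can be verified.
// Deliberately incorrect annotation of C's rand, for illustration only.
extern (C) int rand() pure nothrow @nogc;

// Pure D code is now allowed to call it, even though the result
// differs from call to call.
static assert(__traits(compiles, () pure => rand()));

void main() {}
```

Nothing here calls `rand` at run time; the point is only that the second `static assert` passes.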
Memory gets weird on top of that because of the desire to use the GC with
pure functions and the argument that two objects with the same value are the
same even though they're in different places in memory (so whether they're
really the same or not depends on what you're looking to guarantee). But
even when a pure function returns a newly allocated object, every time that
that function is called with the same arguments, the resulting object needs
to have the same value as any of the other calls. So, while the objects may not
be the same objects in memory, they need to have the same value.
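As a small illustration of "same value, different memory" (the function `makeRange` is hypothetical):

```d
// Hypothetical pure function that returns freshly allocated GC memory.
int[] makeRange(int n) pure
{
    auto r = new int[](n);
    foreach (i, ref e; r)
        e = cast(int) i;
    return r;
}

void main()
{
    auto a = makeRange(3);
    auto b = makeRange(3);

    assert(a == b);  // equal by value: both are [0, 1, 2]
    assert(a !is b); // but they are two separate allocations
}
```

Whether that counts as "the same result" depends on whether you compare with `==` or with `is`, which is exactly the ambiguity described above.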
pureMalloc was added because of the desire to do the same with malloc-ed
memory, but IMHO, it was a very questionable choice, and as it is, there was
a ton of arguing over it before it got merged precisely because of how
tricky it is to be absolutely sure that it's going to do the right thing.
Personally, I don't think that it should have been merged, and it would not
surprise me in the least if the compiler did the wrong thing with pureMalloc
some of the time.
However, if you have a function which is going to return an unpredictable
number, then that's fundamentally not pure, because that means that it's not
going to give the same result every time it's given the same arguments.
Newly allocated memory gets a pass based on the argument that the result
pointed to by the memory has the same value even if the pointer or reference
differs, so in principle, it's the same value. However, that logic does not
at all apply to a function that returns an unpredictable number.
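To tie this back to the memoisation concern in the original question, here is a minimal sketch using `std.functional.memoize`. The function `ticket` is a made-up stand-in for an RDTSC-style source whose result changes between calls; it is not marked pure here, so the compiler has nothing to complain about, but the cache still goes stale.

```d
import std.functional : memoize;

// Same argument, same result: caching cannot change behaviour.
int square(int x) pure
{
    return x * x;
}

int calls; // hidden state, standing in for a CPU timestamp counter

// Hypothetical unpredictable function: the result depends on `calls`.
int ticket(int seed)
{
    return seed + calls++;
}

void main()
{
    assert(memoize!square(7) == square(7));

    int first = memoize!ticket(0);      // really calls ticket, caches the result
    assert(ticket(0) != first);         // a direct call has already moved on
    assert(memoize!ticket(0) == first); // the cached call returns the old value
}
```

If `ticket` were forced to be pure (via inline assembly or an extern(C) declaration), any optimization that reuses an earlier result would be entitled to behave like the memoised call.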
- Jonathan M Davis