`@safe` by default. What about `@pure` and `immutable` by default?

Jonathan M Davis newsgroup.d at jmdavisprog.com
Thu Apr 18 08:53:37 UTC 2019


On Wednesday, April 17, 2019 5:39:02 PM MDT Meta via Digitalmars-d wrote:
> On Tuesday, 16 April 2019 at 21:33:54 UTC, Jonathan M Davis wrote:
> > On Monday, April 15, 2019 9:59:38 PM MDT Mike Franklin via
> >
> > Digitalmars-d wrote:
> >> I think I may have found a simple migration path to @safe by
> >> default.  I'm still thinking it through, but if I can justify
> >> it, I will write a DIP.
> >>
> >> `@safe` by default is a no-brainer in my opinion, but `pure`
> >> and `immutable` by default are less obvious.  With the
> >> aforementioned potential DIP, I have an opportunity to correct
> >> purity and mutability defaults as well...but is that something
> >> we want to do?
> >>
> >> Can anyone save me some trouble and articulate why it would be
> >> bad to have `pure` and/or `immutable` by default?
> >
> > if you have a bunch of code, and it turns out that you need to
> > do something in it that isn't pure, you'd be screwed unless you
> > go and mark a ton of code with impure (or whatever the opposite
> > of pure would be). It's not like you can just opt-out in the
> > middle like you can with @safe by using @trusted to use @system
> > code.
>
> That's a good point that I almost always forget when we talk
> about making these attributes the default. To follow the
> safe/trusted/system model we'd need something like
> pure/almostPure/impure.

pure is binary in nature, and backdoors to it are seriously problematic. It
would be more accurate at this point to call pure @noglobal, because the key
thing is that a function is only able to access data through its function
arguments. It can't access any kind of globals except through those
arguments, which means that basic stuff like I/O and caching tends not to
work with it. The pure functional benefits that we get in D then stem from
the assumptions that the compiler is able to make based on what it can
determine from the function arguments and the knowledge that all of the
function's data comes from those arguments. In the extreme case, that can
allow for function call elision (though that's pretty rare), but more
commonly, it allows for stuff like implicitly converting a function's return
value to immutable when the compiler can determine that the memory for it
has to be unique (which greatly simplifies constructing immutable objects).

At a macro level, all that pure/@noglobal really does is guarantee that when
you look at a function signature, the function isn't grabbing data from
anywhere but its arguments. Even that is frequently not very informative,
because complex objects can do stuff like access global variables via stored
pointers - something which reduces how informative the attribute is while
not making it any easier to access globals when you actually need to. So,
I'd honestly argue that having pure all over your entire code base would be
far more detrimental than beneficial. It just doesn't provide great benefits
at the macro level - mostly just within specific pieces of code that can
actually take advantage of what the compiler can do based on pure/@noglobal.
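To illustrate the unique-return-value case, here's a minimal sketch (the
function name is made up for the example):

```d
// A pure function that returns freshly allocated memory. The compiler can
// see that the result could only have been built from the arguments and
// new allocations, so it knows the memory is unique.
int[] makeSquares(int n) pure
{
    auto arr = new int[](n);
    foreach (i, ref e; arr)
        e = cast(int)(i * i);
    return arr;
}

void main()
{
    // No cast needed: the unique return value of a pure function
    // implicitly converts to immutable.
    immutable int[] squares = makeSquares(4);
    assert(squares == [0, 1, 4, 9]);
}
```

Without pure on makeSquares, that last declaration would require an explicit
(and unverified) cast to immutable.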

But ultimately, what the compiler needs to know when it does anything with
pure/@noglobal is that there is no way that the function can access anything
except via its arguments. All of its assumptions and optimizations stem from
that. So, having _any_ kind of backdoor for that like you would with an
@trusted equivalent destroys the guarantee - just like having a backdoor for
const that allows for mutating a const object would destroy the compiler's
ability to know that const data hasn't changed and thus would make const
pretty meaningless as far as compiler guarantees go.

With @safe, we could conceivably treat main as @safe and then _require_ that
any code that involves @system then be verified by the programmer as @safe
and marked with @trusted (it would be really annoying for anyone wanting to
avoid caring about @safe, but there's no technical reason why it's a problem
- just the huge risk that programmers will start slapping @trusted all over
the place when they haven't actually verified the code but want the compiler
to shut up so that they can get their work done). That isn't the same for
pure at all. If main were marked pure, then it mustn't access global
variables anywhere, and conceptually, running main multiple times with the
same data would then always produce exactly the same result, because main
can't pass out any other results via its arguments and can't access anything
that wasn't either provided to it or created within it (so, no I/O).

There _are_ rare cases where a piece of code that isn't technically
@noglobal is actually able to follow the compiler's guarantees (e.g.
std.datetime's LocalTime() is conceptually pure because it always returns
exactly the same value every time it's called, but it has to create that
value the first time that it's called, because it was determined
unacceptable to have static constructors in Phobos - and that means casting
to pure). But such functions are extremely rare and have to be done very
carefully. And then there's the mess that's pureMalloc. There have been
_tons_ of arguments over whether it's safe to have it, because there are
real risks that calls to it will be elided, and even people who are very
knowledgeable about D have had a hard time agreeing on what's going on there
and what the compiler will or won't do. And all of that has to do with
ensuring that something can be safely treated as pure when it isn't
technically pure.
It's _not_ something that your average programmer should even be considering
doing. So, stuff like assumePure or an @trusted version of pure would be
incredibly risky.
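For what it's worth, an assumePure-style escape hatch can be cobbled
together today with std.traits - which is exactly why it's so risky. This is
only a sketch of the idiom, not a recommendation:

```d
import std.traits : FunctionAttribute, SetFunctionAttributes,
    functionAttributes, functionLinkage, isDelegate, isFunctionPointer;

// Forcibly adds pure to a function pointer's or delegate's type via a
// cast. The compiler then applies all of its pure-based assumptions
// (call elision, implicit conversion to immutable, etc.) to calls made
// through the result - and if fn actually touches globals, every one of
// those assumptions is silently wrong.
auto assumePure(T)(T fn)
if (isFunctionPointer!T || isDelegate!T)
{
    enum attrs = functionAttributes!T | FunctionAttribute.pure_;
    return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) fn;
}

int counter; // module-level state - the very thing pure forbids

void bumpCounter() { ++counter; }

void caller() pure
{
    // Compiles, but the pure guarantee is now a lie: calls through f may
    // be cached or elided even though each call mutates counter.
    auto f = assumePure(&bumpCounter);
    f();
}
```

That this compiles at all is the problem: once such a cast is in play,
pure's guarantees rest entirely on the programmer's judgment.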

> Weak purity also might be able to provide some partial respite:
>
> int pure1(int n, ref IOWrapper io) pure
> {
>      return pure2(n.to!string(), io); //The chain continues
> }
>
> int pure2(string s, ref IOWrapper io) pure
> {
>      //Can't do this because writeln is impure
>      //writeln("The value of s is ", s);
>
>      io.writeln("The value of s is ", s);
>
>      int result;
>      //Do some other work
>
>      return result;
> }
>
> struct IOWrapper
> {
>      string[] writeQueue;
>
>      void writeln(Args...)(Args args) pure
>      {
>          foreach (arg; args)
>              writeQueue ~= arg.to!string();
>          writeQueue ~= "\n";
>      }
>
>      void writeAll()
>      {
>          foreach (msg; writeQueue)
>              std.stdio.writeln(msg);
>      }
> }
>
> And if you don't like that, you can instead accept IOWrappers by
> value and return them along with the result of your calculation
> in a Tuple. It's not pretty, but it works.

You're basically getting into monads, which are a complex and difficult
topic. It's how languages like Haskell are able to be pure while still
having I/O. It's technically possible, but it's a royal pain, and most
people end up having a very hard time understanding it. It's also not how
most code is written unless programmers are forced to write that way. We do
have something similar with output ranges, which allow specific pieces of
code to be written in a way that could involve I/O without explicitly
involving I/O, but such code is still frequently not actually pure, because
the output range will often output the data as it goes along rather than
building up a string to output at the end. Templates deal with that, though,
so the code using the output range is usually able to be pure if the output
range itself is pure.
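As a concrete illustration of that last point, here's a sketch (the function
and names are made up for the example):

```d
import std.array : appender;
import std.range.primitives : put;

// Written against an output range: the function never names a global or
// does I/O itself. Because it's a template, pure is inferred whenever the
// sink's put is pure, and not otherwise - exactly the situation described
// above.
void formatGreeting(Sink)(ref Sink sink, string name)
{
    put(sink, "hello, ");
    put(sink, name);
}

void main()
{
    // With an in-memory sink, the instantiation is pure; with a sink that
    // writes to stdout as it goes, it wouldn't be.
    auto buf = appender!string();
    formatGreeting(buf, "world");
    assert(buf.data == "hello, world");
}
```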

D's pure is a great tool in sections of code that are specifically written
with it in mind, and we do have some language features and idioms that make
it easier for code to be pure that might not be otherwise, but trying to
make entire programs pure really doesn't make sense - not for what is
supposed to be a multiparadigm language. It's just way too restrictive. So,
improving the tools for it and doing a better job of making sure that stuff
is pure when it can be would be useful, but IMHO, trying to make it the
default would be far too dogmatic and a huge mistake.

- Jonathan M Davis
