D enters Tiobe top 20

Ola Fosheim Grøstad ola.fosheim.grostad at gmail.com
Sat Nov 9 17:16:40 UTC 2019


On Saturday, 9 November 2019 at 14:49:32 UTC, Doc Andrew wrote:
> As a stop-gap we might be able to use an
> @assume UDA, but I do think "assume" could be added as a 
> reserved keyword in the near future in anticipation...

True, I meant to refer to the concept, not the syntax. Maybe the 
syntax for assume should be ugly enough to really stand out so 
that it is caught in reviews, e.g. "@ASSUME" or "@__assume"... 
Dunno. That is a detail, though.


> I like the idea of having a compiler flag to treat assumes as 
> asserts, maybe even in release mode?

It could at least be useful in cases where you do exhaustive unit 
testing (testing all input combinations); then the assumes would 
be safe.
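As a minimal library-level sketch of that flag, something like the 
following could work today (the function name and version 
identifier are invented for illustration):

    // Checked like an assert when built with -version=CheckAssumes,
    // a no-op otherwise. A future compiler could instead forward
    // the condition to the optimizer in the no-op case.
    void __assume(bool cond, string msg = "assumption violated")
    {
        version (CheckAssumes)
            assert(cond, msg);
    }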


>> This allows you to get started working on the optimizer and 
>> figure out how to make good use of the extra facts.
>
> That's probably a much longer-term goal, but just being able to 
> move normal error-checking into contracts and depending on them 
> in release code is a big win IMO.

Well, not necessarily long-term: some C++ compilers already 
support something like it. The problem is that you probably don't 
want to introduce too many auxiliary facts, as that would just 
slow down compilation with no gain (and perhaps even confuse a 
naive optimizer). Anyway, you should be able to add facts to LLVM 
by compiling a boolean expression and then assuming it to hold.
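As a rough illustration of that idea, you can already encode a 
fact through unreachability; whether a given backend actually 
exploits it is an assumption here:

    // Sketch: injecting a fact for the optimizer. assert(0) stays
    // in -release builds (it lowers to a halt), so the optimizer
    // may conclude that `cond` holds on the fall-through path.
    // LDC users could check ldc.intrinsics for a direct llvm.assume
    // binding; its availability is an assumption, not a given.
    pragma(inline, true)
    void assumeFact(bool cond)
    {
        if (!cond)
            assert(0);
    }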

But maybe the language syntax should allow you to select which 
propositions you want to propagate to the optimizer. Not sure if 
LLVM can express quantifiers though (forall/exists).


> This might be more difficult to enforce without some tight 
> compiler integration. An attribute like @ensured or @certified 
> could be the "gold standard" of safe D code, which would 
> require a proof of no run-time errors and all contracts. 
> Without the proof, the compiler could throw an error for 
> functions with this attribute.

Well, that might be an issue: if the ensure-statement is meant to 
help optimization, then it could simply be omitted. If it is meant 
to help correctness, then it could be translated into an assert 
whenever it isn't verified.

But I guess one will have to distinguish the two cases somehow, 
which is basically why I think trying to improve basic libraries 
with assume-statements and manual proofs is the best starting 
point. Then you get an idea of how it can affect optimization.
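For instance, a small library routine with D's expression-based 
contracts gives a prover something concrete to discharge, after 
which -release could keep the guarantee without the runtime 
checks (the function itself is just an invented example):

    size_t indexOfMax(const(int)[] a)
    in (a.length > 0)
    out (r; r < a.length)
    {
        size_t best = 0;
        foreach (i; 1 .. a.length)
            if (a[i] > a[best])
                best = i;
        return best;
    }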

You probably also want to bake trust-building metadata into a 
certificate, such as which verification method was used, who 
performed it, and where, so that developers can specify 
constraints like "must be verified by service X using verifier Y 
or Z". Or open up for hand-verified code by developer Q or a 
specific organization (could be a URL to a public key).
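Something along these lines, where every field name is invented 
for illustration:

    // Hypothetical certificate record bound to a source hash:
    struct ProofCertificate
    {
        string sourceHash;  // cryptographic hash of the source code
        string method;      // e.g. "SMT", "model checking", "by hand"
        string verifier;    // tool or person, e.g. "Z3" or "Q"
        string issuer;      // URL to the organization's public key
        ubyte[] signature;  // signature over the fields above
    }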


> We'd probably need a DIP to hash this out.

I will certainly read and comment on it, if you want to write 
one. You could start by spec'ing out a list of requirements.


> This might tie in with some of the work going on with 
> reproducible builds, but I love the idea of certs for known 
> good object files.

I actually meant source code only. You distribute the source 
code, but with certified "requires"/"assume"/"ensures" statements 
that are bound to the source code using cryptographic hashing and 
an online index (so that the certificate can be checked or 
revoked).

I guess one issue is that it has to be a standalone library (e.g. 
only depend on a specific version of the standard library) for 
this to work. Otherwise you would have to tie it to a specific 
certified version of all the libraries you depend on.

But... I don't actually think that is a big deal; you would 
typically pre-verify basic libraries. You could use a simple 
concatenation for the hash, with sorted filenames and some 
sentinel character sequence separating the files, e.g.:  
filename1, sep, file1, sep, filename2, sep, file2, sep, sep.
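A sketch of that concatenation using std.digest (the sentinel 
byte sequence is an arbitrary assumption):

    import std.algorithm.sorting : sort;
    import std.digest.sha : SHA256, toHexString;
    import std.file : read;

    string libraryHash(string[] filenames)
    {
        static immutable ubyte[2] sep = [0xFF, 0x00]; // sentinel
        sort(filenames); // canonical order
        SHA256 h;
        h.start();
        foreach (name; filenames)
        {
            h.put(cast(const(ubyte)[]) name);       // filename
            h.put(sep[]);
            h.put(cast(const(ubyte)[]) read(name)); // file contents
            h.put(sep[]);
        }
        h.put(sep[]); // trailing double separator
        return toHexString(h.finish()).idup;
    }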


> Using something like the @certified (or @ensured, or @verified, 
> or whatever) attribute, we could just turn off run-time checks 
> in these functions if a proof exists, could be as simple as a 
> source.d.proof file with the function name (and hash?) in it.

I am actually not sure how this could be done well, because you 
could have array accesses within a conditional expression.

Maybe you'd actually have to annotate the indexing with some new 
syntax to tell the compiler that the access does not need bounds 
checks, as it has already been proven to hold... So I guess this 
is an area that needs to be fleshed out with many examples.
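Today's closest manual equivalent is the @system escape hatch, 
which a verifying tool could in principle emit once the bound is 
proven (a sketch, not a proposal for final syntax):

    // Unchecked access via .ptr, skipping the bounds check.
    // Precondition proven externally: i < a.length.
    int at(const(int)[] a, size_t i) @system
    {
        return a.ptr[i];
    }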

Anyway, it seems reasonable that a tool would take source code, 
verify it and emit an annotated version of the source code. Then 
less capable compilers could use the verified results.

Most users are not going to write verifiable code themselves.


> I'm not sure about this one. Usually assumes are big red flags, 
> they're the FV version of unsafe - "trust me, I know what I'm 
> doing", that sort of thing. I might have misunderstood your 
> intent here though.

I think so. If they are proven to hold by a tool/compiler, then 
they hold. Dataflow analysis is sufficient when it succeeds, but 
it will fail in many cases, as it is a weak approach. However, it 
is also much faster than a proper solver, so it is basically just 
an optimization for performance reasons.

But you also want to use assume manually sometimes, when 
describing hardware registers (e.g. stating that bits 8-32 are 
always zero), or when something has been proven by hand and has a 
significant performance impact.
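For example (the register address and bit layout are invented, 
and __assume is the helper sketched earlier):

    import core.volatile : volatileLoad;

    // A memory-mapped status register where the hardware manual
    // guarantees the upper bits read as zero. That fact is not
    // derivable from the code, so it must be assumed.
    uint readStatus()
    {
        auto v = volatileLoad(cast(uint*) 0x4000_0000);
        __assume((v & 0xFFFF_FF00) == 0); // only bits 0-7 can be set
        return v;
    }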


> This is how SPARK works, even if you can't prove a (normal Ada) 
> function body, because of inline asm, or pointer arithmetic or 
> something, you can still specify pre and post-contracts on the 
> interface. I think there's a little bit of a @trusted feel to 
> it, since the post-conditions on a non-SPARK function become 
> sort of an assumption that the function does what it claims, 
> given the preconditions hold. I think SPARK may leave these as 
> run-time checks (normal Ada behavior, normal D _Debug_ 
> behavior), which you'd have to do to have any sort of 
> confidence in your proof.

Yes, for typical SPARK use cases that seems reasonable.


> This is where things get a little messy. Verified functions 
> calling non-verified functions with post-condition checks would 
> need these checks enabled even in release mode, I would think.

Yes, this is where different goals may lead to different 
designs... so the challenge is to come up with something that can 
both support correctness and optimization without being extremely 
complicated in terms of syntax.


> The cool thing is that D _today_ has all the syntax it needs to 
> start working on tooling for this with no language changes.

Sure, you could also just define dummy "ensures", "requires", 
whatever… functions that take a boolean as a parameter, with an 
empty body or an assert.
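A minimal sketch of such dummy markers, which an external tool 
could recognize today (names and the example function are 
invented):

    void requires(bool cond) { assert(cond); } // or an empty body
    void ensures(bool cond) { assert(cond); }

    size_t clampIndex(size_t i, size_t length)
    {
        requires(length > 0);
        auto r = i >= length ? length - 1 : i;
        ensures(r < length);
        return r;
    }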


> a particular build. I'm not as concerned as some are with 
> proven builds and linking, but there may be high-integrity 
> requirements in the future where this sort of thing is 
> required. I'm not sure if it's a thing yet, maybe NASA does it, 
> but I'd be surprised if anybody else is yet.

There are papers on proof-carrying assembly code...

Although the basic idea behind certification would be that you 
get to use the postconditions and assumes without having to run 
costly verification on your own machine.

You might even see a commercial service with very strong provers 
and effective hardware.


> I'm not totally familiar with the landscape, but generating 
> Why3ML from D seems like it would be one good way to start. You 
> get access to a lot of solvers for free, it seems like it's 
> well-documented and fairly easy to use.

Perhaps Why3ML is a good choice, but I would've started with 
Boogie.



