GitHub Copilot is starting to produce D code that can almost pass code review

Petar Petar
Sun Apr 3 22:19:43 UTC 2022


IIRC, a while ago, there was a post on the newsgroup by someone 
asking if GH Copilot works with D. Here's my limited experience 
so far.

I've been using the Copilot VSCode extension for a few months at 
work, mostly with TypeScript and Go, and lately it has been 
producing surprisingly (*) good results. Back when I started 
using it, I tried it out with D, but quickly disabled it as I 
found its suggestions way off and too distracting.

Recently I decided to give it another try for a small D program
and I got surprisingly good results. Here's one example:

Input:

```d
SemVer[] getGitHubRepoTags(string repo, bool includePre|
```

Output (everything after the cursor, denoted by `|` above, was 
suggested by Copilot):

```d
SemVer[] getGitHubRepoTags(string repo, bool includePrereleases = false)
{
     auto url = "https://api.github.com/repos/" ~ repo ~ "/tags";
     auto tags = httpGet(url).readText.parseJSON.array;

     return tags
         .map!(tag => SemVer(tag.object.byPair.get("name")))
         .filter!(x => x.isValid && (includePrereleases || !x.isPrerelease))
         .array;
}
```

Things to note:

* On a purely syntactical level, the code is grammatically
correct: parentheses are properly balanced, all statements end
with a semicolon, and it even decided to group the
`includePrereleases || !x.isPrerelease` subexpression with
parentheses.

* Apparently, there are enough examples of how to get the tags 
for a GH repo that it got it right on the first try. I had to 
look up the docs to verify that the REST API path format was 
indeed correct.

* For some reason it insists on suggesting `httpGet`, instead of 
simply `get` (from `std.net.curl`). I guess `get` is too generic 
for its taste :D

* I still haven't seen suggestions containing function local 
imports. My guess is that's because D is relatively unique 
compared to most other languages, and is not well-represented in 
the dataset Copilot is being trained on.

* While at the beginning its suggestions mostly resembled
snippets from JavaScript or Python code (for example, it used
to suggest `+` instead of `~` for string concatenation), after a
while it started to use `~` more consistently.

* Same for `map` and `filter` - in earlier parts of the program 
Copilot used to suggest passing the lambda as a runtime parameter 
(as in JavaScript), but after it saw a few examples in my code, 
it finally started to consistently use the D template argument syntax.

* After a while it started suggesting `.array` here and there in
range pipelines.

* For now, the suggestions I get involving slicing mostly use the 
`.substr` function (most likely borrowed from a JS program), so 
apparently, it hasn't seen enough `[start .. end]` expressions in 
my code.

* Amusingly enough, even though DCD ought to be in a much more
advantaged position (it has an actual D parser, knows about my
import paths, etc.), it gets beaten by Copilot pretty easily
in terms of speed, usefulness and likelihood of even
attempting to produce any suggestion (e.g. DCD gives up at the
first sight of a UFCS pipeline, while those are bread and butter
for Copilot).
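
For readers less familiar with D, here is a minimal sketch of the idioms mentioned in the list above: `~` for concatenation, lambdas passed as template arguments, `[start .. end]` slicing, `.array` at the end of a range pipeline, and function-local imports. It is not taken from the program discussed in this post; the `stripVersionPrefix` helper and the `dlang/dmd` repo name are made up purely for illustration.

```d
import std.stdio : writeln;

// Hypothetical helper (not from the original program): keeps only the tags
// that start with 'v' and strips that prefix.
string[] stripVersionPrefix(string[] tags)
{
    // Function-local imports: visible only inside this function.
    import std.algorithm : filter, map;
    import std.array : array;

    return tags
        .filter!(t => t.length > 1 && t[0] == 'v') // lambda as a template argument
        .map!(t => t[1 .. $])                      // slicing instead of `.substr`
        .array;                                    // materialize the lazy range
}

void main()
{
    auto repo = "dlang/dmd"; // illustrative value only

    // `~` concatenates strings; `+` is not defined for them in D.
    auto url = "https://api.github.com/repos/" ~ repo ~ "/tags";
    writeln(url); // prints https://api.github.com/repos/dlang/dmd/tags

    // prints ["2.099.0", "2.098.1"]
    writeln(stripVersionPrefix(["v2.099.0", "nightly", "v2.098.1"]));
}
```

Writing `tags.map(t => ...)` with the lambda as a runtime argument, the way Copilot initially suggested, would not compile with `std.algorithm`'s `map`, which is why the `!` matters.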

---

All in all, don't expect any wonders. Almost all suggestions of 
non-trivial size will contain mistakes. But from time to time it 
can pleasantly surprise you. It's funny when after a long day 
Copilot starts to put `!` before `(some => lambda)` more 
consistently than you :)

---

P.S. I'm leaving the semantic mistakes in the suggestion for the 
fellow humans here to find :P

(*) That is to say, surprising for me, considering I was expecting
it to produce pure gibberish, as it has no semantic understanding
of either the programming language or the problem domain.

