GitHub Copilot is starting to produce D code that can almost pass code review
Petar
Sun Apr 3 22:19:43 UTC 2022
IIRC, a while ago, there was a post on the newsgroup by someone
asking if GH Copilot works with D. Here's my limited experience
so far.
I've been using the Copilot VSCode extension for a few months at
work, mostly with TypeScript and Go, and lately it has been
producing surprisingly (*) good results. Back when I started
using it, I tried it out with D, but quickly disabled it as I
found its suggestions way off and too distracting.
Recently I decided to give it another try for a small D program and
I got surprisingly good results. Here's one example:
Input:
```d
SemVer[] getGitHubRepoTags(string repo, bool includePre|
```
Output (everything after the cursor, denoted by `|` above, was
suggested by Copilot):
```d
SemVer[] getGitHubRepoTags(string repo, bool includePrereleases = false)
{
    auto url = "https://api.github.com/repos/" ~ repo ~ "/tags";
    auto tags = httpGet(url).readText.parseJSON.array;
    return tags
        .map!(tag => SemVer(tag.object.byPair.get("name")))
        .filter!(x => x.isValid && (includePrereleases || !x.isPrerelease))
        .array;
}
```
Things to note:
* On a purely syntactical level, the code is grammatically
correct - parentheses are properly balanced, all statements end
with a semicolon, and it even decided to wrap the
`includePrereleases || !x.isPrerelease` subexpression in parentheses
* Apparently, there are enough examples of how to get the tags
for a GH repo that it got it right on the first try. I had to
look up the docs to verify that the REST API path format was
indeed correct.
* For some reason it insists on suggesting `httpGet`, instead of
simply `get` (from `std.net.curl`). I guess `get` is too generic
for its taste :D
* I still haven't seen suggestions containing function-local
imports. My guess is that's because D is relatively unique
compared to most other languages, and is not well-represented in
the dataset Copilot is being trained on.
* At the beginning, its suggestions mostly resembled
snippets from JavaScript or Python code; for example, it used
to suggest `+` (instead of `~`) for string concatenation. After a
while it started to use `~` more consistently.
* Same for `map` and `filter` - in earlier parts of the program
Copilot used to suggest passing the lambda as a runtime parameter
(as in JavaScript), but after it saw a few examples in my code,
it finally started to consistently use the D template args syntax
* After a while it started suggesting `.array` here and there in
range pipelines
* For now, the suggestions I get involving slicing mostly use the
`.substr` function (most likely borrowed from a JS program), so
apparently, it hasn't seen enough `[start .. end]` expressions in
my code.
* Amusingly enough, even though DCD ought to be in a much more
advantaged position (it has an actual D parser, knows about my
import paths, etc.), it gets beaten by Copilot pretty easily
in terms of speed, usefulness, and likelihood of even
attempting to produce any suggestion (e.g. DCD gives up at the
first sight of a UFCS pipeline, while those are bread and butter
for Copilot).
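
For readers less familiar with D, here is a minimal sketch of the idioms
mentioned in the list above: function-local imports, `~` for
concatenation, lambdas passed as template arguments with `!`, `.array`
at the end of a range pipeline, and `[start .. end]` slicing. The
function `shortTagNames` and the sample data are made up for
illustration; they are not part of my program or of anything Copilot
suggested.

```d
// Hypothetical helper, for illustration only.
string[] shortTagNames(string[] tags, size_t maxLen)
{
    // Function-local imports: visible only inside this function.
    import std.algorithm : filter, map, min;
    import std.array : array;

    return tags
        // The lambda is a compile-time (template) argument, hence the `!`.
        .filter!(t => t.length > 0)
        // `~` concatenates; `[a .. b]` slices (by code unit, fine for ASCII).
        .map!(t => "v" ~ t[0 .. min(t.length, maxLen)])
        // Materialize the lazy range into a string[].
        .array;
}

void main()
{
    import std.stdio : writeln;

    writeln(shortTagNames(["1.2.3-beta", "", "2.0.0"], 5));
    // prints ["v1.2.3", "v2.0.0"]
}
```

The JavaScript-flavoured equivalents that Copilot tends to fall back on
(`"v" + t`, `t.substr(0, n)`, `tags.map(t => ...)`) are exactly the
pieces that differ here.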
---
All in all, don't expect any wonders. Almost all suggestions of
non-trivial size will contain mistakes. But from time to time it
can pleasantly surprise you. It's funny when after a long day
Copilot starts to put `!` before `(some => lambda)` more
consistently than you :)
---
P.S. I'm leaving the semantic mistakes in the suggestion for the
fellow humans here to find :P
(*) That is to say, surprising for me, considering I was expecting
it to produce pure gibberish, as it has no semantic understanding
of either the programming language or the problem domain.