Use of IA for PR - my POV
Vladimir Panteleev
thecybershadow.lists at gmail.com
Tue Feb 10 16:14:03 UTC 2026
On Monday, 9 February 2026 at 21:25:02 UTC, user1234 wrote:
> One tendency I have noticed recently in the D world is one guy
> that is very good with AI. Cybershadow. Already 5 or 6 PR, he
> masters the tools.
I guess I could post a few thoughts about AI / LLMs here if
people are interested.
I've long been interested in the idea of offloading menial work
onto machines. DustMite was my first big project in that vein -
you define an oracle, then drop your xMLOC codebase on it and go
enjoy your weekend. Then came Digger, to help automate bisecting
regressions with our multi-repo setup. So LLMs are kind of in
that vein too, if applied properly.
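For anyone who hasn't used DustMite: the "oracle" is just a test
command that exits with status 0 for as long as the reduced code
still shows the behavior you care about. Here is a minimal sketch
of one. The marker text, the function name `still_reproduces`, and
the file name `main.d` are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical DustMite oracle (an illustrative sketch, not from a real project).
# DustMite treats exit status 0 as "this reduced code is still interesting".

# Assumption: the bug shows up as this text in the compiler's output.
MARKER="Internal error"

# Succeeds (returns 0) only if the captured output in $1 contains the marker.
still_reproduces() {
    case "$1" in
        *"$MARKER"*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Intended use as the last line of the test script, with main.d as a
# placeholder file name:
#   still_reproduces "$(dmd -c main.d 2>&1)"
```

DustMite is given the source directory and a test command like
this, and keeps any reduction for which the command still exits
with 0. Matching on a specific message rather than on "compilation
failed" matters, because otherwise the reduction tends to drift
towards some unrelated error.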
The LLMs themselves were pretty much useless toys for a long time
when it came to writing code, and the vast majority of them still
are. I think even what you get today on e.g. chatgpt.com is going
to be underwhelming from many perspectives. However, it does seem
like there was a huge jump last year with Opus 4.5.
I've been experimenting with LLMs generating D and other code
throughout the last year. Just last August I was playing around
with the best model at the time - the results were, frankly,
depressing. I think I spent $200 in tokens on a development
process that I could have done by hand much faster, prettier, and
more correctly. At that point, it was clear that there was nothing
to be gained from agentic coding, at least for what I was doing.
Then, Opus 4.5 came out. I'm not sure if it really was an
objectively major breakthrough in capabilities, or if it merely
crossed some threshold that would qualify it as such, or if that
was just my perception, but for me it was the first model that
seemed actually ... useful.
- It could write non-trivial code - entire multi-module programs -
that actually worked on the first try.
- It no longer regularly made stupid mistakes that no human would
ever make.
- It wrote correct D! Without hallucinating features!
But what shocked me the most:
- It knew about my personal D libraries! Somehow, my personal ae
D library was in its training data set, with sufficient coverage
that it could even use some parts of it blind!
- Sometimes, it even wrote better D code than me! It would use
patterns that I was not aware of or had not thought of!
To me, this was mind-blowing, and turned my whole programming
life upside down.
Since then, I decided to do an experiment: Could I just use it
for everything? Just stop writing code, and use it to do the code
writing?
Would this make me more productive or less? Would I eventually
forget how to code? Would I eventually get buried by the big pile
of slop and broken code that I don't understand?
I didn't know the answers but I was really fascinated to try and
find out. So, I got a $200/mo Claude Code Max subscription and
set out with that self-imposed constraint.
Here's what I can tell you so far:
- I'm probably not as good at hammering out code with a
keyboard as I was last year. But, I do feel like I'm now much
better at code review, multitasking, and task switching. Since
the whole idea of AI is to make the bot work on your behalf, you
can multi-track several projects (or multiple aspects of a
project), or simply enjoy your hobby while checking in on the bot
every half hour. Like with DustMite, if you use AI but then stare
at the screen while it's working, you're Doing It Wrong.
- The bot obviously still makes mistakes. But the mistakes it
makes are different than the kind of mistakes a human would make:
no typos, no copy-paste errors, no "I forgot to add this one
line", no "I used the variable `foo` from the argument list
instead of the variable `foo` from the local scope".
- On the other hand, the bot is terrible at designing. The APIs
are bad, the patterns are bad, the structure is bad. You still
need to think ahead about what you want to build and what shape
each part should be in. However, you now have a lot more
cognitive bandwidth to focus on this exclusively.
- Obviously you do need to read and understand the code it
writes...
- ...unless it's for one-off throwaway scripts, which are now
really really easy to produce! You can script anything easily
with zero investment, which is a big help sometimes.
- Other things it's good at:
  - Bug hunting - drop a test case on it, come back half an hour
  later, and it will likely have found the root cause (and maybe
  even an initial patch for it).
  - Code research - have a technical question about a project and
  want a precise answer? `git clone` the GitHub repo, run the
  agent, and ask your question - you'll get an answer with
  citations to exact line numbers.
  - Speculative refactoring - have a complicated code base but
  don't want to invest the time in a refactoring that may make the
  code simpler or may make it an even bigger mess? The bots are
  very good at mechanical code transformations, so you can give
  them 10 refactoring ideas and just leave them overnight.
  - Writing test cases, but everyone knows this one already.
- In terms of getting things done, I do find myself to be a lot
more productive! Certainly not in terms of time, but definitely
in terms of creative energy. I've picked up and even wrapped up a
lot of projects from my backlog. I do miss some bugs in review
(or sometimes am just too lazy to review the code), so the output
quality is maybe not the same as what I would have churned out by
hand, but I'll definitely take a 20% quality hit for a 500%
productivity gain.
On contributing to D:
I think so far I've used Claude to write patches for Phobos,
Druntime, and DMD. In order:
- Phobos: For me these are very easy to review. I'm confident in
their quality, so it's just a time saver.
- Druntime: These have been mainly translations of C headers to
D. LLMs are good at these, so the main thing to watch out for is
that the translation follows our conventions.
And then there's the compiler, DMD.
So, here's the thing. Maybe my perspective is off, but my point
of view is: in order to understand and be able to meaningfully
review patches to the compiler, you must be a compiler developer.
And, you do not simply become a compiler developer.
As much as I wish I could understand and help out with all parts
of D, I need to pick my battles. In my mind, D compiler hackers
are the most elite of the elite D developers. I bow to them and
plead for their mercy as they consider my bug reports and patches.
This puts me in a difficult situation every time I run into a
blocking compiler bug. I could:
1. Reduce the bug to a test case, file an issue, and watch as
likely absolutely nothing happens for years (blocker or not,
regression or not). Understandable, since compiler bugs are hard,
compiler development is hard, and nothing in life is free.
2. Give up on some of my personal projects and invest into
becoming a D compiler developer instead.
3. Tuck my tail and try to work around it in my code base, giving
up on my perfect envisioned design.
4. [NEW!] Ask the bot to draft a patch, which it often ultimately
succeeds at doing (at least to the point of getting the test
suite to pass). Now, instead of filing a bug report, I can file a
bug report with a machine-generated patch attached (in the form
of a pull request), which might be total garbage - but at least
it starts a discussion! I'm not sure how the compiler hackers
feel about this, but I've always tried to be up-front about the
provenance of the patches, and so far I have not been asked to
stop.
What should we do about this? Should we use it more? Should we
use it less? I don't know. There are definitely valid reasons -
ethical, practical, legal, financial - to avoid using it, as have
been mentioned here and elsewhere. But it also seems genuinely
useful in at least some situations, and no one knows what the
future holds.
Anyway, one point I wanted to make is that if we're talking about
the quality/implications/etc. of AI, we should definitely be
clear about which specific model we're talking about. Things have
been improving very rapidly and there's a lot of variation in
what you might experience recently or even today.