Use of AI for PRs - my POV

Vladimir Panteleev thecybershadow.lists at gmail.com
Tue Feb 10 16:14:03 UTC 2026


On Monday, 9 February 2026 at 21:25:02 UTC, user1234 wrote:
> One tendency I have noticed recently in the D world is one guy 
> that is very good with AI. Cybershadow. Already 5 or 6 PR, he 
> masters the tools.

I guess I could post a few thoughts about AI / LLMs here if 
people are interested.

It's not a new thing that I've been interested in offloading 
menial work onto machines. DustMite was my first big project in 
that vein: you define an oracle, drop your xMLOC codebase on it, 
and go enjoy your weekend. Then came Digger, which helps automate 
bisecting regressions across our multi-repo setup. LLMs, applied 
properly, are a continuation of that idea.
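To illustrate the oracle idea: a DustMite test predicate is just a 
program whose exit status says "the bug is still present". A minimal 
sketch in D (the `dmd` invocation, the `app.d` entry point, and the 
error message are hypothetical stand-ins, not from any real 
reduction):

```d
// oracle.d - a DustMite test predicate: exit 0 iff the bug reproduces.
// Build with `dmd oracle.d`, then run e.g. `dustmite mysrc ./oracle`.
import std.algorithm.searching : canFind;
import std.process : execute;

int main()
{
    // Try to compile the reduced candidate; DustMite runs us
    // inside a copy of the source tree.
    auto result = execute(["dmd", "-o-", "app.d"]);

    // Exit 0 ("still interesting") only if the compiler still
    // emits the specific error we are hunting; 1 otherwise.
    return result.output.canFind("Internal error") ? 0 : 1;
}
```

DustMite then repeatedly shrinks the source tree, keeping only 
reductions for which the oracle still exits 0.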

The LLMs themselves were pretty much useless toys for a long time 
when it came to writing code, and the vast majority of them still 
are. I think even what you get today on e.g. chatgpt.com is going 
to be underwhelming from many perspectives. However, it does seem 
like there was a huge jump last year with Opus 4.5.

I've been experimenting with LLMs generating D and other code 
throughout the last year.  Just last August I was playing around 
with the best model at the time - the results were, frankly, 
depressing. I think I spent $200 in tokens on a development 
process that I could have done myself much faster, with prettier 
and more correct results. At that point, it was clear that there 
was nothing to be gained from agentic coding, at least for what I 
was doing.

Then, Opus 4.5 came out. I'm not sure if it really was an 
objectively major breakthrough in capabilities, or if it merely 
crossed some threshold that made it feel like one, or if that was 
just my perception, but for me it was the first model that seemed 
actually ... useful.

- It could write non-trivial code - entire multi-module programs 
- that actually worked on the first try.
- It no longer regularly made stupid mistakes that no human would 
ever make.
- It wrote correct D! Without hallucinating features!

But what shocked me the most:

- It knew about my personal D libraries! Somehow, my personal ae 
D library was in its training data set, with sufficient coverage 
that it could even use some parts of it blind!
- Sometimes, it even wrote better D code than me! It would use 
patterns that I was not aware of or had not thought of!

To me, this was mind-blowing, and turned my whole programming 
life upside down.

Since then, I decided to run an experiment: could I just use it 
for everything? Could I simply stop writing code, and let it do 
all the code writing?

Would this make me more productive or less? Would I eventually 
forget how to code? Would I eventually get buried by the big pile 
of slop and broken code that I don't understand?

I didn't know the answers but I was really fascinated to try and 
find out. So, I got a $200/mo Claude Code Max subscription and 
set out with that self-imposed constraint.

Here's what I can tell you so far:

- I'm probably not as good at hammering out code with a keyboard 
as I was last year. But I do feel like I'm now much better at 
code review, multitasking, and task switching. Since
the whole idea of AI is to make the bot work on your behalf, you 
can multi-track several projects (or multiple aspects of a 
project), or simply enjoy your hobby while checking in on the bot 
every half hour. Like with DustMite, if you use AI but then stare 
at the screen while it's working, you're Doing It Wrong.
- The bot obviously still makes mistakes. But the mistakes it 
makes are different from the kind a human would make: 
no typos, no copy-paste errors, no "I forgot to add this one 
line", no "I used the variable `foo` from the argument list 
instead of the variable `foo` from the local scope".
- On the other hand, the bot is terrible at designing. The APIs 
are bad, the patterns are bad, the structure is bad. You still 
need to think ahead about what you want to build and what shape 
each part should be in. However, you now have a lot more 
cognitive bandwidth to focus on this exclusively.
- Obviously you do need to read and understand the code it 
writes...
- ...unless it's for one-off throwaway scripts, which are now 
really really easy to produce! You can script anything easily 
with zero investment, which is a big help sometimes.
- Other things it's good at:
   - Bug hunting - drop a test case on it, come back half an hour 
later, and it will likely have found the root cause (and maybe 
even an initial patch for it).
   - Code research - have a technical question about a project and 
want a precise answer? `git clone` the GitHub repo, run the 
agent, and ask your question - you'll get an answer with 
citations to exact line numbers.
   - Speculative refactoring - have a complicated code base, but 
don't want to invest the time in a refactoring that may make the 
code simpler or may make it an even bigger mess? The bots are 
very good at mechanical code transformations, so you can give one 
ten refactoring ideas and just leave it running overnight.
   - Writing test cases, but everyone knows this one already.
- In terms of getting things done, I do find myself a lot more 
productive! Certainly not in terms of time, but definitely in 
terms of creative energy. I've picked up and even wrapped up a 
lot of projects from my backlog. I do miss some bugs in review 
(or sometimes am just too lazy to review the code), so the output 
quality is maybe not the same as what I would have churned out by 
hand, but I'll definitely take a 20% quality hit for a 500% 
productivity gain.

On contributing to D:

So far, I think I've used Claude to write patches for Phobos, 
Druntime, and DMD. In order:

- Phobos: For me these are very easy to review. I'm confident in 
their quality, so it's just a time saver.
- Druntime: These have been mainly translations of C headers to 
D. LLMs are good at these, so the main thing to watch out for is 
that the translation follows our conventions.
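To give a flavor of what such a translation involves, here is a 
hedged sketch; the C declarations below are invented for 
illustration, not from any real header, and the conventions shown 
(manifest constants, opaque structs, keyword renames) are the usual 
ones for C bindings in D:

```d
// Hypothetical C header being translated:
//   #define FOO_MAX 16
//   typedef struct foo foo_t;
//   int foo_init(foo_t **out, const char *name, size_t len);

extern (C) nothrow @nogc:

enum FOO_MAX = 16;  // #define becomes a manifest constant

struct foo;         // opaque struct stays opaque
alias foo_t = foo;

// `out` is a D keyword, hence the rename to `out_`.
int foo_init(foo_t** out_, const(char)* name, size_t len);
```

The mechanics are easy for an LLM; what needs human review is 
exactly this kind of convention (attributes, naming, const-ness).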

And then there's the compiler, DMD.

So, here's the thing. Maybe my perspective is off, but my point 
of view is: in order to understand and be able to meaningfully 
review patches to the compiler, you must be a compiler developer. 
And, you do not simply become a compiler developer.

As much as I wish I could understand and help out with all parts 
of D, I need to pick my battles. In my mind, D compiler hackers 
are the most elite of the elite D developers. I bow to them and 
plead for their mercy as they consider my bug reports and patches.

This puts me in a difficult situation every time I run into a 
blocking compiler bug. I could:

1. Reduce the bug to a test case, file an issue, and watch as 
likely absolutely nothing happens for years (blocker or not, 
regression or not). Understandable, since compiler bugs are hard, 
compiler development is hard, and nothing in life is free.
2. Give up on some of my personal projects and invest into 
becoming a D compiler developer instead.
3. Tuck my tail and try to work around it in my code base, giving 
up on my perfect envisioned design.
4. [NEW!] Ask the bot to draft a patch, which it often ultimately 
succeeds at doing (at least to the point of getting the test 
suite to pass). Now, instead of filing a bug report, I can file a 
bug report with a machine-generated patch attached (in the form 
of a pull request), which might be total garbage - but at least 
it starts a discussion! I'm not sure how the compiler hackers 
feel about this, but I've always tried to be up-front about the 
provenance of the patches, and so far I have not been asked to 
stop.

What should we do about this? Should we use it more? Should we 
use it less? I don't know. There are definitely valid reasons - 
ethical, practical, legal, financial - to avoid using it, as have 
been mentioned here and elsewhere. But it also seems genuinely 
useful in at least some situations, and no one knows what the 
future holds.

Anyway, one point I wanted to make is that if we're talking about 
the quality/implications/etc. of AI, we should definitely be 
clear about which specific model we're talking about. Things have 
been improving very rapidly, and there's a lot of variation 
between what you might have experienced recently and what you'd 
experience today.


