code databases for ai
monkyyy
crazymonkyyy at gmail.com
Sat Dec 13 23:27:40 UTC 2025
I started extracting all the code from the forums 3 weeks ago:
https://github.com/crazymonkyyy/dlangforums (I know of some flaws
here; it's AI slop, but semi-functional)
adr's style of code, giant files with example programs in
comments, needs some amount of processing (qwen doesn't read adr's
files without being explicitly told to; I don't know if any of
the models have the "attention" to handle "simple display")
Extracting links from dub webpages likely isn't that hard.
My own code is a horrible mess; I never got around to actually
cleaning up my repos, though I planned on doing that last year,
and the year before, and the year before that. To say nothing of
my unnamed gists.
etc.
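
As a rough illustration of the "some amount of processing" that example programs in comments would need, here is a minimal Python sketch. It relies on the Ddoc convention that a line of three or more hyphens opens or closes a code section inside a doc comment. The function name `extract_examples` is hypothetical, nested `/+ +/` comments are not handled, and a real tool would want a proper D lexer.

```python
import re
import textwrap

# Matches D block comments: /+ ... +/ and /* ... */ (the doc variants
# /++ and /** are covered too). Assumption: nested /+ +/ comments and
# comment markers inside string literals are NOT handled.
COMMENT_RE = re.compile(r"/\+.*?\+/|/\*.*?\*/", re.DOTALL)

# Inside a Ddoc comment, a line of three or more hyphens toggles a code section.
FENCE_RE = re.compile(r"^[ \t*+]*-{3,}[ \t]*$")


def extract_examples(source: str) -> list[str]:
    """Return the fenced code sections found inside block comments."""
    examples = []
    for match in COMMENT_RE.finditer(source):
        current = None  # None while outside a fenced section
        for line in match.group(0).splitlines():
            if FENCE_RE.match(line):
                if current is None:
                    current = []  # opening fence
                else:
                    # closing fence: store the example without common indentation
                    examples.append(textwrap.dedent("\n".join(current)))
                    current = None
            elif current is not None:
                current.append(line)
    return examples
```

Each returned string could then be written to its own file and handed to the compiler to check that it still builds; an unclosed fence is silently dropped.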
---
It's a big project to try to collect as much trusted code as
possible into one organization system. "RAG" is a bit of a meme,
but seeding a code base with known good code (compared to AI
hallucinations anyway), for a degree of taste and something that
actually compiles, is a real technique.
(Don't any of y'all tell me "I told you so" about dub; it will
still require processing.)
If anyone else is working on pieces, I'd like to know about it. I
have some theories about how to metaprogram to detect if a struct
is a container, if a function is a range algorithm, if a file is
a program, etc.
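
To make the classification idea concrete, here is a crude text-level sketch in Python. It is not the compile-time metaprogramming described above (that would be done in D itself with introspection); it only pattern-matches source text, and both function names are hypothetical. It will be fooled by a `main` that appears inside a comment or string.

```python
import re

# Heuristic: a line that starts a D entry point, such as `void main()` or
# `int main(string[] args)`. Assumption: other spellings are ignored.
MAIN_RE = re.compile(r"^\s*(?:void|int)\s+main\s*\(", re.MULTILINE)

# Heuristic: a template instantiation of a std.range trait,
# e.g. the constraint `if (isInputRange!R)`.
RANGE_TRAIT_RE = re.compile(
    r"\bis(?:Input|Forward|Bidirectional|RandomAccess)Range\s*!"
)


def looks_like_program(source: str) -> bool:
    """True if the source text appears to define a main function."""
    return MAIN_RE.search(source) is not None


def mentions_range_trait(source: str) -> bool:
    """True if the source text uses one of the std.range trait templates."""
    return RANGE_TRAIT_RE.search(source) is not None
```

A text scan like this is cheap enough to run over a whole archive as a first pass, leaving the precise answer to a compile-time check for whatever survives.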
Has anyone done anything on this subject? Is anyone interested in
it? It may need a real hosting solution: GitHub has file size
caps that I ran into with just the forums, and if I start
extracting from dub and then try to host that, GitHub may get
quite upset.
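
One stopgap for per-file size caps, short of moving hosts, is to split a large extract into several smaller files. The Python sketch below is only an illustration: `shard_records` and `max_bytes` are hypothetical names, the records are assumed to be text written one per line, and the actual cap has to be looked up for whichever host is used.

```python
def shard_records(records, max_bytes):
    """Group text records into shards whose UTF-8 size stays <= max_bytes.

    Each record is counted with one trailing newline, as if written
    one record per line.
    """
    shards, current, size = [], [], 0
    for record in records:
        cost = len(record.encode("utf-8")) + 1  # +1 for the newline
        if cost > max_bytes:
            raise ValueError("a single record exceeds the shard size limit")
        if current and size + cost > max_bytes:
            # the current shard is full: close it and start a new one
            shards.append(current)
            current, size = [], 0
        current.append(record)
        size += cost
    if current:
        shards.append(current)
    return shards
```

Records keep their original order, so the shards can be written out as numbered files and concatenated again later.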