How can we make it easier to experiment with the compiler?

Mon May 24 09:02:41 UTC 2021

On Monday, 24 May 2021 at 02:25:33 UTC, Nicholas Wilson wrote:
> On Sunday, 23 May 2021 at 06:12:30 UTC, Ola Fosheim Grøstad 
> wrote:
>> The number one challenge I see is keeping track of DMD as it 
>> is released with new improvements. Basically reapplying the 
>> changes made to the experimental branch to the main branch 
>> (aka "rebasing"?).
>
> (that is the correct terminology). I suspect this is more of a 
> problem for people that are less familiar with git, which might 
> well also include people wanting to play around with DMD, e.g. 
> GSoC/SAoC students.
> I know this was the case for me while developing dcompute with 
> the added difficulty of tracking LLVM on top of LDC (which was 
> kept in sync with DMD).
>
>> I suspect that kills many efforts, meaning people create a 
>> fork, start making changes, but then a new version of DMD is 
>> released and the fork is left to dry in the sun as rebasing is 
>> not fun. And well, a hobby that isn't fun, is not a good 
>> hobby. :-D
>
> The solution to this is better git skills not so much better 
> compiler skills/knowledge of DMD although a merge conflict in a 
> critical piece of code is always a PiTA. We now have 
> slack/discord for people to ask these kinds of questions, which 
> I'm sure they will get answered if they are trying to do 
> something interesting or fix an annoying problem.

I think I should have used the term "boring" rather than 
"challenging".

I doubt that git skills would solve it as I think it is more 
related to what a hobby is to people who are older and have a 
very long spare time todo-list. Any "unproductive" and "unfun" 
chore will go to the bottom of the todo-list. My 
I-really-ought-todo-list is so long that it could fill up the 
rest of my life...

So it is basically easier to just stay on an outdated dmd-branch 
for a couple of years, rather than keeping track of it... which 
is not a good strategy.

Think of it like this: I have 2-5 hours a week for completely 
unnecessary, but fun things like hacking a new IR + optimization 
in between DMD and LLVM. So, what should I do: do my taxes, rebase 
my fork, watch Eurovision with family? Rebasing is down there 
with taxes, except I have to do the taxes eventually, just not 
this Saturday... (Ok, so we watch Eurovision then just to find 
out how bad it is? :-)

I think it would not be too difficult to get to a situation where 
you have well-defined entry points, hooks, layers that make it 
more of a plugin-experience.

Examples of potential plug-and-play:

1. Add new experimental syntax: The parser is quite close. It 
would not take a lot of work to encapsulate a manager of 
(file-extension, Parser) pairs that have no overhead (compile 
time). Ok, so if you want to extend the language as an experiment, 
just duplicate the parser, modify it and plug it in. This is 
low-hanging fruit.

2. Add new semantics: add a new file with functions with custom 
intrinsics that are somehow added to the runtime, use your custom 
parser to lower your custom syntax to these custom runtime 
functions. Inject yourself between the front-end and backend 
(assuming a high level IR), pick up the custom intrinsics and do 
the analysis/transforms you want.

3. Add a new high-level optimization, like ARC: same as 2, except 
you only add new passes in a new file and possibly some new 
fields to the high level IR. Then edit a config file that makes 
the pass available and executed at the right time (with respect 
to other passes).
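
To make example 1 a bit more concrete, here is a minimal sketch of a manager of (file-extension, parser) pairs. It is written in Python only for brevity and is not DMD code: `ParserManager`, the toy parsers, the `unless` keyword and the `.dx` extension are all hypothetical names made up for illustration.

```python
import os
from typing import Callable, Dict

# Assumption: a "parser" is any callable that turns source text into
# some AST-like value. This is a toy model, not DMD's Parser class.
Parser = Callable[[str], object]


class ParserManager:
    """Maps file extensions to parsers so new syntax can be plugged in."""

    def __init__(self) -> None:
        self._parsers: Dict[str, Parser] = {}

    def register(self, extension: str, parser: Parser) -> None:
        # Registering an existing extension replaces the old parser.
        self._parsers[extension.lower()] = parser

    def parse_file(self, filename: str, source: str) -> object:
        ext = os.path.splitext(filename)[1].lower()
        if ext not in self._parsers:
            raise ValueError(f"no parser registered for {ext!r}")
        return self._parsers[ext](source)


def standard_parser(source: str) -> object:
    # Stand-in for the unmodified parser: just splits into tokens.
    return ("module", source.split())


def experimental_parser(source: str) -> object:
    # "Duplicate the parser and modify it": same as above, except a
    # hypothetical `unless` keyword is lowered to `if !`.
    tokens = []
    for tok in source.split():
        if tok == "unless":
            tokens.extend(["if", "!"])
        else:
            tokens.append(tok)
    return ("module", tokens)


manager = ParserManager()
manager.register(".d", standard_parser)
manager.register(".dx", experimental_parser)  # hypothetical extension
```

The point of the design is that the experimental parser lives in its own file and the only shared code is the one `register` call, so there is very little to conflict with when upstream changes.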

So, the basic idea is that instead of _modifying_ the compiler, 
you add new files to it and bring them into the compiler by 
hooks, configuration files etc.
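
Examples 2 and 3 could look roughly like the following sketch, assuming a high-level IR exists. Again this is Python for brevity, not DMD code, and everything in it is hypothetical: the IR is a toy list of (opcode, operand) tuples, `retain`/`release` stand in for custom intrinsics, and `PIPELINE_CONFIG` stands in for the config file that decides which passes run and in what order.

```python
from typing import Callable, Dict, List, Tuple

# Toy stand-in for a high-level IR: a flat list of (opcode, operand).
Instr = Tuple[str, str]
Pass = Callable[[List[Instr]], List[Instr]]

# Registry that extension files add themselves to (the "hook").
PASSES: Dict[str, Pass] = {}


def register_pass(name: str) -> Callable[[Pass], Pass]:
    def wrap(fn: Pass) -> Pass:
        PASSES[name] = fn
        return fn
    return wrap


@register_pass("drop_nops")
def drop_nops(ir: List[Instr]) -> List[Instr]:
    return [instr for instr in ir if instr[0] != "nop"]


@register_pass("elide_retain_release")
def elide_retain_release(ir: List[Instr]) -> List[Instr]:
    # ARC-style tweak: a retain immediately followed by a release of
    # the same object cancels out, so both are removed.
    out: List[Instr] = []
    for instr in ir:
        if out and instr[0] == "release" and out[-1] == ("retain", instr[1]):
            out.pop()
        else:
            out.append(instr)
    return out


# Stand-in for the config file: pass names in execution order.
PIPELINE_CONFIG = ["drop_nops", "elide_retain_release"]


def run_pipeline(ir: List[Instr], config: List[str]) -> List[Instr]:
    for name in config:
        ir = PASSES[name](ir)
    return ir


program = [("retain", "a"), ("nop", ""), ("release", "a"), ("call", "f")]
result = run_pipeline(program, PIPELINE_CONFIG)
```

Here the order in the config matters: the retain/release pair only becomes adjacent once the `nop` between them is gone, so running the two passes in the opposite order leaves the pair in place. Adding another experiment means adding one file with a registered pass and one line in the config.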

Then you can also much more easily merge and combine contributions 
from many different extension authors and easily replace one 
extension with a better one.

> Urgh. Dealing with 10000 line files and 1000 line functions is 
> such a drain on trying to get stuff done (looking at you 
> expressionsem.d). However this needs to be combined with 
> directories/packages or it will not improve the situation.

Yes, but one can create virtual directories though. E.g. in some 
editors you can group files from different directories so it 
looks like they are in one directory. You can do something 
similar with "ln -s", but it isn't optimal...

>> Which items are feasible in the next 6 months?
>
> Directories.

Sounds like a good start. I still think the high level IR is the 
most pressing one, as not having that abstraction makes adding 
new experimental semantics too time consuming for hobbyists.

I had the idea that I could do ARC by adding intrinsics to LLVM, 
but Apple engineers strongly advised against it and strongly 
suggested working on a high level IR instead.

ARC is something well suited for hobbyists as you can implement 
it in a gradual manner if you have a high level IR (one tweak 
here, one tweak there).

Anyway, I think more experimentation is needed. Say, if 1 out of 
10 experiments made it into the main dmd, then there could be 
more interesting options that would make dmd stand out in the 
crowd.

IMHO the key challenge is to make experimentation fun for people 
who have limited time (which happens as you get older).

Imagine if D could get some of the people that were active with D 
10-15 years ago, but currently have very limited time, to create 
their own experiments? I am sure that many of those have grown 
into capable programmers since then, so that could be something to 
think about.

It has to be a fun experience throughout for people to spend those 
3-4 spare hours a week on compiler hacking.