DMD Performance Regression Publisher [GSoC 2026]

Abul Hossain Khan abulkhan19175 at gmail.com
Fri Jun 12 18:58:03 UTC 2026


Hi everyone,
I am working on the Performance Regression Publisher project 
under the mentorship of Dennis.

My progress so far: the initial end-to-end pipeline has been 
built and is working on my fork. The bot builds DMD at a PR's 
merge-base and at its head, measures a small set of metrics under 
cachegrind, and posts a single sticky comment with the diff.

**What's done**

The harness lives in `tools/perfrunner/` and is written in D (as a 
dub project):

- `app.d` — CLI, takes the two already-built dmd binaries + 
metadata and writes `results.json`.
- `cachegrind.d` — runs the compile under cachegrind and reads 
the instruction count.
- `metrics.d` — the five metrics below.
- `report.d` / `stats.d` — the schema-v1 JSON and the 
percent-delta math.
- `workloads/hello.d` — the only workload for now; more will be 
added soon.
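
As a rough illustration of the "reads the instruction count" step, here is a minimal Python sketch (the real `cachegrind.d` is written in D and may work differently). It assumes the standard cachegrind output file layout, where an `events:` line names the columns and a final `summary:` line holds the totals. The function name `read_instruction_count` is hypothetical.

```python
def read_instruction_count(cachegrind_out_text: str) -> int:
    """Return the total executed-instruction count (Ir) from the text of a
    cachegrind output file.

    Assumes an `events:` line naming the columns and a `summary:` line
    holding the totals in the same column order.
    """
    events = None
    for line in cachegrind_out_text.splitlines():
        if line.startswith("events:"):
            # e.g. "events: Ir I1mr ILmr ..." -> ["Ir", "I1mr", "ILmr", ...]
            events = line.split()[1:]
        elif line.startswith("summary:"):
            if events is None:
                raise ValueError("summary line seen before events line")
            totals = line.split()[1:]
            # Pick the total from the column labelled "Ir".
            return int(totals[events.index("Ir")])
    raise ValueError("no summary line found")
```

Reading the `Ir` column by name rather than by position keeps the sketch working whether or not cache simulation columns are present.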

Around it, `.github/workflows/perf.yml` runs on every PR (and on 
pushes to master), builds both refs with the 
existing `ci/run.sh`, runs the harness, and hands the result 
to `.github/scripts/perf_comment.py`, which upserts one sticky 
comment so force-pushes don't spam the thread.
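
The "upsert one sticky comment" idea can be sketched roughly as follows. This is not the actual `perf_comment.py`; it is a minimal sketch that assumes the bot tags its comment with a hidden HTML marker and then chooses between the GitHub REST endpoints for creating and updating an issue comment. The marker string and the function name `plan_upsert` are hypothetical.

```python
# Hypothetical hidden marker used to recognise the bot's own comment.
MARKER = "<!-- perf-regression-bot -->"


def plan_upsert(repo, pr_number, existing_comments, report_markdown):
    """Decide whether to update the existing sticky comment or create one.

    `existing_comments` is the decoded JSON list returned by
    GET /repos/{repo}/issues/{pr_number}/comments.
    Returns (http_method, api_path, json_payload).
    """
    body = f"{MARKER}\n{report_markdown}"
    for comment in existing_comments:
        if MARKER in (comment.get("body") or ""):
            # A sticky comment already exists: edit it in place.
            path = f"/repos/{repo}/issues/comments/{comment['id']}"
            return ("PATCH", path, {"body": body})
    # No sticky comment yet: create the first one.
    path = f"/repos/{repo}/issues/{pr_number}/comments"
    return ("POST", path, {"body": body})
```

Keeping the decision separate from the HTTP call makes it easy to check without network access; the caller would send the returned request with an authenticated client.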

**The PR comment with the reported metrics currently looks like this:**

| Metric | Base | PR | delta |
|--------|------|----|-------|
| compile hello.d (instr) | 422.9 M | 457.9 M | +8.27% |
| compile hello.d -O (instr) | 446.1 M | 481.1 M | +7.84% |
| dmd binary size (stripped) | 11.91 MB | 11.91 MB | 0.00% |
| hello binary size | 0.72 MB | 0.72 MB | 0.00% |
| peak RSS (compile hello.d) | 56 MB | 55 MB | -2.17% |
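
The delta column is a plain percent change relative to the base. A minimal Python sketch of that math is below (the real `stats.d` is in D; the function names here are hypothetical). Note that the Base and PR values shown in the table are rounded for display, so recomputing from them will not reproduce the reported percentages exactly.

```python
def percent_delta(base: float, pr: float) -> float:
    """Percent change of the PR value relative to the base value."""
    if base == 0:
        raise ValueError("base value must be non-zero")
    return (pr - base) / base * 100.0


def format_delta(base: float, pr: float) -> str:
    """Render the delta with an explicit sign and two decimals."""
    delta = percent_delta(base, pr)
    if round(delta, 2) == 0:
        # Show an unchanged metric without a sign, as in the table above.
        return "0.00%"
    return f"{delta:+.2f}%"
```

For example, `format_delta(200, 210)` yields `+5.00%`, and an unchanged metric yields `0.00%`.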


I tested the whole path end to end on my fork with a deliberate 
busy loop in `compiler/src/main.d`: the bot reported +8.27% / 
+7.84% instructions consistently across reruns, while both binary 
sizes stayed flat.
Code: https://github.com/abulgit/dmd/pull/39

**What's next**

1. We are currently building both refs with DMD 2.112.0 as the 
host compiler. Dennis suggested moving to `ldc2 -O3` with PGO so 
the binary we measure matches a real release build and the 
optimizer doesn't make a harmless PR look like a regression. He 
also suggested doing this early, in case valgrind has any trouble 
with ldc2.
2. After that, we will add more realistic workloads, such as 
Phobos.
3. Then we will build the dashboard that shows the historical 
data.

That's the plan we have right now, and I'll try to post weekly 
updates here as work progresses.
Feel free to leave any feedback or suggestions!

