Walter's talk on D backend

Mon Sep 23 14:39:12 UTC 2024

On Monday, 23 September 2024 at 13:39:46 UTC, claptrap wrote:
> On Monday, 23 September 2024 at 12:31:27 UTC, user1234 wrote:
>>>
>>> What would show SSA is if you cant assign to a register 
>>> multiple times. Eg..
>>>
>>> ```
>>> %1 = i32  load  ptr @X
>>> %2 = i32  load  ptr @Y
>>> %2 = i32  add   i32 %1, i32 %2
>>> ```
>>>
>>> That would be invalid in SSA
>>
>> Yes. That's impossible. Also the reasoning that each SSA 
>> instruction is a state of a register is not totally exact 
>> (that's not yours tho). At the IR stage the notion of register 
>> does not exist *yet*. However it's true that it's often the 
>> case.
>
> Its poor choice of words on my part, my mental model is of SSA 
> IR as like assembly language with infinite "virtual registers". 
> But essentially its instructions that return a constant value.
>
> ```
> %1 = i32  load  ptr @X
> ```
>
> %1 is a constant value. But i find it hard not to think of it 
> as a register, because it's all so assembly like.
>
>
>> Otherwise, my example showed that reasoning at the level of a 
>> variable is not relevant, for DCE at least. Actually an unused 
>> load should never be generated.
>> What *can* happen is something like
>>
>> ```
>> X + 1; // %1 = load ...
>>        // %2 = add
>> ```
>>
>> then DCE will realize that %2 is never used and drop %2 and %1.
>
> Well yeah, when you convert to SSA each new definition of a 
> variable (each time you assign a new value to it) you get a new 
> virtual register, so
>
> ```
> int x = 100 ==> x0 = 100     ==> %0 = 100
> x = 10      ==> x1 = 10      ==> %1 = 10
> x = x*x     ==> x2 = x1*x1   ==> %2 = mul %1,%1
> ```
>
> So x ==> x0,x1,x2 ==> %0,%1,%2
>
> The point is it makes it explicit which definition of X, 
> reaches what use of X, even if you don't actually know that 
> %0,%1,%2 are actual definitions of X.

I see. Actually each access to x is a different load. Remember 
that the front-end has solved what is x, in each case. Here it 
turns out it's the same. (maybe actually what you're looking for 
is how a D VarExp gets handled in the "glue-layer" ?)

Assuming x is an alloca (I let you imagine if it's a non-tls 
global and used in a threaded context, that would be an other 
piece of cake, at least not optimizable as I'll present it...) so

```
int x = 100;
x = 10;
x = x*x;
```

gives more something like

```
store x, 100

store x, 10

%1 = load x    // x as LHS
%2 = load x    // x as RHS
%3 = mul %1 %2 // x * x but as rvalue

store ptr x, %3
```

what an optimizer will see with that list of SSA nodes is, in 3 
steps:

1. first store can be dropped (dead code), giving

```
store x, 10

%1 = load x    // x as LHS
%2 = load x    // x as RHS
%3 = mul %1 %2 // x * x but as rvalue

store ptr x, %3
```

2. %1 and %2 are consecutive common sub expressions, with no side 
effects, giving

```
store x, 10

%1 = load x    // x as LHS and RHS
%2 = mul %1 %1 // x * x but as rvalue
      store ptr x, %2
```

3. x is not stored between the load and the mul, ending up with

```
store ptr x, 100 // 10 * 10 with constant folding
```

> But that isn't how Walters backend works as far as I can tell 
> from his talk, since multiple elem can point to the same 
> variable. If you have an 3 "elem" that store to variable "X", 
> and one use of "X", there's no way to tell if any of those 
> stores are redundant without actually working out how they all 
> depend on each other.

Sorry, I knewn this had the potential to go off topic. It's 
really just the word "variable" that has initially triggered me. 
Let's call this "SSA node".