How do I find the arity of an Expression? (dmd hacking)

Tue Dec 1 12:35:00 PST 2009

Ellery Newcomer wrote:
> On 11/30/2009 07:59 PM, Chad J wrote:
>>
>> This is about the property expression rewrite of course.  I'd love to
>> just use the current convention in dmd and write the rewrite as a
>> non-recursive function that gets called at every point in the tree
>> whenever someExpr->semantic(sc) is called.  However, there's a snag.
>>
> 
> Personally, I've come to hate this pattern (just from reading DMD src).
> It seems like the antithesis of code reuse.
> 

Yeah, I can see that.  There seem to be a lot of duplicated patterns
and things done by convention.  That makes it a little more difficult to
figure out what is going on.  (It's done this way... but *why*?)

I just don't want to alienate Walter from my patch by defying the style.
It may be inevitable in a minor way though.

>> When doing the property rewrite, the inner-most subexpression will need
>> to generate the outermost temporaries.
>>
>> An example:
>>
>> prop1.prop2 += foo;
>>
>> AFAIK, the compiler sees this as ((prop1).prop2) += foo; or
>> ((prop1()).prop2()) += foo; after resolveProperties is called.  So prop1
>> is the innermost expression.
>>
>> It would be rewritten as this comma expression:
>>
>> (auto t1 = prop1()),
>> (auto t2 = t1.prop2()),
>> (t2 += foo),
>> (t1.prop2(t2)),
>> (prop1(t1))
> 
> This is an interesting problem. I like it. Is this rewrite condoned by
> the powers that be? It'd be fun to implement if I ever get this far in
> my semantic analyzer.
> 

No guarantees, but a lot of promise.

http://erdani.com/d/thermopylae.pdf
On page 114 of the draft, 14 of the pdf, in section 4.1.10, at the
bottom: notice how Andrei seems to be hedging on properties working
correctly.

Now I can envision that behavior with array length being hacked in for
special cases.  That would make the book example work but lack
generality since it wouldn't work for arbitrary expressions and maybe
not user-defined properties either.

Walter has already swung for explicit properties that require a syntax
addition to the language (and annotations at the same time!):
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.announce&article_id=16862

I think if he'll go for explicit properties, he'll want them to work
correctly too.  Think about it... even if the compiler knows that the
appearance of some variable in an expression is actually a property, how
is it going to figure out whether or not the setter should be called and
what it's going to pass to the setter?

I conjecture that this property expression rewrite is quite necessary
for properties to work unsurprisingly in arbitrary side-effectful
expressions.  Whether the properties are implicit or explicit is
unrelated, and affects other things instead.
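
To make the conjecture concrete, here is a rough C++ sketch (C++ only
because that is what dmd itself is written in).  Inner, Outer, naive and
rewritten are hypothetical names standing in for the prop1.prop2 += foo
example, and the sketch assumes both getters return copies, the way
value-typed properties would.  It is an illustration of the failure
mode, not of anything dmd does today.

```cpp
// Hypothetical stand-ins for the properties in `prop1.prop2 += foo`.
// Assumption: both getters return by value.
struct Inner {
    int value = 0;
    int prop2() const { return value; }        // getter
    void prop2(int v) { value = v; }           // setter
};

struct Outer {
    Inner inner;
    Inner prop1() const { return inner; }      // getter, returns a copy
    void prop1(const Inner& v) { inner = v; }  // setter
};

// Naive lowering: call the getters and apply += to the temporary.
// The sum is computed, but no setter is ever called, so `o` is unchanged.
int naive(Outer& o, int foo) {
    int t = o.prop1().prop2();
    t += foo;
    return t;
}

// The rewrite: read each getter into a temporary, modify the last
// temporary, then call the setters in reverse order.
int rewritten(Outer& o, int foo) {
    Inner t1 = o.prop1();  // auto t1 = prop1()
    int t2 = t1.prop2();   // auto t2 = t1.prop2()
    t2 += foo;             // t2 += foo
    t1.prop2(t2);          // t1.prop2(t2)
    o.prop1(t1);           // prop1(t1)
    return t2;
}
```

With the naive version the object keeps its old value after the call;
with the rewritten version the new value is stored through both
setters.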

So Walter is going to have to accept this or he will be in for some
nasty surprises when he tries to make his explicit properties work.

(Of course he could always decide that properties aren't worth it.  That
would be unfortunate.)

http://prowiki.org/wiki4d/wiki.cgi?DocComments/Property
For more info, if you haven't seen it already.

>>
>> So t1 is the outermost temporary, and it corresponds to prop1, the
>> innermost expression.
>>
>> This is problematic in dmd's traditional approach, because the semantic
>> pass on prop1 does not have access to its enclosing expression (as far
>> as I can tell).  If it did, I'd just go to the root of the tree, uproot
>> the tree, stick it in a comma expression, and just start nesting:
> 
> I'm not convinced you need access to the enclosing expression to make
> this rewrite happen. Working it out on paper, it appears to be a matter
> of a few AST copies, keeping track of your t's, and building your comma
> tree (in two directions simultaneously) as you traverse the AST via eg
> semantic.
> 
> Here's some chicken scratch that more or less illustrates what I
> envision going on (I can't attest that DMD's ASTs look exactly like
> this, but I assume they're similar):
> 
> http://personal.utulsa.edu/~ellery-newcomer/scribble.png
> 
> How much effort it would entail is another matter though. I would assume
> you could just add a field or two to struct Expression for pushing
> information up the tree. It seems like copying functionality is already
> there. Pushing 't' information across the tree would probably propagate
> to each special case and be the most annoying part. But basically, every
> time you come across a property, you generate a new symbol and pair of
> pre/post trees. Then at the top, you make a single tree copy with the
> last symbol generated.
> 

Yeah.  I think you're right.
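
Here is how I picture that scheme, as a small C++ sketch.  It works on
plain strings rather than dmd's Expression nodes, and rewriteChain is a
made-up helper, so treat it as a model of the bookkeeping only: one
fresh symbol per property, a "pre" step (the getter) and a "post" step
(the setter) for each, and the posts emitted in reverse at the top.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not dmd code.  `props` lists the property names
// innermost first (e.g. {"prop1", "prop2"}); the result is the sequence
// of steps the comma expression would contain, as text.
std::vector<std::string> rewriteChain(const std::vector<std::string>& props,
                                      const std::string& op,
                                      const std::string& rhs) {
    std::vector<std::string> pre;   // getter calls, in traversal order
    std::vector<std::string> post;  // setter calls, in traversal order
    std::string prev;               // temporary holding the enclosing value

    for (std::size_t i = 0; i < props.size(); ++i) {
        std::string t = "t" + std::to_string(i + 1);  // fresh symbol
        std::string access = prev.empty() ? props[i] : prev + "." + props[i];
        pre.push_back("auto " + t + " = " + access + "()");
        post.push_back(access + "(" + t + ")");
        prev = t;
    }

    std::vector<std::string> out = pre;
    if (!prev.empty())
        out.push_back(prev + " " + op + " " + rhs);  // the actual operation
    // Setters run in the opposite order from the getters.
    out.insert(out.end(), post.rbegin(), post.rend());
    return out;
}
```

Calling rewriteChain({"prop1", "prop2"}, "+=", "foo") yields the five
steps from my example, in the same order.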

>>
>> So yeah, if I haven't put you to sleep already, please let me know if I
>> messed up somewhere.
> 
> I do have one nitpick:
> 
> auto t1 = p1();
> 
> isn't an expression, it's a declaration. That could potentially make
> things difficult for you, particularly when your property assignment is
> deep within an expression tree.
> 

Oh my.

I just tried out a comma expression with declarations in some D code and
dmd didn't like it at all.  In hindsight I should've checked that a long
time ago.

I did this though because I saw it happening in other parts of
expression.c.  Maybe I missed some details.  Or maybe the backend will
be fine with stuff like this.  I wonder.

Yeah, if I can't put declaration expressions into other expressions,
then things may get ugly.  I'd probably have to put more code into
statement.c at least.
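
One shape this could take, sketched in C++ rather than D: hoist the
declarations out as ordinary statements and leave only assignments and
calls inside the comma expression.  Rect, Widget and addToWidth are
invented for the illustration, and it assumes the temporaries can be
default-constructed.  Whether dmd's front end can be made to do the
equivalent is exactly what I still need to find out.

```cpp
// Hypothetical value-typed properties, standing in for prop1 and prop2.
struct Rect {
    int w = 0;
    int width() const { return w; }      // getter
    void width(int v) { w = v; }         // setter
};

struct Widget {
    Rect r;
    Rect rect() const { return r; }      // getter, returns a copy
    void rect(const Rect& v) { r = v; }  // setter
};

// widget.rect.width += foo, with the temporaries declared up front so
// that the comma expression itself contains no declarations.
int addToWidth(Widget& widget, int foo) {
    Rect t1;     // hoisted declaration
    int t2 = 0;  // hoisted declaration
    return (t1 = widget.rect(),
            t2 = t1.width(),
            t2 += foo,
            t1.width(t2),
            widget.rect(t1),
            t2);
}
```

The cost is that the declarations now live in the enclosing statement,
which is why this would reach beyond expression.c.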

Eh, I'll know how things work out soon enough.