Property rewriting; I feel it's important. Is there still time?
Chad J
chadjoan at __spam.is.bad__gmail.com
Wed Mar 10 19:14:21 PST 2010
Andrei Alexandrescu wrote:
> On 03/09/2010 09:48 PM, Chad J wrote:
>> I speak of the property rewriting where an expression like
>>
>> foo.prop++;
>>
>> is rewritten as
>>
>> auto t = foo.prop();
>> t++;
>> foo.prop(t);
>
> This particular example has a number of issues. First off you need to
> rewrite expressions, not statements. Consider:
>
> auto x = foo.prop++;
>
> You'd need to assign to x the old value of foo.prop. So one correct
> rewrite is
>
> foo.prop++
>
> into
>
> {auto t = foo.prop; auto t1 = t; ++t1; foo.prop = t1; return t;}()
>
> within an rvalue context, and into:
>
> {auto t = foo.prop; ++t; foo.prop = t; return t;}()
>
> within a void context.
>
> I'm pointing out that things may not always be very simple, but
> generally it's easy to figure out the proper rewrites if attention is
> given to detail.
>

Right. This one made itself easy to notice: if you either return a value
in a void context (ex: expression statements) or fail to return one in a
non-void context (ex: conditions for if/for/while statements and the
like), then semantic analysis errors out further along.

What I end up doing is generating a bunch of comma expressions that hold
the rewritten property expression. So my rewrite for "auto x =
foo.prop++;" actually looks like this:

auto x = (auto t = foo.prop, (auto t1 = t++, (foo.prop = t, t1)));

It's illegal D code, but only because of a check (in
Expression->semantic() somewhere IIRC) that prevents declaration
expressions from appearing in arbitrary places. Once you're past that
check you can put them there and the backend knows what to do with them.

I stick t1 in there at the end to make the comma expression evaluate to
the value of t1, which is the old value of foo.prop. If it's a void
context, I leave t1 off the end, because otherwise the compiler would
complain about an expression having no side-effects.
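
For anyone who wants to poke at the semantics without a patched compiler, here is a minimal hand-written sketch of the same two rewrites in legal D. The struct Foo, the setterCalls counter and the helper names postIncValue/postIncVoid are made up for illustration; this is not code from the patch, just the lowering spelled out as ordinary functions.

```d
// Hypothetical type for illustration: prop is a getter/setter pair,
// so foo.prop is never an lvalue and foo.prop++ cannot work directly.
struct Foo
{
    int stored;
    int setterCalls;

    @property int prop() const { return stored; }
    @property void prop(int v) { stored = v; ++setterCalls; }
}

// Value context: hand-written lowering of `auto x = foo.prop++;`
int postIncValue(ref Foo foo)
{
    auto t = foo.prop;   // getter is called exactly once
    auto t1 = t++;       // t1 keeps the old value, t holds the new one
    foo.prop = t;        // setter is called exactly once
    return t1;           // the whole expression yields the old value
}

// Void context: hand-written lowering of `foo.prop++;`
void postIncVoid(ref Foo foo)
{
    auto t = foo.prop;
    t++;
    foo.prop = t;        // nothing is yielded, so no trailing t1
}

void main()
{
    Foo foo;
    foo.prop = 5;

    auto x = postIncValue(foo);
    assert(x == 5);          // old value, as postfix ++ requires
    assert(foo.stored == 6);

    postIncVoid(foo);
    assert(foo.stored == 7);
    assert(foo.setterCalls == 3); // initial assignment plus two rewrites
}
```

The only difference between the two helpers is whether the old value is kept around and yielded, which is the same distinction as including or dropping t1 at the end of the comma expression.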
>> So, Walter or Andrei or someone on the planning behind the scenes,
>> please lend me your thoughts:
>> How much time is left to make this sort of thing happen?
>> If a working patch for this showed up, would it have a reasonable chance
>> of acceptance at this point?
>
> The idea is sensible and is already in effect for the ".length" property
> of arrays.
>
>> I really want to make this happen, even if I have to pay someone to do
>> it or finish it off. It's very close but I have almost nil free time
>> for this stuff.
>>
>> Note that I have made it work and seen it in action. There'd be a patch
>> two months ago if I hadn't decided to rebel against the way DMD did
>> things*.
>
> Probably offering payment wouldn't be much of an enticement, but
> lobbying reasonable ideas here is a good way to go.
>
I figure it might give an edge of motivation, especially to some of the
talented college students around here. I would have probably done this
kind of thing in college if the opportunity had popped up. I think
Spring break is about here too.
>> ...
>
>> - Having property rewrites allows the special case for "array.length +=
>> foo;" to be removed. Property rewriting is the more general solution
>> that will work for all properties and in arbitrary expressions.
>
> Agreed. By the way, I'm a huge fan of lowering; I think they are great
> for defining semantics in a principled way without a large language
> core. In recent times Walter has increasingly relied on lowerings and
> mentioned to me that the code savings in the compiler have been
> considerable.
>
Interesting.
>> - By treating opIndex and opIndexAssign as properties then that pair
>> alone will make cases like "a[i]++;" work correctly without the need for
>> opIndexUnary overloads. Also "a[i] += foo" will work too, as well as
>> anything else you haven't thought of yet.
>
> Well operator overloading handles indexing differently, and arguably
> better than in your proposal. Ideally we'd define operators on
> properties in a manner similar to the way indexing works in the new
> operator overloading scheme. I'll talk to Walter about that.
>
>
> Andrei

I wouldn't want to have to define functions for side-effectful operators
/in addition/ to the getter and setter. The opIndexUnary/opIndexOpAssign
things have bugged me a bit because I've felt that the value returned
from opIndex should handle its own operator overloads. I wonder if we
are talking about two different things.

The extra opIndexUnary/opIndexOpAssign overloads could supersede the
behavior of getting from opIndex, mutating a temporary, and calling
opIndexAssign with the temporary. I'd still like to not /need/ to
define the extra operator overloads though.

Indexing seems to be the general case of properties: an indexed
expression can be a getter/setter pair identified by both an identifier
(the property's name: opIndex in this case) and some runtime variables
(the indices). A plain property is a getter/setter pair identified by
its name alone. This isn't much harder to deal with:

foo[i]++;
->
{auto t = foo.opIndex(i);
t++;
foo.opIndexAssign(t, i); }()

Now if the index itself has side effects, then that side-effecting
expression must be pulled out of the index so it only runs once:

foo[i++]++;
->
{auto t = foo.opIndex(i);
t++;
foo.opIndexAssign(t, i);
i++; }() // i++ is removed from the indexing expression.

I think I've managed to successfully deal with that.
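
Here is a small sketch of that indexing case as legal D, written by hand. The Table struct, its reads/writes counters and the helper postIncAt are hypothetical names for illustration only. One deliberate difference from the rewrite above: instead of moving i++ to the end, this version saves the index in a temporary up front, which has the same observable effect here and keeps the index expression evaluated exactly once.

```d
// Hypothetical container: only opIndex and opIndexAssign are defined,
// there is no opIndexUnary or opIndexOpAssign.
struct Table
{
    int[4] data;
    int reads, writes;

    int opIndex(size_t i) { ++reads; return data[i]; }

    // D passes the assigned value first, then the index.
    void opIndexAssign(int value, size_t i) { ++writes; data[i] = value; }
}

// Hand-written lowering of `tab[i++]++;` in a void context.
void postIncAt(ref Table tab, ref size_t i)
{
    auto idx = i++;             // index expression evaluated exactly once
    auto t = tab.opIndex(idx);  // getter
    t++;                        // mutate the temporary
    tab.opIndexAssign(t, idx);  // setter, using the same saved index
}

void main()
{
    Table tab;
    tab.data = [10, 20, 30, 40];
    size_t i = 1;

    postIncAt(tab, i);

    assert(i == 2);                         // i was incremented once
    assert(tab.data == [10, 21, 30, 40]);   // only element 1 changed
    assert(tab.reads == 1 && tab.writes == 1);
}
```

If the index were naively duplicated into both the opIndex and opIndexAssign calls, i would be incremented twice and the write would land on a different element than the read.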

I've also given thought to the notion of side-effects within
side-effects, and I make sure those are safely pulled out so that things
don't get executed twice or more in an unexpected manner.

And... I also handled out and ref parameters in function calls. A
property used as a ref argument is extracted from the call and replaced
with a temporary that is read from the getter before the call and written
back through the setter after it. I feel that out parameters are similar
to assignment, so a property used as an out argument will only have its
setter called.
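
A hand-written sketch of what the ref and out handling amounts to, again in legal D. Foo, doubleIt, fill and the call counters are invented for this example and say nothing about how the patch itself is structured.

```d
// Hypothetical property holder that counts accessor calls.
struct Foo
{
    int stored;
    int getterCalls, setterCalls;

    @property int prop() { ++getterCalls; return stored; }
    @property void prop(int v) { ++setterCalls; stored = v; }
}

void doubleIt(ref int n) { n *= 2; }   // reads and writes its argument
void fill(out int n) { n = 42; }       // only writes its argument

void main()
{
    Foo foo;
    foo.stored = 5;

    // Hand-written lowering of `doubleIt(foo.prop);`
    // ref argument: get into a temporary, call, then set.
    {
        auto t = foo.prop;
        doubleIt(t);
        foo.prop = t;
    }
    assert(foo.stored == 10);
    assert(foo.getterCalls == 1 && foo.setterCalls == 1);

    // Hand-written lowering of `fill(foo.prop);`
    // out argument: treated like assignment, so only the setter runs.
    {
        int t;
        fill(t);
        foo.prop = t;
    }
    assert(foo.stored == 42);
    assert(foo.getterCalls == 1 && foo.setterCalls == 2);
}
```

The counters make the difference visible: the ref case costs one getter call and one setter call, while the out case never touches the getter.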

I just need to get the blasted thing to mesh with dmd's manner of
traversing the AST ;)