Missed optimisation case - internal use of STCin

Artur Skawina via Digitalmars-d digitalmars-d at puremagic.com
Sat Apr 19 09:10:48 PDT 2014


On 04/19/14 16:21, Iain Buclaw via Digitalmars-d wrote:
> On 19 April 2014 14:33, Artur Skawina via Digitalmars-d
> <digitalmars-d at puremagic.com> wrote:
>> On 04/19/14 14:37, Iain Buclaw via Digitalmars-d wrote:
>>> On 19 April 2014 13:02, Artur Skawina via Digitalmars-d
>>> <digitalmars-d at puremagic.com> wrote:
>>>> On 04/19/14 13:03, Iain Buclaw via Digitalmars-d wrote:
>>>>> On Saturday, 19 April 2014 at 10:49:22 UTC, Iain Buclaw wrote:
>>>>>> I'm currently testing out a GCC optimisation that allows you to set call argument flags.  The current assumptions being:
>>>>>>
>>>>>> in parameters  =>  Assume no escape, no clobber (read-only).
>>>>>> ref parameters, classes and pointers  =>  Assume worst case.
>>>>>> default  =>  Assume no escape.
>>>>>
>>>>> That should read:
>>>>>
>>>>> ref parameters, inout parameters, classes and pointers.
>>>>>
>>>>> The default of assuming no escape is an experiment - I may limit this to only scalar types, and parameters marked as 'scope'  (So long as no one plans on deprecating it soon :)
>>>>
>>>> What does "assume no escape" actually mean?
>>>> [The above list doesn't really make sense. W/o context, it's
>>>> hard to even tell why, hence the question.]
>>>>
>>>
>>> Actually, I might change the default to assume worst case.  I've just
>>> tried this out, which is still valid.
>>>
>>> class C {
>>>    int * p;
>>>    this(int x) {
>>>      p = &x; // escapes the address of the parameter.
>>>    }
>>> }
>>
>> This might be currently accepted, but it is clearly invalid
>> (escapes local; the only way to make it work safely would
>> be to silently copy 'x' to the GC-managed heap, which would be
>> way too costly).
>>
>>
>>    A f(A a) { g(&a); return a; } // likewise with ref instead of pointer.
>>
>> This is OK (even if ideally 'g' should be forbidden from escaping 'a').
>>
>> Similarly:
>>
>>    A f(A a) {
>>       auto o = register(&a); // can modify 'a'
>>       o.blah();              // ditto
>>       doneWith(o);           // ditto
>>       return a;
>>    }
>>
>>
>> What I was wondering was things like whether that "assume no escape"
>> property was transitive; if /locally/ escaping was disallowed, and
>> to what extent. What does "assume no escape" mean at all? In your
>> examples you're mentioning refs together with pointers, that would
>> only make sense if no-escape were transitive -- but then treating all
>> args as no-escape would be very wrong.
>>
>>
>>> Worse, scope doesn't error on the general case either.
>>>
>>> class D {
>>>    int * p;
>>>    this(scope int x) {
>>>      p = &x; // escapes the address of the scope parameter.
>>>    }
>>> }
>>
>> D's "scope" isn't enforced in any way right now, which means
>> that code could exist that is invalid, but currently works. It
>> would break silently(!) when compiled with a decent compiler,
>> which still doesn't enforce scope.
>>
> 
> People should get bug fixing soon then.  =)

Until some kind of diagnostics appear, most of those bugs won't
even be found. It's too easy to write "auto f (in A a)" and then
forget about the implicit 'scope' when modifying the function body.
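
A hypothetical sketch of how that happens (the names 'A', 'lastSeen' and
the body of 'f' are made up for illustration, and a pointer parameter is
used to keep it short; whether 'in' implies 'scope' depends on the compiler
version and flags). Nothing here is diagnosed in non-@safe code:

```d
struct A { int v; }

// Hypothetical module-level cache, added long after f() was first written.
const(A)* lastSeen;

// 'in' was meant as "read-only and does not escape" ...
int f(in A* a)
{
    lastSeen = a;   // ... but this keeps the pointer alive past the call.
    return a.v;
}
```

An optimizer that trusts the no-escape promise is free to assume that
nothing can reach the argument through 'lastSeen' once 'f' has returned.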


>>> Do these examples give you a good example?
>>
>> I'm worried about a) invalid assumptions making it into GDC;
>> b) certain valid assumptions making it into GDC. The latter because
>> it could mean that code that's incorrect, but still accepted by
>> the other compilers could silently break when compiled with GDC.
>>
> 
> Invalid assumptions rarely make it into GDC.  The testsuite is a good

AFAICT what you're proposing *is* invalid. I can't be sure because
it's not clear what that "no-escape" property means; that's why I
asked about it twice already...
Clearly, escaping objects reachable indirectly via function arguments
is perfectly fine (eg string slicing), yet you wanted to treat args as
no-escape by default.
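
A minimal sketch of the slicing case ('firstWord' is a made-up helper, not
something from Phobos): the returned slice refers to the very memory that
was passed in, which is a perfectly legitimate escape via the return value.

```d
// Returns the part of 's' before the first space (or all of 's').
// No copy is made: the result is a slice of the caller's data.
string firstWord(string s)
{
    foreach (i, char c; s)
        if (c == ' ')
            return s[0 .. i];
    return s;
}

unittest
{
    string line = "hello world";
    auto w = firstWord(line);
    assert(w == "hello");
    assert(w.ptr is line.ptr); // same memory as the argument
}
```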

Also, treating /some/ types specially wouldn't be ideal:

   struct A { int a; /* no pointers or classes */ }
   struct B { int* b; /*...*/ }

   f(A); // should be treated similarly to 'f(int)'
   f(B); // should be treated similarly to 'f(int*)'

Yes, not doing it is "just" a missed optimization, but in practice
it means that wrapping types becomes even more expensive in D (it's
already almost prohibitively so - eg returning small one-element
structs from functions needlessly happens by ref). 
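
The distinction that matters is whether a type contains indirections, not
whether it is a struct. As a library-level illustration only (this uses
std.traits.hasIndirections from Phobos and is not a claim about how GDC
would classify arguments internally):

```d
import std.traits : hasIndirections;

struct A { int a; }   // no pointers or classes
struct B { int* b; }  // wraps a pointer

// A behaves like a plain int, B like an int*.
static assert(!hasIndirections!int);
static assert(!hasIndirections!A);
static assert( hasIndirections!(int*));
static assert( hasIndirections!B);
```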

> bench for this, as well as several projects (now I've got dub set-up)
> to test it in the wild.

These problems will result in invalid optimizations, so can be hard
to trigger and may come and go randomly.


> Saying that, we have had to revert some optimisation cases as D's
> schizophrenic nature of enforcing attributes and behaviours is
> becoming increasingly dismal.
> 
> eg:
> - nothrow has *no* guarantee, period, because it still allows
> unrecoverable errors being thrown, and allows people to catch said
> unrecoverable errors.
> - pure is a tough nut to crack also.  The theory should allow you to
> be able to cache return values, but in practice...

D's "pure" doesn't have much in common with the "normal" pure concept;
exposing gcc's pure via an attribute and completely ignoring D's
version is probably the only practical solution, anything else would
be too costly, result in too small gains, and be too hard to get right. 
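
A small sketch of the mismatch ('bump' is a made-up function): D accepts
the following as 'pure', yet two calls with the same argument return
different values, so caching the result the way gcc's pure attribute
permits would be wrong.

```d
// Accepted as 'pure' in D: it touches no global state, only what is
// reachable through its parameter.
pure int bump(int* p)
{
    return ++*p;
}

unittest
{
    int x = 0;
    int a = bump(&x);
    int b = bump(&x); // same argument, different result
    assert(a == 1 && b == 2 && x == 2);
}
```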

> - The nature of debug statements breaks guarantees of both nothrow and
> pure, possibly many more.
> - Defining reliable strict aliasing rules, it turns out, is not that
> simple (this is something Walter has mentioned D should have good
> guarantees for, ie: D arrays).

Short term, disabling strict aliasing is the only option. I was scared
of the impact it had before you even started to add support for it a
few years ago (the codegen was already different w/ -fno-strict-aliasing
before, which meant that I immediately had to disable it everywhere...)
There probably does not exist a single D program that respects strict
aliasing rules, other than by chance. The perf gains are minimal globally,
but the potential for silent data corruption is huge.
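
The kind of code in question, as a hedged sketch (both function names are
made up): the pointer cast is a very common D idiom, while going through a
union is the form of type punning that GCC documents as supported even
with strict aliasing enabled.

```d
// Common idiom: reinterpret the bits of a float through a pointer cast.
// Under C-like strict aliasing rules a uint* may not alias a float, so an
// optimizer relying on those rules is allowed to get this wrong.
uint bitsViaCast(float f)
{
    return *cast(uint*)&f;
}

// The same conversion written with a union instead of a pointer cast.
uint bitsViaUnion(float f)
{
    union U { float f; uint u; }
    U v;
    v.f = f;
    return v.u;
}
```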

> I'm just investigating all avenues, as I usually do.  There is no
> reason why 'in' shouldn't have more powerful guarantees IMO, and what

Of course "scope" (which is part of "in") should be taken advantage of.
I'm concerned about args not marked as "scope" being treated wrongly.

[The scope-related bugs in D programs are a /language/ or frontend problem,
 not a reason to avoid doing the right thing in GDC. It's just that in
 practice what can happen is that GDC will be seen as miscompiling
 "working" D programs...]

artur