No CTFE of function
Cecil Ward via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Sun Aug 27 10:36:54 PDT 2017
On Sunday, 27 August 2017 at 00:20:47 UTC, ag0aep6g wrote:
> On 08/27/2017 01:53 AM, Cecil Ward wrote:
>> On Saturday, 26 August 2017 at 23:49:30 UTC, Cecil Ward wrote:
> [...]
>>> I think I understand, but I'm not sure. I should have
>>> explained properly. I suspect what I should have said was
>>> that I was expecting an _optimisation_ and I didn't see it. I
>>> thought that a specific instance of a call to my pure
>>> function that has all compile-time-known arguments would just
>>> produce generated code that returned an explicit constant
>>> that is worked out by CTFE calculation, replacing the actual
>>> code for the general function entirely. So for example
>>>
>>> auto foo() { return bar( 2, 3 ); }
>>>
>>> (where bar is strongly pure and completely CTFE-able) should
>>> have been replaced by generated x64 code looking exactly
>>> literally like
>>> auto foo() { return 5; }
>>> I would expect that the returned result would be a fixed-length
>>> literal array of 32-bit numbers in my case (no dynamic arrays
>>> anywhere, these I believe potentially involve RTL calls and
>>> the allocator internally).
>>
>> I was expecting this optimisation to 'return literal constant
>> only' because I have seen it before in other cases with GDC.
>> Obviously generating a call that involves running the
>> algorithm at runtime is a performance disaster when it
>> certainly could have all been thrown away in the particular
>> case in point and been replaced by a return of a precomputed
>> value with zero runtime cost. So this is actually an issue
>> with specific compilers, but I was wondering if I have missed
>> anything about any D general rules that make CTFE evaluation
>> practically impossible?
>
> I don't know what might prevent the optimization.
>
> You can force (actual) CTFE with an enum or static variable.
> Then you don't have to rely on the optimizer. And the compiler
> will reject the code if you try something that can't be done at
> compile time.
>
> Example:
> ----
> auto foo() { enum r = bar(2, 3); return r; }
> ----
>
> Please don't use the term "CTFE" for the optimization. The two
> are related, of course. The optimizer may literally evaluate
> functions at compile time. But I think we better reserve the
> acronym "CTFE" for the guaranteed/forced kind of
> precomputation, to avoid confusion.
I had already tried static, and it failed. Thanks to your tip, I
tried enum next. That failed as well: it wouldn't compile with GDC.
I tried LDC, which did the right thing in all cases. It optimised
correctly in every use case: the generated code computes nothing and
just returns the literal compile-time-calculated result array by
writing a load of immediate values straight to the destination.
Hurrah for LDC.
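
A minimal sketch of the variants discussed so far, assuming a hypothetical bar that simply adds its arguments (the real function is not shown in this thread). The enum and static immutable forms require a compile-time value, so the compiler must run CTFE or reject the code. The plain call is only folded if the optimiser chooses to do so.

```d
// Hypothetical stand-in for the strongly pure, CTFE-able function.
pure nothrow @nogc @safe
int bar(int a, int b) { return a + b; }

// Forced CTFE: an enum initialiser must be a compile-time constant.
int fooEnum() { enum r = bar(2, 3); return r; }

// Forced CTFE: a static initialiser must also be known at compile time.
int fooStatic() { static immutable r = bar(2, 3); return r; }

// No CTFE requested: whether this becomes "return 5" is up to the optimiser.
int fooPlain() { return bar(2, 3); }

// Compile-time check that bar really is usable in CTFE.
static assert(bar(2, 3) == 5);
```

All three functions return the same value; they differ only in whether the precomputation is guaranteed by the language or left to the backend.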
Then I tried DMD via the web-based edit/compile feature on the
dlang.org website. It refused to compile in the enum case and
actually told me why, in a very, very cryptic way. I worked out that
it has a problem internally (this is now an assignment into an enum,
so I have permission to use the term CTFE now): it refuses to do CTFE
if any variable is declared using an = void initialiser. I had used
= void to stop the wasteful huge pre-fill with zeros, which could
take half an hour on a large object with slow memory and, for all I
know, play havoc with the cache. Simply deleting the = void fixed
the problem with DMD.
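
A small sketch of the shape of that fix, under stated assumptions: result_t, pure_compute's body and foo are hypothetical stand-ins, since the real code is not in this thread. The local is default-initialised instead of using = void, which is the change reported above as making DMD accept the enum. Other compiler versions may behave differently.

```d
// Hypothetical fixed-length result type (an assumption for illustration).
alias result_t = uint[4];

pure nothrow @nogc @safe
result_t pure_compute(in uint x)
{
    // Default-initialised. The post reports that writing
    // `result_t r = void;` here made DMD refuse to run CTFE.
    result_t r;
    foreach (i, ref e; r)
        e = x * cast(uint) (i + 1);
    return r;
}

result_t foo()
{
    // The enum forces compile-time evaluation of pure_compute.
    enum result_t r = pure_compute(3);
    return r;
}

// Fails to compile if pure_compute cannot run at compile time.
static assert(pure_compute(3) == [3u, 6, 9, 12]);
```

The cost of the default initialisation is only paid when pure_compute is actually called at runtime; a result forced through the enum is computed by the compiler.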
So that's it. There are unknown random internal factors that
prevent CTFE or CTFE-type optimisation.
I had wondered if pointers might present a problem. The function
in question was originally specced something like

pure nothrow @nogc @safe
void pure_compute( result_t * p_result, in input_t x )

and, just as a test, I tried changing it to

result_t pure_compute( in input_t x )

instead. I don't think it makes any difference, though. I
discovered the DMD = void thing at that point, so this was not
checked out properly.
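
For comparison, a sketch of both signatures side by side. Everything here is an assumption for illustration: the aliases, the function bodies and the wrapper name pure_compute_value are hypothetical. A pure function that takes a mutable pointer is only weakly pure in D, which may matter to an optimiser, whereas the by-value form can be strongly pure. For forced CTFE, either form can be reached through a value-returning wrapper.

```d
// Hypothetical stand-ins for the types named in the post.
alias input_t = uint;
alias result_t = uint[4];

// Out-parameter form: writes through a mutable pointer (weakly pure).
pure nothrow @nogc @safe
void pure_compute(result_t* p_result, in input_t x)
{
    foreach (i; 0 .. (*p_result).length)
        (*p_result)[i] = x + cast(uint) i;
}

// By-value form, written here as a wrapper around the pointer form.
// @trusted because taking the address of a local is not allowed in @safe code.
pure nothrow @nogc @trusted
result_t pure_compute_value(in input_t x)
{
    result_t r;
    pure_compute(&r, x);
    return r;
}

// Forced CTFE through the by-value form.
enum result_t table = pure_compute_value(10);
static assert(table == [10u, 11, 12, 13]);
```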
Your enum tip was very helpful.
PS
GDC errors: another thing that has wasted a load of time is that
GDC signals errors on lines containing a function call that is
fine, when the only problem is in the body of the function that is
_being_ called. Fixing the function makes the phantom error at the
call-site go away. This nasty behaviour has you looking for errors
at and before the call-site, or thinking you have the call
arguments or their types wrong.
[Compiler Explorer problem: I am perhaps blaming GDC unfairly,
because I have only ever used it through the telescope that is
d.godbolt.org, and I am assuming that it reports errors on the
correct source lines. It doesn't show the error message text
though, which is a nightmare, but that is obviously nothing to do
with the compiler.]