Self Optimizing Code

Mark J Twain via Digitalmars-d digitalmars-d at puremagic.com
Wed Aug 3 20:08:57 PDT 2016


On Thursday, 4 August 2016 at 01:54:57 UTC, Chris Wright wrote:
> On Wed, 03 Aug 2016 21:09:46 +0000, ikod wrote:
>> This is not just profiling, but "Profile-Guided Optimization 
>> (PGO)"
>
> The person is talking about algorithms with tuning parameters 
> (like array growth rate when appending) adjusting those 
> parameters based on observed characteristics of the program. 
> For instance, if it's observed that arrays of a given type tend 
> to grow to ~1800 entries or stay under 100 entries, the runtime 
> might automatically extend your array from length 128 to 2048 
> directly instead of growing to 256, then 512, then 1024, and 
> finally 2048.
>
> A compiler would not be allowed to make these changes 
> automatically.
>
> There are significant downsides. You need a lot of bookkeeping 
> to determine what values you need. The values are not preserved 
> across runs. You need to implement it separately for each 
> algorithm.
>
> You generally only bother when the potential performance gains 
> are great as an absolute measure and relative to the amount of 
> bookkeeping you would have to do. For instance, with array 
> appending, it's a terribly common operation, so you can only 
> use this technique productively if gathering the data is dirt 
> cheap.

Close. I am only talking about adding the features to do such 
things, not having the compiler involved in the analysis. We can 
already do this sort of thing, but it requires more boilerplate 
code, and the benefit may be negligible.

What it does for us is allow us to change hard-coded values into 
special variables that can then be manipulated (if desired, or 
left alone) while the program is executing. All the compiler does 
is take these special variables, store them in chunks (tuples), 
and possibly record some discriminating info about them, such as 
the file and function they were used in. Then it is up to the 
optimizing algorithm to decide how to use them. If it does 
nothing, the only performance hit is that variables were used 
instead of hard-coded literals.
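To make that concrete, here is a minimal sketch of how such a 
special variable might be emulated at the library level in D. 
The Tunable struct, the registry, and the tunable() helper are 
all made-up names standing in for the proposed compiler support:

import std.stdio : writeln;

/// Hypothetical "special variable": the current value plus the
/// discriminating info (file and function) stored alongside it.
struct Tunable
{
    long value;    // starts at the old hard-coded default
    string file;
    string func;
}

/// Global registry of all tunables, i.e. the chunks the compiler
/// would emit; an optimizer reads and perturbs these at runtime.
Tunable*[] registry;

/// Stands in for the proposed syntax: tunable(10) instead of 10.
long* tunable(long defaultValue,
              string file = __FILE__,
              string func = __FUNCTION__)
{
    auto t = new Tunable(defaultValue, file, func);
    registry ~= t;
    return &t.value;
}

void main()
{
    // A growth factor that used to be the literal 2; the optimizer
    // may change *growth while the program runs, or leave it alone.
    long* growth = tunable(2);
    writeln(*growth, " (from ", registry[0].file, ")");
}

The per-use cost is one pointer dereference instead of a literal; 
everything else is bookkeeping the optimizer is free to ignore.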

A bit more info may be required for the optimizer to make good 
choices, though, since the data collected could be extremely 
large (thousands of tuples). This creates a very large parameter 
space, and knowing which parameter changes created which 
performance changes is crucial (a blind Monte Carlo search of 
the parameter space could work, but would produce erratic 
behavior).
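One way to keep that attribution tractable, under the same 
assumptions as the sketch above, is to perturb one tunable at a 
time and log the measured metric against it, so a performance 
change can always be traced to the parameter that caused it:

/// Hypothetical attribution log: one entry per trial.
struct Observation
{
    size_t paramIndex;  // index into the registry above
    long   oldValue;
    long   newValue;
    double metric;      // e.g. mean latency measured under newValue
}

Observation[] history;  // the optimizer appends one entry per trial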

The optimizations would persist between program executions simply 
by writing out all the current parameter values; otherwise it 
would be difficult for the program to ever actually optimize 
itself. Neural networks could be used to find better settings. 
It could be extended to include more dynamic characteristics of 
a program, like what a JIT does, but I think this adds far more 
complexity, would require much more compiler support, and would 
probably just end up as a JIT-like environment.
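Persistence could be as simple as the following sketch, again 
assuming the hypothetical Tunable registry from above (the file 
name and the file:func key format are made up):

import std.array : split;
import std.conv : to;
import std.file : exists;
import std.stdio : File;

enum paramFile = "tunables.txt";   // made-up file name

/// Write every tunable's current value out at shutdown.
void saveParams()
{
    auto f = File(paramFile, "w");
    foreach (t; registry)
        f.writefln("%s:%s=%s", t.file, t.func, t.value);
}

/// Restore saved values at startup, so tuning carries over.
void loadParams()
{
    if (!exists(paramFile))
        return;                    // first run: keep the defaults
    foreach (line; File(paramFile).byLine)
    {
        auto parts = line.split('=');
        foreach (t; registry)
            if (t.file ~ ":" ~ t.func == parts[0])
                t.value = parts[1].to!long;
    }
}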

Another simple example. Suppose one has a sleep(10) in a thread. 
The 10 was "arbitrarily" chosen. Instead, if we could write 
sleep($10), the 10 becomes a parameter. The compiler emits it to 
a file along with its address (it is a global in the program's 
address space). The optimizer then reads the file and attempts 
to modify this value with a slight perturbation, say to 11. It 
checks the performance impact (over many uses, to remove 
statistical deviations), and if the program is more performant, 
it uses that value as the new default. It can then try 12 and 
repeat the process. While the algorithm is not advanced, it 
produces a better result than hard coding the value to 10, which 
then cannot change for the life of the program. To keep the 
program stable, such changes would have to occur very slowly. 
Better algorithms, and better profiling tools (which, say, could 
do wider sampling and better performance analysis), could home 
in on good values from the start. Then the program itself 
adjusts to the individual use case by analysis.
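The loop just described might look something like this minimal 
sketch: a one-parameter hill climb, where measure() stands in 
for whatever averaged metric the profiler provides (all names 
here are assumptions, not an existing API):

/// Minimal hill climb over a single tunable. measure should
/// return an averaged metric (lower is better), collected over
/// many uses to remove statistical deviations.
void perturb(long* param, double delegate() measure)
{
    double best = measure();      // baseline at the current value
    foreach (_; 0 .. 10)          // small, bounded steps: change slowly
    {
        *param += 1;              // try 11, then 12, ...
        const score = measure();
        if (score < best)
            best = score;         // keep the better value as the default
        else
        {
            *param -= 1;          // revert the perturbation that hurt
            break;
        }
    }
}

void main()
{
    import std.stdio : writeln;

    long sleepMs = 10;            // the former hard-coded literal
    // Fake metric for the sketch: pretend 13 is the sweet spot.
    double fakeMetric() { return (sleepMs - 13) ^^ 2; }

    perturb(&sleepMs, &fakeMetric);
    writeln("tuned value: ", sleepMs);  // climbs 10 -> 13, then stops
}

The step size of 1 and the cap of 10 trials are the "very slowly" 
part; a real optimizer would also re-baseline periodically as the 
workload drifts.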


