Self Optimizing Code
Mark J Twain via Digitalmars-d
digitalmars-d at puremagic.com
Thu Aug 4 11:55:55 PDT 2016
On Thursday, 4 August 2016 at 11:35:41 UTC, crimaniak wrote:
> On Tuesday, 2 August 2016 at 22:06:38 UTC, Mark "J" Twain wrote:
>> Instead, a better solution would be to use variables:
>>
>> if (n*length > m*capacity) expand(l*length)
>
> Some time ago I played with a self-optimizing cache layer.
> Problem: the time to obtain cache items is unknown and
> server-dependent. For example, the network can be involved;
> sometimes it is localhost, sometimes a server on another
> continent. The time to cache an item is unknown too. For
> example, if the memcached extension is not installed on the
> server, the fallback to the disk backend slows the cache down
> very much. So we don't know whether caching is worth it at all.
> Solution: the cache measures its own performance and acts
> accordingly.
>
I did not think of this in terms of networking. I suppose it can
be used there too, but, as you mention, latency could be a
factor.
> So I did the following things:
> 1. A functional interface instead of a save/load-type
> interface: cacheLevel.get(functor_to_get_item) allows measuring
> the time it takes to obtain an item.
> 2. All measuring/controlling logic is implemented in a separate
> class behind an AbstractStatist interface; the CacheLevel class
> just has hooks for it. So by changing the AbstractStatist
> implementation I can switch between measuring statistics and
> using them, or work as a usual cache (EmptyStatist with all
> hooks empty). I made an implementation that skips all items not
> worth caching (calcTime/calcHits*totalHits - totalTime < 0),
> but it is possible to do smarter things (like caching only the
> most efficient items that fit within the cache size).
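If I follow, the shape of it would be something like this. A
minimal D sketch; AbstractStatist, CacheLevel and the cost test
are from your description, everything else is my guess at it:

import std.datetime.stopwatch : StopWatch, AutoStart;

interface AbstractStatist
{
    // Hooks the cache calls; EmptyStatist would leave all of
    // these empty.
    void recordFetch(string key, long usecs);
    void recordHit(string key);
    bool worthCaching(string key);
}

class CacheLevel(V)
{
    private V[string] store;
    private AbstractStatist stats;

    this(AbstractStatist stats) { this.stats = stats; }

    // Functional interface: the cache itself times the functor,
    // so item-obtaining time is measured where it is known.
    V get(string key, V delegate() fetch)
    {
        if (auto p = key in store)
        {
            stats.recordHit(key);
            return *p;
        }
        auto sw = StopWatch(AutoStart.yes);
        auto value = fetch();
        sw.stop();
        stats.recordFetch(key, sw.peek.total!"usecs");
        // Skip items judged not worth caching, e.g. when
        // calcTime/calcHits*totalHits - totalTime < 0.
        if (stats.worthCaching(key))
            store[key] = value;
        return value;
    }
}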
If it is more complex than what I described, it would have to be
thought out a great deal. My goal was simply to have the program
expose optimization points (variables) and then let an optimizer
change those to find better settings. The program itself would
be virtually unmodified: no code to interact with the
optimization process except using variables instead of constants
(which is minimal and necessary).
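To make that concrete, here is roughly all the author would
write. A minimal D sketch, where Tunable and the declared ranges
are hypothetical names of mine, not an existing library:

struct Tunable
{
    double value;          // current setting; the optimizer mutates this
    double min, max, step; // range the author declares as safe to try
}

// The only change to the program: the constants n, m, l from the
// snippet above become named variables with declared ranges.
Tunable n = Tunable(2.0, 1.0, 4.0, 0.25);
Tunable m = Tunable(1.0, 1.0, 4.0, 0.25);
Tunable l = Tunable(2.0, 1.25, 4.0, 0.25);

bool shouldExpand(size_t length, size_t capacity)
{
    // The same test as before, now an optimization point.
    return n.value * length > m.value * capacity;
}

size_t expandTo(size_t length)
{
    return cast(size_t)(l.value * length);
}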
Exposing an interface for the program itself to guide the
optimization process seems like a lot more work. But, of course,
it is ultimately better, as it allows more information to flow
into the optimization process. That design is beyond what I'm
willing to attempt (done right, it could take months or years),
while my method could take just a few hours to code up, and it
is rather general, although a bit dumb (fire and forget and hope
for the best).
> From my experience I can draw some conclusions.
> 1. Measurement mode and control mode need to be separated. You
> can't gather accurate statistics while changing the system's
> behavior according to the current state of those statistics.
> 2. Statistics differ between applications, but for a specific
> application under specific conditions they can, in most cases,
> be approximated as constant.
>
Yes, it is tricky to make the algorithm stable. This is why I
think a simple optimizer would need to work over long periods
(months of program use). Because there are so many aberrations
(other programs, user behavior, etc.), these can only be
statistically removed by repeated runs. Essentially, "low-pass
the data to remove all the spikes", then compare the average
result with the previous one.
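For the low-pass step, something as crude as an exponential
moving average over per-run measurements would do. A sketch,
where the smoothing constant is just a guess:

import std.math : isNaN;

struct LowPass
{
    double state = double.nan;
    double alpha = 0.05; // small alpha = long memory, i.e. months of runs

    // Feed one noisy measurement, get the smoothed cost back.
    double put(double sample)
    {
        state = state.isNaN ? sample : alpha * sample + (1 - alpha) * state;
        return state;
    }
}

A candidate setting would then be kept only if its smoothed cost
beats the smoothed cost of the previous setting.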
> So for an array allocation strategy, I think a more realistic
> scenario is the following:
> 1. The application is compiled in some 'array_debug' mode; a
> statist trait added to the array collects usage statistics and
> writes optimal constants at application exit.
> 2. The programmer configures the array allocator in the
> application according to these constants.
> 3. The application is built in release mode with the optimal
> allocation strategy and without any statist overhead, and it
> runs fast. Users are happy.
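Something like this, I take it; a rough D sketch of the
'array_debug' idea, with all the names being hypothetical:

version (array_debug)
{
    import std.stdio : File;

    struct ArrayStats
    {
        size_t expansions;  // how often the array grew
        size_t wastedBytes; // capacity allocated but never used
    }

    __gshared ArrayStats arrayStats;

    // Dump the collected numbers at application exit; the
    // programmer feeds them back into the release build.
    shared static ~this()
    {
        auto f = File("array_constants.txt", "w");
        f.writefln("expansions = %s", arrayStats.expansions);
        f.writefln("wasted_bytes = %s", arrayStats.wastedBytes);
    }
}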
This is more of a static, profiling-type optimization. I am
talking about something a bit different. Both methods could be
used together for a better result, but mine aims at simplicity.
We generally set constants blindly for things that affect
performance. Let's simply turn those constants into variables
and let a global blind optimizer try to find better values than
the ones we set "blindly". There is no guarantee that it would
find a better result, and it may even introduce program
instability. But all this can be roughly measured by CPU and
memory usage, and given enough parameters there are probably at
least several optimal points the optimizer could find.
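The optimizer itself could be as dumb as a random walk over the
declared ranges, keeping whatever measures better. A sketch,
reusing the hypothetical Tunable from above:

import std.random : uniform;

// One blind optimization step for one tunable: if the smoothed
// cost (CPU time, memory, ...) improved since the last change,
// keep the current value; otherwise jump to another point on
// the declared grid and hope for the best.
void step(ref Tunable t, ref double bestCost, double measuredCost)
{
    if (measuredCost < bestCost)
    {
        bestCost = measuredCost;
        return;
    }
    immutable steps = cast(int)((t.max - t.min) / t.step);
    t.value = t.min + uniform(0, steps + 1) * t.step;
}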
Ultimately, for just a little work (setting the variables and
specifying their ranges and steps, say), we could have most
programs being written, in D for now at least, optimize
themselves to some degree while the user runs them, after they
have shipped. This, I believe, is unheard of. It represents the
next level in program optimization.
Imagine one day a program that optimizes itself depending on the
user's hardware, the user's habits, etc. Well, this method
attempts to get that ball rolling and does all those things in a
general way (albeit an ignorant one, but maybe just as
effective). It's hard to know how effective it is until it is
done, but we do know it can't be any worse than what we already
have.