Easy & huge GC optimizations
John Colvin via Digitalmars-d
digitalmars-d at puremagic.com
Fri May 23 08:41:38 PDT 2014
On Friday, 23 May 2014 at 13:43:53 UTC, Chris wrote:
> On Friday, 23 May 2014 at 06:17:43 UTC, Rainer Schuetze wrote:
>>
>>
>> On 22.05.2014 21:04, Etienne wrote:
>>> On 2014-05-22 2:12 PM, Rainer Schuetze wrote:
>>>>
>>>> "NO_INTERIOR" is currently only used for the hash array used by
>>>> associative arrays. It is a bit dangerous to use, as any pointer,
>>>> slice or register still operating on the array is ignored, so
>>>> collecting it might corrupt your memory.
>>>
>>> That's quite a relief, I was afraid of having to do it ;)
>>>
>>> I'm currently exploring the possibility of sampling the pointers
>>> during marking to check if they're gone, and using Bayesian
>>> probabilities to decide whether or not to skip the pool.
>>>
>>> I explained it all here:
>>> https://github.com/D-Programming-Language/druntime/pull/797#issuecomment-43896016
>>>
>>>
>>> -- paste --
>>> Basically, when marking, you take 1 in X of the references and
>>> send them to a specific array that represents the pool they refer
>>> to. Then, the next time you're going to collect, you test them
>>> individually, and if they're mostly still there, you skip
>>> marking/freeing for that particular pool during collection. You
>>> can force collection on certain pools every 1 in X collections to
>>> even out the average lifetime of the references.
>>>
>>> You're going to want a lower certainty of failure for big
>>> allocations, but basically you're using probabilities to avoid
>>> pushing a lot of useless load on the processor, especially when
>>> you're in a part of an application that's just allocating a lot
>>> (sampling will determine that the software is not in a state of
>>> data removal).
>>>
>>> http://en.wikipedia.org/wiki/Bayes_factor
>>>
>>> -- end paste --
>>>
>>> The Bayes factor is merely there to choose the appropriate model
>>> that fits the program. Bayesian inference would take care of
>>> deciding if a pool should end up being marked. In other words,
>>> machine learning.
>>>
>>> Would you think it'd be a good optimization opportunity?
>>
>> Hmm, I guess I don't get the idea. You cannot skip a pool based on
>> some statistics; you might have references in there to anything. As
>> a result you cannot collect anything.
>
> I'm not a fan of machine learning, especially not in cases you
> _can_ control, like memory allocation / deallocation. Guessing
> is not a good strategy, if you can have control over something.
> Machine learning is only good for vast and unpredictable data
> (voice recognition for example). Then it makes sense to apply
> probability. But if you can actually control what you are
> doing, why would you want to rely on a stupid and blind machine
> that decides things for you based on a probability of n%?
> Things can go wrong and you don't even know why. Mind you, we
> should rule the machines, not the other way around.
Bear in mind here that most code goes through a whole bunch of
machine-learning-like algorithms in the CPU itself (branch prediction,
for example, is statistical guessing). Like it or not, it has proved
extremely successful.
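For what it's worth, the 1-in-X sampling heuristic quoted above can be
sketched roughly as follows. This is only a minimal illustration of the
idea, not druntime code; the class, the `is_live` predicate, and all
thresholds are hypothetical:

```python
class PoolSampler:
    """Sketch of the quoted proposal: sample 1 in X references seen
    while marking a pool; before the next collection, re-test the
    samples, and if most are still live, skip marking/freeing that
    pool. Every `force_every` collections the pool is collected
    regardless, so long-lived garbage is still reclaimed eventually."""

    def __init__(self, sample_rate=16, survival_threshold=0.9,
                 force_every=8):
        self.sample_rate = sample_rate            # the "1 in X"
        self.survival_threshold = survival_threshold
        self.force_every = force_every
        self.samples = []                         # sampled references
        self.seen = 0
        self.collections = 0

    def maybe_sample(self, ref):
        """Called for each reference encountered while marking."""
        self.seen += 1
        if self.seen % self.sample_rate == 0:
            self.samples.append(ref)

    def should_skip(self, is_live):
        """Decide whether to skip this pool at the next collection.
        `is_live(ref)` is a hypothetical predicate that reports
        whether a sampled reference still points at live data."""
        self.collections += 1
        # Periodically force a real collection, and never skip
        # when we have no evidence to go on.
        if self.collections % self.force_every == 0 or not self.samples:
            return False
        alive = sum(1 for r in self.samples if is_live(r))
        return alive / len(self.samples) >= self.survival_threshold
```

Rainer's objection still applies to any such scheme: a skipped pool may
hold references into other pools, so "skipping" can at most mean not
freeing within that pool, not ignoring it as a source of roots.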