A safer File.readln

Shachar Shemesh via Digitalmars-d digitalmars-d at puremagic.com
Mon Jan 23 02:44:50 PST 2017


On 23/01/17 11:15, Markus Laker wrote:

> A 2GiB disk file caused tinycat.d to use over 4GiB of memory.
>

When extending arrays, a common approach is to double the size every 
time you run out of space. This guarantees an amortized O(1) cost per append.

Unfortunately, this also guarantees that we will never have enough space 
freed by previous copies to reuse existing memory:

100 byte array

increase

100 free
200 array

increase

300 free
400 array

etc. The array will always be bigger than the total amount of space we 
freed.
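
To make that concrete, here is a minimal D sketch (my own back-of-the-envelope 
simulation, not GC or druntime internals) that tracks the total freed space 
next to the array size under doubling:

import std.stdio;

void main()
{
    size_t array = 100;  // current array size, in bytes
    size_t freed = 0;    // total bytes released by previous copies

    foreach (step; 0 .. 6)
    {
        immutable next = array * 2;  // double on every growth
        writefln("array %4s, freed %4s, next allocation %4s -> %s",
                 array, freed, next,
                 next <= freed ? "fits in freed space" : "needs fresh memory");
        freed += array;  // the old buffer is released after the copy
        array = next;
    }
}

Every line reports "needs fresh memory": under doubling, the freed total is 
always exactly 100 bytes smaller than even the current array, let alone the 
next allocation.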

If, instead of increasing the size by 100%, we increase it by a smaller 
percentage of its previous size, we still maintain the amortized O(1) 
cost (with a constant factor that might be a little higher, but see the 
trade-off below). On the other hand, we can now reuse memory. Let's say 
we increase by 50% each time:

100 array

increase

100 free
150 array

increase

250 free
225 array

increase

475 free
338 array

increase

813 free
507 array

The next increase will require 761 bytes, but we already have 813 bytes 
free, so the new array can be allocated inside the memory freed earlier, 
and the heap does not need to grow any further.
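
The same sketch with a 1.5x factor (rounding up, which reproduces the 
numbers above) shows where the crossover happens:

import std.stdio;

void main()
{
    size_t array = 100;
    size_t freed = 0;

    foreach (step; 0 .. 6)
    {
        immutable next = (array * 3 + 1) / 2;  // grow by 50%, rounding up
        writefln("array %4s, freed %4s, next allocation %4s -> %s",
                 array, freed, next,
                 next <= freed ? "fits in freed space" : "needs fresh memory");
        freed += array;
        array = next;
    }
}

The fifth allocation (761 bytes) is the first one that fits inside the 
813 bytes freed so far, exactly as in the walk-through above.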

Of course, if we integrated the allocation and the copy, we could have 
reused previously freed memory starting from allocation 3; but that would 
also be possible when increasing by 100%, so I'm assuming it isn't done.

Of course, if we increase by less than 50% (say, by 20%), we can reuse 
previously freed memory even sooner.
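
As a rough illustration (the helper name and the 100-byte starting size 
are my own choices, not anything from the library), here is how many 
increases each factor needs before the freed space first covers a new 
allocation:

import std.stdio;

// Count how many increases happen before the freed space first covers
// the next allocation; -1 means it never happens within 100 steps.
int increasesUntilReuse(double factor)
{
    double array = 100, freed = 0;
    foreach (step; 0 .. 100)
    {
        immutable next = array * factor;
        if (next <= freed)
            return step;
        freed += array;
        array = next;
    }
    return -1;
}

void main()
{
    foreach (factor; [2.0, 1.5, 1.2])
        writefln("factor %.1f: reuse possible after %s increases",
                 factor, increasesUntilReuse(factor));
}

With a factor of 2.0 it never happens, with 1.5 it takes four increases, 
and with 1.2 only two.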

I am assuming this is the problem that manifests itself in this scenario. 
I would suggest solving it at the language level rather than at the 
library level.

Shachar
