Divide & Conquer divides, but doesn't conquer

Adam D. Ruppe destructionator at gmail.com
Mon May 25 13:13:11 UTC 2020


On Monday, 25 May 2020 at 07:26:52 UTC, Max Samukha wrote:
> State-of-the-art is to preallocate the result string and use 
> the string counter hack))

In this case, it doesn't matter much because the string is 
generated only once per unique length, then it is cached thanks 
to the template arg in the helper enum.

So the CTFE only actually happens once, however you do it this 
way.

The preallocation becomes important when the generated string 
becomes large because it avoids the compiler leaking 
intermediates and wasting a LOT of time realloc and copying again 
- when it is just a few hundred bytes like this it doesn't really 
matter.

When the CTFE expression finishes btw the compiler does free the 
memory pool. So even if you did this little thing over and over, 
it can be OK. But again when the generated string becomes very 
large, the peak memory can be prohibitive (it reaches tens of 
gigabytes of pure waste surprisingly quickly).

BTW on ctfe cache vs recalculate it depends on how much reuse you 
have. In my jni.d, the first version had an inline string replace 
in a loop. It took 30 GB of memory [!!!] and took several minutes 
to compile.

Simply moving that out and pregenerating a `static immutable` so 
CTFE was only invoked once, took it down to 3 GB and 25 seconds. 
Still slow, but huge, huge improvement.

But that's because there were only actually two variants of that 
string and it was fairly large. So caching made sense (static 
immutable btw is a bit better than a helper template holding the 
result - both cache but it is like a single variable vs a hash 
table; in fact that's how it is implemented). But if there were 
hundreds of variants instead of two, the cache itself would get 
very expensive.


More information about the Digitalmars-d mailing list