Making regex replace CTFE by removing malloc

Dmitry Olshansky via Digitalmars-d digitalmars-d at puremagic.com
Fri Apr 10 14:43:49 PDT 2015


On 07-Apr-2015 02:48, Pierre Krafft wrote:
> I got some help from the IRC, poked the code a bit and have some details
> I can share. The code uses a reference counted memory pool, that's the
> part I linked. It's possible to avoid the malloc but that's not the hard
> part. For me it looks like memory pools aren't compatible with CTFE.
> This is because CTFE disallows most pointer casts.

Just noticed this thread, I've been detached from D lately. Cool idea, 
but it wasn't considered in its design, so memchr/malloc and other nice 
optimized C primitives are all around the engine sources.

Indeed there is a fair amount of custom memory allocation.
One engine uses a freelist and the other (+ compile-time generated one) 
uses segmented stack.

On top of that both engines use adhoc ref-counting for internal data 
structures. Initially I delayed memory allocation problem until we get 
memory allocators. Current memory allocation scheme is further obscured 
by trying hard to allocate just once and reusing/splitting the initial 
chunk.

> Allocations and frees have fortunately been moved to common functions so
> it should be possible to change the allocation code without too much
> hassle. I would prefer to have the same code for runtime and CTFE, but
> for performance reasons that might not be possible if CTFE has a too
> limited feature set.

I learned the hard way that CTFE-able code and high-performance code 
look very much unlike each other. Unless everything is new-ed and not 
deleted we can't have the same code paths. A fair amount of __ctfe is in 
order.

> I think a good idea would be to extract the memory pool code out of the
> regex module. It's not a core part of the problem and could be reused
> elsewhere. Having something named memory pool would also make the code
> clearer. It might be impossible to implement memory pools at CTFE so  we
> could hide that detail away by simulating a memory pool (at CTFE) until
> it becomes possible.

Would be nice to have I guess, std.regex though uses somewhat 
specialized intrusive free-list so it's too niche for generic solution.

> The code in general is quite clear, but low level and more about how to
> do it than what to do. This is not the D-style I know and love, but it
> might be needed for performance.

Yeah, sadly ranges and algorithms were underused. Reasons were multiple, 
for parser it had more to do with CTFE-ablity, for engines more of 
performance concerns.

> Therefore I probably won't be the one
> that starts the implementation of this, but I will try to help if
> someone more experienced in this kind of code takes action.

I might pull this off on occasion. Compared to compilation of regex 
itself it might even run in sensible amount of time.

Anyhow I'm curious what use cases you have in mind for running regexes 
at CTFE.

-- 
Dmitry Olshansky


More information about the Digitalmars-d mailing list