Question about CPU caches and D context pointers
Etienne
etcimon at gmail.com
Mon Feb 17 19:15:58 PST 2014
I've had this question at the back of my mind, and I know it's
probably related to back-end optimizations, but I'm taking a
chance to see if anyone knows anything.
I know how insignificant the speed difference
may be, but keep in mind this is to further my low-level
understanding. Here's an example to illustrate the question,
because it's quite complicated (to me):
#1 contextual function
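    struct Contents {
        ubyte[] m_buffer;

        // D structs cannot declare a no-argument constructor, so the
        // 4092-byte block is reserved lazily on first use instead.
        void rcv(string str) {
            if (m_buffer.capacity == 0)
                m_buffer.reserve(4092);
            m_buffer ~= cast(const(ubyte)[]) str; // copy the bytes in
        }

        void flush() {
            send_4092_bytes_of_data_to_final_heap_buffer(m_buffer);
            m_buffer.length = 0;         // drop the contents, keep the block
            m_buffer.assumeSafeAppend(); // let later appends reuse it
        }
    }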
vs.
#2 context-less function
    void rcv(string str) {
        send_small_bytes_of_data_to_final_heap_buffer(str);
    }
The first case is the struct. On entering rcv(), I
know the pointer and length of m_buffer are on the stack at that
point. That's pretty damn fast to access because the CPU keeps
them in the L1 cache through the whole routine. However, it's not
obvious to me whether the memory m_buffer points to will stay in
the CPU cache across five or so consecutive calls to this same
routine in the same thread. Also note that it flushes to another
buffer, so there are more heap round trips between buffers if the
CPU cache isn't efficient.
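To make the first case concrete, here is a minimal sketch that checks whether the struct keeps appending into the same heap block across several rcv()/flush() cycles. It says nothing about the cache by itself; it only shows whether the address stays put, which I assume is the precondition for the pointee having any chance of staying cached. The helper reusesSameBlock and the cycle count are made up for illustration, and flush() here just resets the buffer instead of sending anything.

```d
import std.stdio : writeln;

struct Contents {
    ubyte[] m_buffer;

    void rcv(string str) {
        if (m_buffer.capacity == 0)
            m_buffer.reserve(4092);           // one allocation up front
        m_buffer ~= cast(const(ubyte)[]) str; // append within the reserved block
    }

    void flush() {
        // Placeholder: a real flush would hand m_buffer to the final buffer first.
        m_buffer.length = 0;
        m_buffer.assumeSafeAppend();
    }
}

// Hypothetical helper: true if every cycle appended at the same address.
bool reusesSameBlock(size_t cycles) {
    Contents c;
    c.rcv("hello");
    const first = c.m_buffer.ptr;
    bool same = true;
    foreach (i; 0 .. cycles) {
        c.flush();
        c.rcv("hello");
        same = same && (c.m_buffer.ptr is first);
    }
    return same;
}

void main() {
    writeln("buffer reused in place: ", reusesSameBlock(5));
}
```

If the address is stable, every call touches the same few cache lines, so whether they survive between calls depends on what else runs in between.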
The second case (context-less) just sends the string right
through to the final allocation procedure (another buffer), and
the string stays a function parameter so it's on the stack, thus
in the CPU cache through every call frame until the malloc takes
place (one heap round trip regardless of any optimization).
So, would there be any chance for m_buffer's pointee region
to stay in the CPU cache if there are thousands of consecutive
calls to the struct's rcv(), or am I forced to keep the
data on the stack and send it straight to the allocator? Is there
an easy way to visualize how the CPU cache empties or fills
itself, or to guarantee heap data stays in there without using
the stack?
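The crudest way I can think of to compare the two paths is to time them. Below is a rough sketch, assuming an Appender stands in for the final heap buffer; the function names buffered and direct, the message, and the call count are all placeholders of mine. Wall-clock time mixes cache effects with allocator and copying cost, so this can only hint at a difference, not explain it.

```d
import std.array : appender;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

enum chunkSize = 4092;

// Path #1: stage bytes in a reused buffer, flush to the sink when it fills.
size_t buffered(string msg, size_t calls) {
    auto sink = appender!(ubyte[])(); // stand-in for the final heap buffer
    ubyte[] staging;
    staging.reserve(chunkSize);
    foreach (i; 0 .. calls) {
        if (staging.length + msg.length > chunkSize) {
            sink.put(staging);
            staging.length = 0;
            staging.assumeSafeAppend();
        }
        staging ~= cast(const(ubyte)[]) msg;
    }
    sink.put(staging); // flush whatever is left
    return sink.data.length;
}

// Path #2: send every message straight to the sink.
size_t direct(string msg, size_t calls) {
    auto sink = appender!(ubyte[])();
    foreach (i; 0 .. calls)
        sink.put(cast(const(ubyte)[]) msg);
    return sink.data.length;
}

void main() {
    enum calls = 100_000;
    auto sw = StopWatch(AutoStart.yes);
    const a = buffered("hello world", calls);
    const t1 = sw.peek();
    sw.reset();
    const b = direct("hello world", calls);
    const t2 = sw.peek();
    assert(a == b); // both paths must deliver the same number of bytes
    writefln("buffered: %s, direct: %s", t1, t2);
}
```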
I'm sorry if the question seems complicated. I read everything
Ulrich Drepper had to say in "What Every Programmer Should Know
About Memory", and I still have a bit of a hard time visualizing
the question myself.