D performance guideline (request)
janderson
askme at me.com
Sat Apr 12 12:45:40 PDT 2008
bobef wrote:
> Hi,
>
> I would like to request a guideline how to squeeze every last nanosecond out of D. I am writing this scripting language - FlowerScript, and I really need to take every spare bit... But the problem is... I am a noob. D performs really well, of course, but lame people like me could make even D slow :) I know here are a lot of people that are very good programmers with broad knowledge how things work and a lot of experiene, so I would like to ask you to share your experience - what to do when every tick matters. For example one could use pointers for indexing arrays, to bypass the bounds checking. Yes, it is not that pretty but gives a big boost. I am trying to experiment with things, but sometimes it is really strange. For example I have few ifs. I rearrange them a bit. Sometimes I move one if inside other and it becomes slower, sometimes I simply add curly braces and it becomes faster. Or I wrap a block of code in a try {} and it becomes faster. Why is that? If I wrap every
line of code in try, will I make it fly ? :D
>
> Thanks,
> bobef
I imagine you understand this however I'll point it out anyway. The
best optimizations programmers in the world for any given language know
that only a small amount of the code actually matters to performance.
They always profile to see where that is. If they tried to optimize the
entire program they would be spending less time on the parts that really
matter.
Bound checking can be turned off at release time so I wouldn't worry
about that.
So having said that, beware of using functions like getchar() and read
blocks of data at a time. Think about high level optimizations first,
these will normally give you the biggest bang. For instance the fastest
code you can run is the code that is never run.
Try to minimize memory allocation by allocating in bigger blocks. For
instance with arrays you can do this to reserve space.
array.length = 100
array.length = 0
array now has 100 spaces reserved
Don't manually flush the GC until you have idle time. Its a very slow
function.
When you do find a function that really is causing trouble here are a
few tips:
1) Reduce branching. Branching is bad for these reasons:
- It can cause a cache miss
- The CPU prefetcher and program optimizer will stall
2) Avoid using linked lists. If you do absolutely need one allocate
each node from an array. 99% of the time you will be spending more time
traversing the array then insertion and deletion. Inserting/deleting
into an array can be faster then a linked list if its at the end (and
the memory is already reserved). The exception is generally when the
link list is a memory allocator and that's its only responsibility.
3) Use foreach instead of for. The compiler can sometimes optimize
these better.
4) Try to do more stuff at compile time using templates and compile time
functions however don't fall victim to making the programs memory
footprint so big it actually runs slower.
5) In your innermost loops look at the generated asm code to figure out
what it is doing (last resort).
6) Avoid using virtual functions in your innermost loops if they cause a
problem. Note that if the virtual function body contains more then 10
instructions you *may* be wasting your time removing the virtual. The
amount of time processing those 10 instructions is can be much larger
then the time cost of the virtual function. Although like branching
they can cause a stall. The reasons
- Compiler can't inline virtual's
- Cachemiss is almost certain because the Vtable is located far away
from the
- The CPU prefetcher and program optimizer will stall
- Extra pointer lookups.
7) Try to push things around as blocks of memory rather then individual
pieces.
8) Profile, Profile, Profile and let use know your results.
I hope that was helpful.
-Joel
More information about the Digitalmars-d
mailing list