Optimization fun

Walter Bright via Digitalmars-d digitalmars-d at puremagic.com
Thu Nov 6 15:15:04 PST 2014


On 11/6/2014 2:58 PM, H. S. Teoh via Digitalmars-d wrote:
> 2) Auto-decoding is evil: the solver currently uses a naïve char[]
> representation for the puzzle board, and uses countUntil() extensively
> to locate things on the board. The original code had:
>
> 	auto offset = currentBoard.countUntil(ch);
>
> which, of course, triggers autodecoding (even though there is actually
> no reason to do so: ch is a char, and therefore countUntil could've used
> strchr instead). Simply changing that to:
>
> 	auto offset = currentBoard.representation.countUntil(ch);
>
> gave an additional 20%-30% performance boost.

I've argued this point for over a year, apparently without convincing anyone :-(

BTW, use .byChar. instead of .representation. because the former leaves it typed 
as 'char' and the latter converts it to 'ubyte'.


> 3) However, countUntil remains at the top of the profiler's list of
> hotspots. DMD's optimizer was rather disappointing here, so I looked at
> gdc's output instead. Digging into the disassembly, I saw that it was
> using a foreach loop over the char[], but for some reason even gdc -O3
> failed to simplify that loop significantly (I'm not sure why -- maybe
> what threw off the optimizer is the array indexing + length check
> operation, where in C one would normally jump bump the pointer
> instead?). Eventually, I tried writing a manual loop:
>
>              auto ptr = current.ptr;
>              auto c = current.length;
>              while (c > 0 && *ptr != m.id)
>                  ptr++;
>              cert.headOffset = ptr - current.ptr;
>
> This pushed gdc far enough in the right direction that it finally
> produced a 3-instruction inner loop. Strangely enough, the assembly had
> no check for (c > 0)... could it be because there is an assert following
> the loop claiming that that will never happen, therefore gdc elided the
> check? I thought Walter hasn't gotten around to implementing
> optimizations based on assert yet??  Anyway, with this optimization, I
> managed to shave off another 3%-5% of the total running time.

Walter hasn't gotten around to implementing optimizations in GDC, that's fer 
sure! :-)



More information about the Digitalmars-d mailing list