Reducing the cost of autodecoding

Wed Oct 12 09:03:45 PDT 2016

On Wednesday, 12 October 2016 at 13:53:03 UTC, Andrei 
Alexandrescu wrote:
> So we've had a good run with making popFront smaller. In ASCII 
> microbenchmarks with ldc, the speed is indistinguishable from s 
> = s[1 .. $]. Smaller functions make sure that the impact on 
> instruction cache in larger applications is not high.
>
> Now it's time to look at the end-to-end cost of autodecoding. I 
> wrote this simple microbenchmark:
>
> =====
> import std.range;
>
> alias myPopFront = std.range.popFront;
> alias myFront = std.range.front;
>
> void main(string[] args) {
>     import std.algorithm, std.array, std.stdio;
>     char[] line = "0123456789".dup.repeat(50_000_000).join;
>     ulong checksum;
>     if (args.length == 1)
>     {
>         while (line.length) {
>             version(autodecode)
>             {
>                 checksum += line.myFront;
>                 line.myPopFront;
>             }
>             else
>             {
>                 checksum += line[0];
>                 line = line[1 .. $];
>             }
>         }
>         version(autodecode)
>             writeln("autodecode ", checksum);
>         else
>             writeln("bytes ", checksum);
>     }
>     else
>         writeln("overhead");
> }
> =====
>
> On my machine, with "ldc2 -release -O3 -enable-inlining" I get 
> something like 0.54s overhead, 0.81s with no autodecoding, and 
> 1.12s with autodecoding.
>
> Your mission, should you choose to accept it, is to define a 
> combination front/popFront that reduces the gap.
>
>
> Andrei

If we are to support Unicode, this will only work really 
efficiently with some state kept on the stack.
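
To make the idea concrete, here is a minimal sketch of one possible combined front/popFront, assuming "state on the stack" means keeping the decoded code point and its length in locals so each character is examined only once. The names decodeAndAdvance and checksum are hypothetical and not part of Phobos; only std.utf.decode is an existing library call. It has not been benchmarked against the numbers quoted above.

```d
import std.stdio : writeln;
import std.utf : decode;

/// Hypothetical helper (not in Phobos): returns the first code point of `s`
/// and advances `s` past it in a single call.
dchar decodeAndAdvance(C)(ref C[] s)
if (is(immutable C == immutable char))
{
    assert(s.length, "empty input");
    immutable c = s[0];
    if (c < 0x80)
    {
        // ASCII fast path: one byte, no decoding needed.
        s = s[1 .. $];
        return c;
    }
    // Slow path: `i` and `d` are the per-call state kept on the stack.
    size_t i = 0;
    immutable dchar d = decode(s, i); // throws UTFException on invalid UTF-8
    s = s[i .. $];
    return d;
}

/// Sums all code points, mirroring the loop in the quoted benchmark.
ulong checksum(const(char)[] text)
{
    ulong sum;
    while (text.length)
        sum += decodeAndAdvance(text);
    return sum;
}

void main()
{
    writeln(checksum("0123456789"));
}
```

Because the caller gets the value and the advanced slice from one call, the length of the encoded sequence does not have to be recomputed by a separate popFront.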