Reducing the cost of autodecoding

Tue Oct 25 14:46:30 PDT 2016

On Wednesday, 12 October 2016 at 13:53:03 UTC, Andrei 
Alexandrescu wrote:
>
> Now it's time to look at the end-to-end cost of autodecoding.

Some food for thought:

- front necessarily needs to compute the number of bytes to 
advance.
- We can't change the API to share data between front and 
popFront, however we can create a situation where a pure function 
gets duplicate calls removed by the compiler.

Since we require that the ascii test gets inlined into the caller 
of front/popFront to improve ascii performance, we have a 
situation similar to this:

alias Result = Tuple!("codepoint",dchar,"advance",int);
auto decode(const char[] str) pure
{ pragma(inline,true);
   if (str[0] < 0x80) return Result(str[0],1);
   else return decodeNonAscii(str);
}

dchar front(const char[] str) pure
{ pragma(inline,true);
   return str.decode.codepoint;
}

void popFront(ref const(char)[] str)
{ pragma(inline,true);
   return str[str.decode.advance..$];
}

When used in front/popFront pairs, the duplicated decode calls 
get merged and we don't do any duplicate work (unlike the current 
situation.)

Unfortunately, it's not possible to achieve the best code 
generation due to missed optimizations by the compilers (I 
haven't tried GDC.)
I've reported a highly reduce case to LDC github issue #1851 / 
LDC subforum.

Once we have this is possible only the decodeNonAscii, perhaps 
using the DFA method linked by ketmar.

P.S. I am aware that this pessimises popFront for code which only 
counts codepoints without inspecting them.