Andrei Alexandrescu needs to read this

Jon Degenhardt jond at noreply.com
Thu Oct 24 19:25:29 UTC 2019


On Thursday, 24 October 2019 at 00:53:27 UTC, H. S. Teoh wrote:
> I discovered something very interesting: GNU wc was generally 
> on par with, or outperformed, the D versions of the code for 
> files that contained long lines, but performed more poorly when 
> given files that contained short lines.
>
> Glancing at the glibc source code revealed why: glibc's memchr 
> used an elaborate bit-hack-based algorithm that scanned the 
> target string 8 bytes at a time. This required the data to be 
> aligned, however, so when the string was not aligned, it had to 
> manually process up to 7 bytes at either end of the string with 
> a different algorithm.  So when the lines were long, the 
> overall performance was dominated by the 8-byte-at-a-time 
> scanning code, which was very fast for large buffers.  However, 
> when given a large number of short strings, the overhead of 
> setting up for the 8-byte scan became more costly than the 
> savings, so it performed more poorly than a naïve byte-by-byte 
> scan.
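
To make the bit hack concrete, here is a minimal D sketch of the
word-at-a-time technique described above. It is illustrative only,
not glibc's actual implementation (glibc's is C and works on raw
pointers); the function name and structure are my own.

import std.stdio;

// Sketch of an 8-bytes-at-a-time byte search (not glibc's actual code).
// Returns the index of the first occurrence of `needle`, or -1.
ptrdiff_t wordMemchr(const(ubyte)[] haystack, ubyte needle)
{
    enum ulong ones  = 0x0101010101010101UL;
    enum ulong highs = 0x8080808080808080UL;
    immutable ulong mask = ones * needle;  // needle repeated in every byte

    size_t i = 0;

    // Head: byte-by-byte until the address is 8-byte aligned
    // (the "up to 7 bytes" of setup overhead mentioned above).
    while (i < haystack.length && (cast(size_t) &haystack[i]) % 8 != 0)
    {
        if (haystack[i] == needle) return i;
        ++i;
    }

    // Middle: one aligned 64-bit word per iteration. XOR turns matching
    // bytes into zero bytes; the (x - ones) & ~x & highs test detects them.
    while (i + 8 <= haystack.length)
    {
        ulong x = (*cast(const(ulong)*) &haystack[i]) ^ mask;
        if ((x - ones) & ~x & highs)
            break;                         // match is somewhere in this word
        i += 8;
    }

    // Tail (and the word containing a match): plain byte-by-byte scan.
    for (; i < haystack.length; ++i)
        if (haystack[i] == needle) return i;

    return -1;
}

void main()
{
    auto data = cast(const(ubyte)[]) "one\ntwo\nthree\n";
    writeln(wordMemchr(data, '\n'));       // 3
}

Run on input made of one- or two-character lines, almost every call
stays in the head and tail loops, which matches the behaviour
described above.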

Interesting observation. On the surface it seems this might 
also apply to splitter and find when used on narrow strings, 
since I believe both call memchr in that case. A common 
paradigm is to read lines, then call splitter to identify 
individual fields. Fields are often short, even when lines are 
long.
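
To illustrate the paradigm, here is a minimal sketch (assumed
tab-separated input; the delimiter and the per-field work are just
placeholders):

import std.algorithm : splitter;
import std.stdio;

void main()
{
    foreach (line; stdin.byLine)              // lines may be long...
    {
        foreach (field; line.splitter('\t'))  // ...but fields are often short
        {
            // Each field is a narrow (char[]) slice; locating the next
            // delimiter is where the memchr-style scan would come in.
            write(field.length, " ");
        }
        writeln();
    }
}

If each of those short searches pays the alignment setup cost
described above, it could add up quickly across a large file.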

--Jon
