4x faster strlen with 4 char sentinel

Ola Fosheim Grøstad via Digitalmars-d-announce digitalmars-d-announce at puremagic.com
Mon Jun 27 13:43:40 PDT 2016


On Monday, 27 June 2016 at 19:51:48 UTC, Jay Norwood wrote:
> Your link's use of padding pads out with a variable number of 
> zeros, so that a larger data type can be used for the compare 
> operations.  This isn't the same as my example, which is 
> simpler due to not having to fiddle with alignment and data 
> type casting.

That's true, and it is fun to think about different string
implementations. Just keep in mind that prior to the 90s, text
was the essential datatype for many programmers, and new ways to
implement strings were heavily explored. I remember that the first
exercise we got at university in the OS course was to
implement "strlen", "strcpy" and "strcmp" in C or machine
language. It can be fun.

Also keep in mind that the major bottleneck now is loading 64
bytes (a cache line) from memory into cache. So if you test
performance, you have to make sure to invalidate the caches before
each test, and to test with scattered reads over a very large
memory area, to get realistic results.
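A minimal sketch of such a cold-cache measurement, in C. The names
evict_caches and cold_strlen are made up for illustration, and the
64-byte line size and the size of the eviction buffer are assumptions
that have to match the machine. Walking a buffer larger than the
last-level cache only approximates a real cache invalidation.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

enum { CACHE_LINE = 64 };  /* assumed cache line size in bytes */

/* Touch one byte in every cache line of a large buffer so that
   previously cached data is (most likely) pushed out. The buffer
   should be larger than the last-level cache. */
static unsigned evict_caches(volatile unsigned char *buf, size_t size)
{
    unsigned sum = 0;
    for (size_t i = 0; i < size; i += CACHE_LINE) {
        buf[i] = (unsigned char)(buf[i] + 1);
        sum += buf[i];
    }
    return sum;
}

/* Time a single strlen call on s after walking the eviction buffer.
   Returns the length and stores the elapsed nanoseconds in *ns. */
static size_t cold_strlen(const char *s, unsigned char *evict_buf,
                          size_t evict_size, long long *ns)
{
    /* Call through a volatile pointer so the compiler cannot fold
       or inline the call away. */
    size_t (*volatile fn)(const char *) = strlen;
    struct timespec t0, t1;

    assert(s != NULL && evict_buf != NULL && ns != NULL);
    evict_caches(evict_buf, evict_size);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    size_t n = fn(s);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *ns = (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
        + (long long)(t1.tv_nsec - t0.tv_nsec);
    return n;
}
```

To use it, allocate an eviction buffer of something like 64 MiB once,
then call cold_strlen on many different strings spread over a large
area and average the timings. A single measurement is mostly noise.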

But essentially, the operation itself is not heavy, so to speed it
up you need to predict and prefetch from memory in time, meaning
no library solution is sufficient. (You need to prefetch the
memory well before your library function is called.)
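As a rough illustration of prefetching on the caller's side, here is a
sketch in C. It assumes GCC or Clang (for __builtin_prefetch); the
function name total_length and the prefetch distance of 8 are
hypothetical and would need tuning on real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PREFETCH_AHEAD = 8 };  /* assumed distance, tune per machine */

/* Sum the lengths of count strings. While strlen works on string i,
   the start of string i + PREFETCH_AHEAD is already being requested,
   so it has a chance to be in cache by the time strlen reaches it. */
static size_t total_length(const char *const *strs, size_t count)
{
    size_t total = 0;

    assert(strs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (i + PREFETCH_AHEAD < count) {
            /* 0 = prefetch for reading, 3 = keep in all cache levels */
            __builtin_prefetch(strs[i + PREFETCH_AHEAD], 0, 3);
        }
        total += strlen(strs[i]);
    }
    return total;
}
```

The point is that the prefetch is issued by the caller, which knows
which string comes next. strlen itself only sees one pointer and
cannot start the load any earlier than its own first read.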


