When memory usage is high, the input string is unlikely to be in cache, so it may be better to optimize for fewer cache misses rather than for raw computation speed.
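
To illustrate the idea, here is a minimal sketch in C. It is not taken from any code under discussion: the function names and the choice of work (counting a character and computing an FNV-1a hash) are assumptions picked only for illustration. The first version streams the string from memory twice; the second does the same work in one pass, so each cache line of the input only has to be fetched once.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: two separate passes over the input.
   If the string does not fit in cache, it is fetched from memory twice. */
size_t count_then_hash(const char *s, size_t n, char c, uint32_t *hash_out) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count += (s[i] == c);
    }

    uint32_t h = 2166136261u;            /* FNV-1a 32-bit offset basis */
    for (size_t i = 0; i < n; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;                  /* FNV-1a 32-bit prime */
    }
    *hash_out = h;
    return count;
}

/* Same result, but both computations share a single pass,
   so each byte of the input is loaded only once. */
size_t count_and_hash(const char *s, size_t n, char c, uint32_t *hash_out) {
    size_t count = 0;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        count += (s[i] == c);
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    *hash_out = h;
    return count;
}
```

Both functions execute the same arithmetic per byte; the only difference is how many times the input is read. Whether the fused version is actually faster depends on the input size and the machine, so it would need to be measured.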