New UTF-8 stride function

Martin Nowak code at dawg.eu
Mon May 27 12:21:33 PDT 2013


On 05/26/2013 10:49 PM, Dmitry Olshansky wrote:
 > If anything came out of the UTF-8 discussion, it is that I decided
 > to dust off my experimental implementation of a UTF-8 stride function.
 > Just for fun.
 >
 > The key difference vs. std is in handling the non-ASCII case.
 > I'm replacing the bsr intrinsic with what I call an "in-register lookup
 > table" (neat stuff that is a piece of cake, thx to CTFE).
 >
 > See unittest/benchmark here:
 > https://gist.github.com/blackwhale/5653927
 >
Looks promising.
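
For anyone who has not opened the gist, here is a rough sketch of what an
"in-register lookup table" can look like. This is only my illustration in C,
not the code from the gist: the constant STRIDE_TABLE, the 4-bit entry width,
the choice to index by the high nibble, and the helper names utf8_stride and
utf8_count are all assumptions. The quoted post builds its table with CTFE;
here the constant is simply written out by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: sixteen 4-bit entries packed into one 64-bit constant,
   so the whole table fits in a single register. Entry i sits at bits
   4*i .. 4*i+3 and holds the stride for a lead byte whose high nibble is i:
     0x0-0x7 -> 1 (ASCII)
     0x8-0xB -> 0 (continuation byte, not a valid lead byte)
     0xC,0xD -> 2
     0xE     -> 3
     0xF     -> 4 (0xF8-0xFF are not rejected in this sketch) */
static const uint64_t STRIDE_TABLE = 0x4322000011111111ULL;

/* Look up the stride with a shift and a mask instead of a bit scan. */
static inline unsigned utf8_stride(unsigned char lead)
{
    return (unsigned)((STRIDE_TABLE >> ((lead >> 4) * 4)) & 0xF);
}

/* Count code points by hopping from lead byte to lead byte.
   Performs no validation beyond skipping stray continuation bytes. */
static size_t utf8_count(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t i = 0, n = 0;
    while (i < len) {
        unsigned step = utf8_stride(s[i]);
        i += step ? step : 1; /* stray continuation byte: advance one byte */
        ++n;
    }
    return n;
}
```

The lookup has no branch and no memory access; whether that actually beats
bsr would have to be measured.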

 > Test files I used:
 > https://github.com/blackwhale/gsoc-bench-2012/blob/master/arwiki-latest-all-titles-in-ns0
 >
 > https://github.com/blackwhale/gsoc-bench-2012/blob/master/dewiki-latest-all-titles-in-ns0
 >
 > https://github.com/blackwhale/gsoc-bench-2012/blob/master/ruwiki-latest-all-titles-in-ns0
 >
These files are huge, so performance is most likely limited by memory
bandwidth.