[Issue 10403] memchr optimization for std.algorithm.find

d-bugmail at puremagic.com d-bugmail at puremagic.com
Wed Jun 19 12:57:32 PDT 2013


http://d.puremagic.com/issues/show_bug.cgi?id=10403


monarchdodra at gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra at gmail.com
         AssignedTo|nobody at puremagic.com        |monarchdodra at gmail.com


--- Comment #1 from monarchdodra at gmail.com 2013-06-19 12:57:30 PDT ---
(In reply to comment #0)
> Add an optimization for:
> 
> R find(alias pred = "a == b", R, E)(R haystack, E needle);
> 
> ...to use the c function memchr when pred is the default "a == b" and...
> 
> CASE 1:
> All of the following are true:
> 
> 1) R is an array of elements of type char, byte, ubyte or an enum type whose
> base type is one of those.
> 2) E is integral
> 3) For the element type of R:
> alias Elem = ElementType!R;
> ...the following is true:
> (Elem.min <= needle && needle <= Elem.max)
> This is important because memchr only cares about bitwise equality. If the
> value of needle is beyound the limits of possible values for type Elem, then we
> know needle cannot be found within haystack and the function can just return
> early without searching.
> 
> CASE 2:
> All of the following are true:
> 
> 1) R is an array of elements of type E
> 2) E.sizeof == 1
> 3) For type E equality means bitwise equality

I'm just about finished implementing the above, along with quite a few other
tricks, and I'm getting some 'great' performance improvements.

I implemented "CASE 2".

However, I don't think it is worth implementing "CASE 1": I think the case of
searching for an element that is simply outside of the possible values of the
range's element type should be a very rare case. I mean, who would write:
find(myRangeOfBytes, 5000);
???
I come to the conclusion that the cost of checking the condition all the time
just to reduce the time of a special case is not worth it.

It's kind of like of the "opAssign check for self assignment" issue. If you can
write the code in such a case that the common case goes faster, but self is
more costly, then that is better.

So yeah, I implemented "CASE 2".

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------


More information about the Digitalmars-d-bugs mailing list