Sorry, the list rejected my accidental double paste anyway because it was too big. Here's the disassembly of partition. It shows that back() is the only function still being called, and that pred appears to be inlined, contrary to what I believed earlier. Changing enforce() in back() to assert() gets rid of about half the overhead, but I have no idea where the other half is coming from.<br>
<br><span style="font-family: courier new,monospace;">_D3std9algorithm129__T8sortImplS793std10functional54__T13binaryFunImplVAyaa5_61203c2062VAyaa1_610889BE3F182A6EA5575A2D298A20D100 PROC NEAR</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">; COMDEF _D3std9algorithm129__T8sortImplS793std10functional54__T13binaryFunImplVAyaa5_61203c2062VAyaa1_610889BE3F182A6EA5575A2D298A20D100</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> sub esp, 36 ; 0000 _ 83. EC, 24</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov ecx, dword ptr [esp+2CH] ; 0003 _ 8B. 4C 24, 2C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> push ebx ; 0007 _ 53</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov ebx, dword ptr [esp+2CH] ; 0008 _ 8B. 5C 24, 2C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> test ebx, ebx ; 000C _ 85. DB</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> push ebp ; 000E _ 55</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> push esi ; 000F _ 56</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov esi, eax ; 0010 _ 89. C6</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> push edi ; 0012 _ 57</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> jnz ?_020 ; 0013 _ 75, 0E</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> pop edi ; 0015 _ 5F</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov eax, ebx ; 0016 _ 8B. C3</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov edx, ecx ; 0018 _ 8B. D1</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> pop esi ; 001A _ 5E</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> pop ebp ; 001B _ 5D</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> pop ebx ; 001C _ 5B</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> add esp, 36 ; 001D _ 83. C4, 24</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> ret 8 ; 0020 _ C2, 0008</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">?_020: mov dword ptr [esp+3CH], ecx ; 0023 _ 89. 4C 24, 3C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov edx, dword ptr [esp+3CH] ; 0027 _ 8B. 54 24, 3C</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+38H], ebx ; 002B _ 89. 5C 24, 38</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov ebx, dword ptr [esp+38H] ; 002F _ 8B. 5C 24, 38</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+14H], ebx ; 0033 _ 89. 5C 24, 14</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov dword ptr [esp+18H], edx ; 0037 _ 89. 54 24, 18</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> cmp dword ptr [esp+38H], 0 ; 003B _ 83. 7C 24, 38, 00</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> je ?_023 ; 0040 _ 0F 84, 00000088</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">?_021: mov edx, dword ptr [esp+3CH] ; 0046 _ 8B. 54 24, 3C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov eax, dword ptr [esp+38H] ; 004A _ 8B. 44 24, 38</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov ebx, dword ptr [edx] ; 004E _ 8B. 1A</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> push dword ptr [esi+0CH] ; 0050 _ FF. 76, 0C</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> push dword ptr [esi+8H] ; 0053 _ FF. 76, 08</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov dword ptr [esp+18H], edx ; 0056 _ 89. 54 24, 18</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> call _D3std5array12__T4backTAiZ4backFNcAiZi ; 005A _ E8, 00000000(rel)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> cmp dword ptr [eax], ebx ; 005F _ 39. 18</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> jle ?_022 ; 0061 _ 7E, 31</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov edi, dword ptr [esp+38H] ; 0063 _ 8B. 7C 24, 38</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov ecx, dword ptr [esp+10H] ; 0067 _ 8B. 4C 24, 10</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> dec edi ; 006B _ 4F</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> lea edx, [ecx+4H] ; 006C _ 8D. 51, 04</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov ebx, dword ptr [esp+14H] ; 006F _ 8B. 5C 24, 14</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> dec ebx ; 0073 _ 4B</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov dword ptr [esp+3CH], edx ; 0074 _ 89. 54 24, 3C</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov edx, dword ptr [esp+18H] ; 0078 _ 8B. 54 24, 18</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov eax, dword ptr [esp+14H] ; 007C _ 8B. 44 24, 14</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+38H], edi ; 0080 _ 89. 7C 24, 38</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> add edx, 4 ; 0084 _ 83. C2, 04</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+14H], ebx ; 0087 _ 89. 5C 24, 14</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov dword ptr [esp+18H], edx ; 008B _ 89. 54 24, 18</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> jmp ?_025 ; 008F _ E9, 000000AA</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">?_022: push dword ptr [esp+3CH] ; 0094 _ FF. 74 24, 3C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> push dword ptr [esp+3CH] ; 0098 _ FF. 74 24, 3C</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> call _D3std5array12__T4backTAiZ4backFNcAiZi ; 009C _ E8, 00000000(rel)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov ebx, dword ptr [eax] ; 00A1 _ 8B. 18</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> push dword ptr [esi+0CH] ; 00A3 _ FF. 76, 0C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> push dword ptr [esi+8H] ; 00A6 _ FF. 76, 08</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> call _D3std5array12__T4backTAiZ4backFNcAiZi ; 00A9 _ E8, 00000000(rel)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> cmp dword ptr [eax], ebx ; 00AE _ 39. 18</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> jg ?_024 ; 00B0 _ 7F, 2E</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov edi, dword ptr [esp+38H] ; 00B2 _ 8B. 7C 24, 38</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> dec edi ; 00B6 _ 4F</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov eax, dword ptr [esp+38H] ; 00B7 _ 8B. 44 24, 38</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+38H], edi ; 00BB _ 89. 7C 24, 38</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov edx, dword ptr [esp+3CH] ; 00BF _ 8B. 54 24, 3C</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+3CH], edx ; 00C3 _ 89. 54 24, 3C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> cmp dword ptr [esp+38H], 0 ; 00C7 _ 83. 7C 24, 38, 00</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> jnz ?_022 ; 00CC _ 75, C6</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">?_023: mov edx, dword ptr [esp+18H] ; 00CE _ 8B. 54 24, 18</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov eax, dword ptr [esp+14H] ; 00D2 _ 8B. 44 24, 14</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> pop edi ; 00D6 _ 5F</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> pop esi ; 00D7 _ 5E</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> pop ebp ; 00D8 _ 5D</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> pop ebx ; 00D9 _ 5B</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> add esp, 36 ; 00DA _ 83. C4, 24</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> ret 8 ; 00DD _ C2, 0008</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">?_024: push dword ptr [esp+3CH] ; 00E0 _ FF. 74 24, 3C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> push dword ptr [esp+3CH] ; 00E4 _ FF. 74 24, 3C</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> call _D3std5array12__T4backTAiZ4backFNcAiZi ; 00E8 _ E8, 00000000(rel)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov edi, eax ; 00ED _ 89. C7</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov edx, dword ptr [esp+3CH] ; 00EF _ 8B. 54 24, 3C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov ebp, edx ; 00F3 _ 89. D5</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov eax, dword ptr [esp+38H] ; 00F5 _ 8B. 44 24, 38</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">; Note: Zero displacement could be omitted</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov ecx, dword ptr [edx] ; 00F9 _ 8B. 4A, 00</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov ebx, dword ptr [edi] ; 00FC _ 8B. 1F</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> lea edx, [edx+4H] ; 00FE _ 8D. 52, 04</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov dword ptr [ebp], ebx ; 0101 _ 89. 5D, 00</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov eax, dword ptr [esp+38H] ; 0104 _ 8B. 44 24, 38</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> dec eax ; 0108 _ 48</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [edi], ecx ; 0109 _ 89. 0F</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov ebx, dword ptr [esp+14H] ; 010B _ 8B. 5C 24, 14</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov ecx, dword ptr [esp+18H] ; 010F _ 8B. 4C 24, 18</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov dword ptr [esp+38H], eax ; 0113 _ 89. 44 24, 38</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> dec ebx ; 0117 _ 4B</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov eax, dword ptr [esp+14H] ; 0118 _ 8B. 44 24, 14</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+14H], ebx ; 011C _ 89. 5C 24, 14</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> add ecx, 4 ; 0120 _ 83. C1, 04</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov ebx, dword ptr [esp+38H] ; 0123 _ 8B. 5C 24, 38</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov dword ptr [esp+18H], ecx ; 0127 _ 89. 4C 24, 18</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> dec ebx ; 012B _ 4B</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov eax, dword ptr [esp+38H] ; 012C _ 8B. 44 24, 38</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+3CH], edx ; 0130 _ 89. 54 24, 3C</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov ecx, edx ; 0134 _ 8B. CA</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;"> mov dword ptr [esp+38H], ebx ; 0136 _ 89. 5C 24, 38</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> mov dword ptr [esp+3CH], ecx ; 013A _ 89. 4C 24, 3C</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">?_025: cmp dword ptr [esp+38H], 0 ; 013E _ 83. 7C 24, 38, 00</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> jne ?_021 ; 0143 _ 0F 85, FFFFFEFD</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">; Note: Immediate operand could be made smaller by sign extension</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;"> jmp ?_023 ; 0149 _ E9, FFFFFF80</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">_D3std9algorithm129__T8sortImplS793std10functional54__T13binaryFunImplVAyaa5_61203c2062VAyaa1_610889BE3F182A6EA5575A2D298A20D100 ENDP</span><br>
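<br>For reference, here is a minimal sketch of the enforce() vs. assert() change described above. The names backEnforce and backAssert are hypothetical stand-ins, not the actual std.array source. The one difference I'm sure of is that enforce() stays active in -release builds while assert() is compiled out. My guess (an assumption, not something the listing proves) is that the lazy message argument of enforce() is also what keeps back() from being inlined, which would explain the four call sites to back in the listing.<br>

```d
import std.exception : assertThrown, enforce;

// Hypothetical stand-in for the checked version: enforce() is a normal
// library call with a lazy message argument, and it is still executed
// when compiling with -release.
ref T backEnforce(T)(T[] a)
{
    enforce(a.length, "Attempting to fetch the back of an empty array");
    return a[$ - 1];
}

// Hypothetical stand-in for the changed version: assert() is removed
// entirely when compiling with -release, so only the element access remains.
ref T backAssert(T)(T[] a)
{
    assert(a.length, "Attempting to fetch the back of an empty array");
    return a[$ - 1];
}

unittest
{
    int[] data = [3, 1, 2];

    // Both versions return a reference to the last element.
    assert(backEnforce(data) == 2);
    assert(backAssert(data) == 2);

    // Because the result is a ref, writing through it changes the array.
    backAssert(data) = 7;
    assert(data[2] == 7);

    // The enforce() version throws on an empty array in every build mode.
    int[] empty;
    assertThrown!Exception(backEnforce(empty));
}
```

Something like dmd -unittest -main -run on this file should exercise the unittest block. Timing both versions with and without -release would show how much of the overhead the check itself accounts for.<br>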
<br><div class="gmail_quote">On Fri, Jul 2, 2010 at 5:25 PM, David Simcha <span dir="ltr">&lt;<a href="mailto:dsimcha@gmail.com">dsimcha@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
Sorry for the double paste. Please ignore everything before the second time you see <span style="font-family: courier new,monospace;">_D3std9algorithm129__T8sortImplS793std10functional54__T13binaryFunImplVAyaa5_61203c2...</span>
</blockquote></div><br>