Challenge: write a really really small front() for UTF8

dnspies dspies at ualberta.ca
Mon Mar 24 10:27:31 PDT 2014


Whoops, wrong "original" version.  That was the one before I 
understood the game. Here mine is fixed (but up to 180 lines - 16 
labels = 164 instructions):

http://goo.gl/wjIAVm

On Monday, 24 March 2014 at 17:14:11 UTC, dnspies wrote:
> It seems like mine wasn't being inlined because I had 
> carelessly replaced char[] with const(char)[] in the function 
> signature (I don't know why that should make a difference, but 
> it does)
>
> Going back to my original version (with Andre's trick to avoid 
> letting the loop unroll), I get
> dnspies_orig   159 - 16 labels = 143 instructions
>
> However, MFortin1 can be made to do better by taking successive 
> slices instead of indexing (and commenting out the check, since 
> it no longer helps optimize).
> MFortin1_slice 137 - 13 labels = 124 instructions
>
> http://goo.gl/JAgVK1
>
> On Monday, 24 March 2014 at 13:39:45 UTC, Michel Fortin wrote:
>> On 2014-03-24 06:08:30 +0000, "safety0ff" 
>> <safety0ff.dev at gmail.com> said:
>>
>>> Everything seems to work in your corrected versions:
>>> http://dpaste.dzfl.pl/3b32535559ba
>>> 
>>> Andrei's version doesn't compile on recent compiler versions 
>>> due to goto skipping initialization.
>>
>> Ok, so let's check what happens in actual usage. In other 
>> words, what happens when front is inlined and used in a loop. 
>> I'm counting the number of assembly lines of this main 
>> function for looping through each possible input using the 
>> various implementations:
>>
>> http://goo.gl/QjnA2k
>>
>> implementation   line count of main in assembly
>> andralex         285 -  9 labels = 276 instructions
>> MFortin1         135 - 13 labels = 122 instructions
>> MFortin2         150 - 14 labels = 136 instructions
>> safety0ff         -- not inlined (too big?) --
>> dnspies          161 - 15 labels = 146 instructions
>> CWilliams        233 - 25 labels = 208 instructions
>>
>> For compactness, my first implementation seems to be the 
>> winner, both with and without inlining.
>>
>> That said, I think front should only assume a non-empty char[] 
>> and thus should check the length before accessing anything 
>> after the first byte to protect against incomplete multi-byte 
>> sequences such as [0x1000_0000]. So I added this line to my 
>> two implementations:
>>
>>   if (1+tailLength >= s.length) return dchar.init;
>>
>> Now lets see what happens with those changes:
>>
>> http://goo.gl/XPCGYE
>>
>> implementation   line count of main in assembly
>> MFortin1Check    103 - 11 labels =  92 instructions
>> MFortin2Check    135 - 13 labels = 122 instructions
>>
>> Now I'm baffled. Adding a test makes the code shorter? It 
>> actually make the standalone functions longer by 8 
>> instructions (as expected), but the inlined version is 
>> shorter. My guess is that the optimizer creates something 
>> entirely different and it turns out that this different 
>> version optimises better after inlining.
>>
>> That said, my test "main" function isn't very representative 
>> of the general case because the length is statically known by 
>> the optimizer. Let's see what it does when the length is not 
>> statically known:
>>
>> http://goo.gl/E2Q0Yu
>>
>> implementation   line count of main in assembly
>> andralex         384 -  9 labels = 375 instructions
>> MFortin1         173 - 19 labels = 154 instructions
>> MFortin2         182 - 18 labels = 164 instructions
>> safety0ff         -- not inlined --
>> dnspies           -- not inlined --
>> CWilliams        229 - 23 labels = 206 instructions
>> MFortin1Check    211 - 24 labels = 187 instructions
>> MFortin2Check    218 - 21 labels = 197 instructions
>>
>> So again MFortin1 is the winner for compactness. Still, I 
>> maintain that we ought to at least check the length of the 
>> array for multi-byte sequences to protect against incomplete 
>> sequences that could lay at the end of the string, so I'd 
>> favor MFortin1Check.



More information about the Digitalmars-d mailing list