Challenge: write a really really small front() for UTF8
dnspies
dspies at ualberta.ca
Mon Mar 24 10:27:31 PDT 2014
Whoops, wrong "original" version. That was the one before I
understood the game. Here mine is fixed (but up to 180 lines - 16
labels = 164 instructions):
http://goo.gl/wjIAVm
On Monday, 24 March 2014 at 17:14:11 UTC, dnspies wrote:
> It seems like mine wasn't being inlined because I had
> carelessly replaced char[] with const(char)[] in the function
> signature (I don't know why that should make a difference, but
> it does)
>
> Going back to my original version (with Andre's trick to avoid
> letting the loop unroll), I get
> dnspies_orig 159 - 16 labels = 143 instructions
>
> However, MFortin1 can be made to do better by taking successive
> slices instead of indexing (and commenting out the check, since
> it no longer helps optimize).
> MFortin1_slice 137 - 13 labels = 124 instructions
>
> http://goo.gl/JAgVK1
>
> On Monday, 24 March 2014 at 13:39:45 UTC, Michel Fortin wrote:
>> On 2014-03-24 06:08:30 +0000, "safety0ff"
>> <safety0ff.dev at gmail.com> said:
>>
>>> Everything seems to work in your corrected versions:
>>> http://dpaste.dzfl.pl/3b32535559ba
>>>
>>> Andrei's version doesn't compile on recent compiler versions
>>> due to goto skipping initialization.
>>
>> Ok, so let's check what happens in actual usage. In other
>> words, what happens when front is inlined and used in a loop.
>> I'm counting the number of assembly lines of this main
>> function for looping through each possible input using the
>> various implementations:
>>
>> http://goo.gl/QjnA2k
>>
>> implementation line count of main in assembly
>> andralex 285 - 9 labels = 276 instructions
>> MFortin1 135 - 13 labels = 122 instructions
>> MFortin2 150 - 14 labels = 136 instructions
>> safety0ff -- not inlined (too big?) --
>> dnspies 161 - 15 labels = 146 instructions
>> CWilliams 233 - 25 labels = 208 instructions
>>
>> For compactness, my first implementation seems to be the
>> winner, both with and without inlining.
>>
>> That said, I think front should only assume a non-empty char[]
>> and thus should check the length before accessing anything
>> after the first byte to protect against incomplete multi-byte
>> sequences such as [0x1000_0000]. So I added this line to my
>> two implementations:
>>
>> if (1+tailLength >= s.length) return dchar.init;
>>
>> Now lets see what happens with those changes:
>>
>> http://goo.gl/XPCGYE
>>
>> implementation line count of main in assembly
>> MFortin1Check 103 - 11 labels = 92 instructions
>> MFortin2Check 135 - 13 labels = 122 instructions
>>
>> Now I'm baffled. Adding a test makes the code shorter? It
>> actually make the standalone functions longer by 8
>> instructions (as expected), but the inlined version is
>> shorter. My guess is that the optimizer creates something
>> entirely different and it turns out that this different
>> version optimises better after inlining.
>>
>> That said, my test "main" function isn't very representative
>> of the general case because the length is statically known by
>> the optimizer. Let's see what it does when the length is not
>> statically known:
>>
>> http://goo.gl/E2Q0Yu
>>
>> implementation line count of main in assembly
>> andralex 384 - 9 labels = 375 instructions
>> MFortin1 173 - 19 labels = 154 instructions
>> MFortin2 182 - 18 labels = 164 instructions
>> safety0ff -- not inlined --
>> dnspies -- not inlined --
>> CWilliams 229 - 23 labels = 206 instructions
>> MFortin1Check 211 - 24 labels = 187 instructions
>> MFortin2Check 218 - 21 labels = 197 instructions
>>
>> So again MFortin1 is the winner for compactness. Still, I
>> maintain that we ought to at least check the length of the
>> array for multi-byte sequences to protect against incomplete
>> sequences that could lay at the end of the string, so I'd
>> favor MFortin1Check.
More information about the Digitalmars-d
mailing list