Inline assembler in D and LDC, round 2

Thu Feb 5 09:59:06 PST 2009

On Thu, Feb 5, 2009 at 5:46 PM, Don <nospam at nospam.com> wrote:
> Tomas Lindquist Olsen wrote:
>>
>> On Thu, Feb 5, 2009 at 2:42 PM, Frits van Bommel
>> <fvbommel at remwovexcapss.nl> wrote:
>>>
>>> Don wrote:
>>>>
>>>> Frits van Bommel wrote:
>>>>>
>>>>> Walter Bright wrote:
>>>>>>
>>>>>> Frits van Bommel wrote:
>>>>>>>
>>>>>>> Is it really that hard? Can't you just detect this case (non-void
>>>>>>> function without a 'return' at the end but with inline asm inside)?
>>>>>>>
>>>>>>> Since the compiler should know the calling convention[1], the
>>>>>>> register
>>>>>>> that will contain the return value of the function should be a simple
>>>>>>> lookup
>>>>>>> (based on target architecture, cc and return type).
>>>>>>> Just add that register as an output of the inline asm and return
>>>>>>> it...
>>>>>>
>>>>>> dmd doesn't attempt to figure out which register is the return value.
>>>>>> It
>>>>>> just assumes that the registers specified by the ABI for the
>>>>>> function's
>>>>>> return type have the proper return value in them.
>>>>>
>>>>> That isn't an option for LDC, which is why I suggested another
>>>>> approach.
>>>>
>>>> What's the difference? Walter's approach assumes there's a "return EAX;"
>>>> at the end of every function returning an int, for example; your
>>>> approach
>>>> seems to be to add it.
>>>
>>> His approach depends on DMD directly emitting x86 machine code, so it can
>>> just emit 'RET' and be done with it.
>>>
>>> LDC on the other hand needs to emit LLVM asm, which requires it to
>>> specify
>>> an explicit return value. My approach is a way to extract that return
>>> value
>>> from the inline asm, allowing it to emulate DMD behavior within the LLVM
>>> IR.
>>>
>>
>> I had really hoped I didn't have to do something like this, but I
>> can't come up with a better approach. I just hope it actually works
>> when I'm done ...
>> Also I have no idea if code quality is going to be optimal. I imagine
>> people write code like this for efficiency, if LLVM adds extra
>> instructions there is little point in writing code like this for LDC,
>> and we'd want to version things in any case, providing a true naked
>> version for LDC. In this case I'm not sure it's worth it to actually
>> do this work in the first place.
>
> The only reason a function like this isn't written as naked, is so that it
> has a chance to be inlined. If that's impossible with this syntax on all
> compilers, there doesn't seem much point - it might as well be illegal.
>
> If D provided a "return EAX,EDX;" fake asm instruction, would inlining be
> possible?
>

The approach Frits mentions should still allow inlining. A fake asm
instruction like that could make this a bit simpler to implement
though, since it would be up to the programmer to know the ABI, not
our asm translator frontend. Otherwise it seems to me to be the same
thing really.
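
To make the "add the return register as an output of the inline asm"
idea concrete, here is a minimal sketch in C using GCC-style extended
asm, whose constraint model is close to what LLVM inline asm uses. It
is not LDC's implementation: the function name add_via_eax is made up
for illustration, and the x86 guard with a plain C fallback is my own
addition so the snippet compiles elsewhere.

```c
#include <stdint.h>

/* Hypothetical example: add two ints inside inline asm and hand the
   result back through EAX, the register a DMD-style function would
   leave its return value in when it "falls off the end". */
static inline int32_t add_via_eax(int32_t a, int32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int32_t result = a;
    __asm__ ("addl %1, %0"
             : "+a" (result)   /* read-write operand pinned to EAX */
             : "r" (b)         /* b may live in any general register */
             : "cc");          /* addl modifies the flags */
    return result;             /* explicit return of the EAX value */
#else
    return a + b;              /* fallback for non-x86 targets */
#endif
}
```

Because the return value is an ordinary asm output rather than an
implicit register state, the optimizer sees a normal value and is free
to inline the function.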

At the moment, LDC won't inline anything containing inline asm, but
this restriction could be loosened a bit. The reason we disable
inlining right now is that if the asm contains labels and the
function is inlined, LLVM doesn't rewrite the labels, so you might
get conflicting labels when you get to assembling. I asked on the
LLVM IRC channel about this, but it's probably not going to be fixed.
The argument was that GCC has the same restriction for extended
inline asm expressions: if you use labels, you must also manually
mark the function with a never-inline function attribute. This might
change when LLVM gets its own assembler, I guess.
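
For comparison, this is roughly how the label problem is usually
sidestepped in GCC-style extended asm: use a numeric local label
("1:" with "1b" meaning the nearest "1" looking backward), which the
assembler lets you define any number of times, so duplicated copies of
the asm don't clash. This is only a sketch in C. The name sum_to is
made up, and the x86 guard and C fallback are my own assumptions; it
says nothing about how LDC's asm translator would handle named labels.

```c
/* Hypothetical example: compute n + (n-1) + ... + 1 with an asm loop.
   The numeric label "1:" may be defined many times in one assembly
   file, so inlining this function at several call sites is safe. */
static inline unsigned sum_to(unsigned n)
{
    unsigned total = 0;
    if (n == 0)
        return 0;              /* the asm loop below assumes n > 0 */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("1:\n\t"
             "addl %1, %0\n\t" /* total += n */
             "decl %1\n\t"     /* n -= 1, sets ZF when n reaches 0 */
             "jnz 1b"          /* loop back to the nearest "1:" */
             : "+r" (total), "+r" (n)
             :
             : "cc");
#else
    while (n != 0)             /* fallback for non-x86 targets */
        total += n--;
#endif
    return total;
}
```

A named label such as "loop:" in the same position would be emitted
once per inlined copy, which is exactly the conflict described above.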

Another thing is that if inlining is the main reason for functions
like these, perhaps it would be better to somehow get this
optimization into LLVM itself? There is already a pass that tries to
lower common C library function calls...

Yet another thing about inlining with LDC is that the DMD inliner is
currently disabled. Some of the AST rewrites it does broke our
codegen last time I tried, and we simply haven't tried turning it back
on since. This means that LDC will only inline when it has access to
an LLVM IR representation of the function, which basically means that
only functions from the same module will be inlined, or template
functions, which are always emitted. This is going to change once we
get proper LTO support into LDC. For now people can still compile to
.bc files instead of .o and link manually using LLVM tools to get
this feature, so it's not that critical imho.

I guess I'll investigate how much LLVM can help with providing me the
register details to implement something that works automagically... It
just feels wrong to have to duplicate all that information...

</end rant>