[GSoC] 'Independency of D from the C Standard Library' progress and update thread

Thu Sep 5 23:33:07 UTC 2019

On Thursday, 5 September 2019 at 22:56:30 UTC, H. S. Teoh wrote:
>
> That's pretty scary that LLVM does that. It shakes my 
> confidence in LLVM a little. OTOH, the identifier "memcpy" is 
> pretty unique and practically universally understood to mean 
> C's implementation of it, so it's a reasonably safe assumption. 
> Of course, if you ever wish to override memcpy() with something 
> that does something *other* than memcpy, you could potentially 
> have a vector for Thompson-style backdoors (function does one 
> thing when called, does something else when optimizer picks it 
> up).
>

I don't like it either, although I _think_ you can explicitly
turn this off, or that it is only done under specific flags.
I'd have to check.
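
To make the concern concrete, here is a minimal sketch in C of the kind of loop an optimizer may recognize and replace with a call to memcpy(). The name byte_copy is made up for illustration. As far as I know, Clang's -fno-builtin and GCC's -fno-tree-loop-distribute-patterns are the options usually mentioned for suppressing this, but treat that as an assumption to verify, not a confirmed answer.

```c
#include <stddef.h>

/* Hypothetical libc-free copy routine (the name is an assumption).
 * At higher optimization levels a compiler may detect this loop as a
 * "copy idiom" and emit a call to memcpy() instead of the loop. */
void *byte_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];   /* one byte per iteration, no libc call written here */

    return dst;
}
```

If this function were itself named memcpy, the replacement could make it call itself, which is why the behaviour matters for anyone writing their own implementation.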

>
> But that seems to me to be quite backwards.  If DMD were to 
> target systems that don't have libc, which AFAIK it currently 
> doesn't, we'd already have to do porting work in the form of 
> how codegen is done. Then whatever implementation of memcpy & 
> co you end up with, will simply become a part of this codegen 
> implementation.  It could be instructions directly produced by 
> the backend, it could be calling a druntime function version'd 
> by that specific platform, etc..  But it'd be a 
> platform-specific, dmd-specific thing, not something generic 
> that applies across all platforms that D might target, and not 
> something that, e.g., GDC or LDC would use.
>

I don't know if I understood this correctly.
For memcpy() et al. to become part of the compiler's codegen,
they have to be recognized as intrinsics, like LLVM does. Is this
what you are referring to?
Because that's another (interesting) discussion.

I was talking under the assumption that they're handled as
ordinary functions (as they are now), and that things like
a[] = b[] just call memcpy().
In that case, it doesn't pay for the function implementor (as
opposed to the compiler) to write arch-specific implementations,
because they can't be leveraged across architectures: you would
have to write a separate one for each, which is not a good goal
because of the maintenance burden.
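
As a rough model of "a[] = b[] just calls memcpy()", here is a sketch in C. The int_slice type and slice_assign function are hypothetical stand-ins, not the actual druntime hook, and the real lowering does more checking (for example, it reports a length mismatch as an error where this sketch only returns a status code).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical C model of a D slice of ints: a length plus a pointer. */
typedef struct {
    size_t length;
    int   *ptr;
} int_slice;

/* Sketch of what `a[] = b[]` might lower to when memcpy() is an
 * ordinary function call. Returns 0 on success, -1 if the lengths
 * differ. Overlap checking is omitted here for brevity. */
int slice_assign(int_slice a, int_slice b)
{
    if (a.length != b.length)
        return -1;

    memcpy(a.ptr, b.ptr, a.length * sizeof *a.ptr);
    return 0;
}
```

In this model the compiler knows nothing about the copy itself; whatever memcpy() the program links against does all the work.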

Even if there were an LLVM-like layer where you could call, e.g.,
vector extension intrinsics that get lowered to whatever the arch
provides (even when the arch has no concept of vectorization), it
would still be better to focus on the algorithmic part, as the
compiler's translation would be relatively basic.
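
One way to picture "focusing on the algorithmic part" is a copy routine written only in portable terms, with the block structure expressed in the source and the instruction selection left to the compiler. This is a sketch under my own assumptions (the name chunked_copy and the block size of 8 are arbitrary), not a tuned implementation.

```c
#include <stddef.h>

/* Hypothetical portable copy: the algorithm (blocks of 8 bytes, then a
 * byte-wise tail) is stated in plain C with no arch-specific code.
 * How each block is translated is left entirely to the compiler. */
void *chunked_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Main loop: 8 bytes per iteration, unrolled by hand. */
    while (n >= 8) {
        d[0] = s[0]; d[1] = s[1]; d[2] = s[2]; d[3] = s[3];
        d[4] = s[4]; d[5] = s[5]; d[6] = s[6]; d[7] = s[7];
        d += 8;
        s += 8;
        n -= 8;
    }

    /* Tail: the remaining 0..7 bytes, one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }

    return dst;
}
```

The same source works on any target; whether the main loop ends up as vector loads and stores or as plain byte moves is the compiler's decision.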

I hope the above made _some_ sense. I feel I didn't articulate
my thoughts perfectly.

- Stefanos