[GSoC] 'Independency of D from the C Standard Library' progress and update thread
Mike Franklin
slavo5150 at yahoo.com
Tue Jun 4 01:11:44 UTC 2019
On Monday, 3 June 2019 at 22:45:28 UTC, Andrei Alexandrescu wrote:
> At 512 lines including tests, it seems on the involved side.
> The benchmarks ought to show a hefty improvement to match. Are
> there benchmark results available?
I did some initial benchmarks at
https://github.com/JinShil/memcpyD when I made the first
feasibility study to see if this project was worth pursuing. The
initial results were encouraging, which is why we're taking it
further in this project.
I'll work with Stefanos to get a more polished implementation
that users can download and run for themselves.
> Quoting the rationale from the motivation in another thread:
>
> 1) C’s implementations are not type-safe and memory-safe.
> 2) C’s implementations have accumulated a lot of cruft over the
> years.
> 3) Cross-compiling is more difficult as now one should have
> available and configured a C runtime and toolchain apart from
> the D runtime. This makes it difficult for D to create
> freestanding software.
> 4) Type-safety and memory safety (bounds-checking etc.)
> 5) Templates to branch to an optimal implementation at
> compile-time.
> 6) Inlining, as the branching in C happens at runtime.
> 7) Compile-Time Function Execution (CTFE) and introspection
> (type info).
>
> My view on formulating motivation is simple: do it like a
> scientist. Argue the facts. If facts are not available, argue
> fundaments and universal principles. If such are not available,
> the motivation is too weak.
Yes, the motivation could be improved, but the time for
motivating this project was 2 months ago, not now. Now the
project is underway, and we need to see it to completion. The
focus now should be on providing feedback on the implementations,
not the rationale/motivation.
> (1) checks the "facts" box but has the obvious comeback "then
> how about a 2-line trusted wrapper over memcpy?" that needs to
> be explained. Related, obviously people who reach for memcpy()
> are often not looking for a safe primitive. a[] = b[] is safe,
> syntactically simple, and could lower to anything including
> memcpy.
Part of the motivation is so druntime no longer has a hard
intrinsic dependency on libc. If you just wrap the libc function,
you're not achieving that goal.
Now, that being said, it is way out of the scope of this project
to provide a D implementation of memcpy for all platforms,
architectures and microarchitectures that D supports. So, we
need to deal with that.
Before I elaborate further, it's important to understand that
druntime is currently a monolith that is not architected or
structured properly. druntime is supposed to be the language
implementation, not libc bindings, libc++ bindings, windows
bindings, linux bindings, low-level code (whatever that means),
etc.
The language implementation *will* require certain features of
the underlying operating system and hardware. Some of those
features may be provided by libc, but that decision should be
made on a platform-by-platform basis. So what we hope to achieve
with this project is an idiomatic-D memory copy/compare
interface. That interface may simply forward to libc for those
features that don't have an optimized D implementation. Other
platforms may choose to implement a highly optimized
implementation in D. Other platforms may choose to mix the two
(e.g. an optimized D implementation for small copies, and forward
to libc for large copies). Others may choose to just implement a
simple while-loop because they either don't want to obtain a C
toolchain (those cross-compiling to embedded targets) or because
there isn't a C implementation available (new platforms like WASM).
This project aims to remove druntime's dependency on libc, but
the platform port of druntime may still choose to depend on it.
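As a rough illustration of that split, here is a minimal sketch of a single D-facing copy interface whose backend is chosen per platform. The name `memcpyD`, its signature, and the `UseLibcMemcpy` version identifier are hypothetical and used only for illustration; they are not taken from druntime or from the project's actual code.

```d
// Hypothetical sketch: `memcpyD` and the `UseLibcMemcpy` version identifier
// are illustrative names, not part of druntime.
version (UseLibcMemcpy)
{
    // A platform port that opts in to libc simply forwards to it.
    import core.stdc.string : memcpy;

    void memcpyD(T)(ref T dst, ref const T src) @trusted
    {
        memcpy(&dst, &src, T.sizeof);
    }
}
else
{
    // A freestanding port can fall back to a plain byte-copy loop.
    void memcpyD(T)(ref T dst, ref const T src) @trusted
    {
        auto d = cast(ubyte*) &dst;
        auto s = cast(const(ubyte)*) &src;
        foreach (i; 0 .. T.sizeof)
            d[i] = s[i];
    }
}

void main()
{
    int[4] a = [1, 2, 3, 4];
    int[4] b;
    memcpyD(b, a); // the caller does not know or care which backend ran
    assert(b == a);
}
```

Compiling with `-version=UseLibcMemcpy` selects the forwarding variant; without it, the loop is used. Either way the caller sees the same typed interface, which is the sense in which libc becomes an implementation detail.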
That being said, you might be wondering why we are bothering to
implement an entire memcpy in D for the x86_64 architecture:
1) because DMD's implementation is suboptimal,
2) to help motivate the entire project,
3) to demonstrate D as a first-class systems programming language,
4) to set an example and precedent for other platforms to
potentially follow.
Please keep in mind we're trying to expand D to more platforms,
including resource-constrained embedded systems, OS programming,
bare-metal applications, and new platforms such as WASM. We want
D to be more easily portable, and that is partially achieved by
making a platform abstraction, independent of libc. libc is a
platform implementation detail.
> (2) is quite specious and really needs some evidence. Is cruft
> in memcpy really an issue? I looked memcpy() implementations a
> while ago but didn't save bookmarks. Did a google search just
> now and found
> https://github.com/gcc-mirror/gcc/blob/master/libgcc/memcpy.c,
> which is very far from cruft-ridden.
That is not the memcpy that is actually on your machine. You can
find the more elaborate implementations here:
https://sourceware.org/git/?p=glibc.git;a=tree;f=sysdeps/x86_64/multiarch;h=14ec2285c0f82b570bf872c5b9ff0a7f25724dfd;hb=HEAD
Another from intel:
https://github.com/DPDK/dpdk/blob/master/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
> I do remember elaborate implementations of memcpy but so are
> (somewhat ironically) the 512 lines of the proposed
> implementation. I found one here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/lib/memcpy_64.S?id=HEAD
>
> No idea of its level of cruftiness, where it's used etc. The
> right way to argue (2) is to provide links to implementations
> that people can look at and decide without doubt, "yep, crufty".
The more elaborate C implementations are typically written in
assembly. They are difficult to follow due to all of the various
techniques to handle misalignment and the cleverness typically
required to achieve the best performance.
It is my hope that this project will explore how D can improve
such implementations by reducing the cleverness to small isolated
inline assembly blocks surrounded by D to make it easier to see
the flow control. I think D can do that.
> (3) is... odd. Doesn't every machine ever come with a C
> implementation including a ready-to-link standard library? If
> not, isn't that a rarity? Again, that should be argued
> preemptively by the motivation section.
Yes, it's a rarity, but nevertheless an artificial dependency for
druntime.
druntime does not sufficiently utilize libc to justify the hard
dependency. It just needs a few memory utilities and an
allocator. I think it's worthwhile to see if D can do just as
well without libc. In fact, if I had my druthers, I'd remove
libc's malloc altogether today and just add jemalloc to the
druntime repository. Maybe it could even be mechanically
translated to D.
> (4) brings again the wrapper argument
For some platforms, it may just be a wrapper.
> (5) is nice if and only if confirmed by benchmarks
We've already demonstrated this with benchmarks. I'll work with
Stefanos to get them made available, but
https://github.com/JinShil/memcpyD already shows the benefit.
> (6) is also nice under the same conditions as (5)
Yep, see my response to (5).
> (7) again... what's wrong with a wrapper that does if (__ctfe)
I think Stefanos is probably arguing in general about the
design-by-introspection features of D, which include CTFE and
other metaprogramming features. That is more or less the same
point as (5). Those benefits have been demonstrated, and we'll
work to make them more apparent in the near future.
That being said, there's nothing ruling out an `if (__ctfe)`
block in the implementation if that's what is determined to be
best.
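For readers unfamiliar with the idiom, here is a minimal sketch of what such a block could look like. The function name `copyBytes` and the helper `copied` are hypothetical, and forwarding to libc's `memcpy` in the run-time branch is only one assumed choice of backend, not what the project has settled on.

```d
// Hypothetical sketch: `copyBytes` and `copied` are illustrative names,
// not druntime APIs.
import core.stdc.string : memcpy;

void copyBytes(ubyte[] dst, const(ubyte)[] src) @trusted
{
    assert(dst.length == src.length, "length mismatch");
    if (__ctfe)
    {
        // CTFE cannot call into libc, so use a plain loop the
        // compile-time interpreter can execute.
        foreach (i; 0 .. src.length)
            dst[i] = src[i];
    }
    else
    {
        // At run time, forward to whatever the platform provides
        // (libc is assumed here).
        memcpy(dst.ptr, src.ptr, src.length);
    }
}

ubyte[4] copied()
{
    ubyte[4] src = [1, 2, 3, 4];
    ubyte[4] dst;
    copyBytes(dst[], src[]);
    return dst;
}

void main()
{
    static immutable ubyte[4] expected = [1, 2, 3, 4];

    // Evaluated at compile time: takes the __ctfe branch.
    enum ubyte[4] atCompileTime = copied();
    static assert(atCompileTime == expected);

    // Evaluated at run time: takes the memcpy branch.
    assert(copied() == expected);
}
```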
> With malloc() we're looking at a completely different ballgame.
> Implementing malloc() from scratch is a very serious project
> that needs almost overwhelming motivation. The goal of
> std.experimental.allocator was to offer a flexible framework
> for implementing general and specialized allocators, but simply
> replacing malloc() is more difficult to argue. Also, achieving
> comparable performance will be difficult.
I agree with all of that, but we're going to try it anyway and see
how it does. If all we achieve in the end is just a wrapper that
forwards to libc's malloc and friends, it will still be better
than what we have now, because libc will then be simply an
implementation detail.
Mike