Type safety could prevent nuclear war

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Thu Feb 4 15:48:33 PST 2016


On Thu, Feb 04, 2016 at 11:21:54PM +0000, tsbockman via Digitalmars-d wrote:
[...]
> Definitely. What puzzles me about the winning entry, though, is that the
> compiler and/or linker should be able to trivially detect the type mismatch
> *after* the preprocessor pass(es) are already done.

It cannot, because C symbols are not mangled. The function name uniquely
identifies the function, and the signature is not encoded anywhere.

The linker knows nothing about types or parameters; all it knows is that
at offset X of binary blob B, there's a binary number (usually a 32- or
64-bit address) associated with a symbol, which it needs to replace with
the value (i.e., address) of that symbol, which it obtains from the
object file that defines that symbol.

So as far as the linker is concerned, the function names match up, and
that's all there is to it.
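
To make that concrete, here is a toy model of the patching step. It is
only a sketch: the struct layouts and the name apply_relocation are
invented for illustration and do not correspond to any real object-file
format or linker. The point is that neither record has a field where a
parameter list or return type could go.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only, not a real object-file format. */
struct definition { const char *name; uint64_t address; }; /* defining object */
struct relocation { const char *name; size_t offset; };    /* calling object  */

/* Patch the 8 bytes at r.offset in blob with the address of r.name.
 * Returns 0 on success, -1 if the symbol is undefined or the offset is
 * out of range. The name is the only thing that gets compared. */
int apply_relocation(unsigned char *blob, size_t blob_len,
                     struct relocation r,
                     const struct definition *defs, size_t ndefs)
{
	if (r.offset > blob_len || blob_len - r.offset < sizeof(uint64_t))
		return -1;
	for (size_t i = 0; i < ndefs; i++) {
		if (strcmp(defs[i].name, r.name) == 0) {
			memcpy(blob + r.offset, &defs[i].address, sizeof(uint64_t));
			return 0;
		}
	}
	return -1;
}
```

Whether the caller thought "func" took one double or two ints makes no
difference here; the same address gets written either way.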

C provides zero protection against calling functions with mismatched
parameters if the caller is not in the same file, and does not have the
right declaration. E.g.:

	/* module1.c */
	void func(int a, int b) { ... }

	/* module2.c */
	extern int func(double x); /* I'm too lazy to #include a header */
	int main() {
		int x = func(1.0); /* kaboom */
	}

In theory, this problem is solved by #include'ing the appropriate header
file, but even that isn't free from accidents, such as forgetting to
update the header after you change the function signature.  Of course,
most sane C projects also #include the header in the file that defines
the function, in which case the compiler finally catches the mistake.
But you can see just how fragile this is and how many points of failure
it has.  Believe it or not, there *are* still C projects out there that
don't follow the convention of one header per .c file, and of those that
do, a frightening number do not #include the header in the .c file.
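
A minimal sketch of that convention, collapsed into a single file so it
compiles as-is. The file names func.h, func.c and main.c in the comments
and the helper call_func are hypothetical; in a real project the
prototype would live in the header and both .c files would #include it.

```c
/* --- what a hypothetical func.h would contain --- */
int func(int a, int b);   /* the single shared prototype */

/* --- func.c: would start with #include "func.h" --- */
/* Because the prototype above is visible here, changing this definition
 * to, say, int func(double x) without updating the prototype becomes a
 * compile-time "conflicting types" error rather than a runtime crash. */
int func(int a, int b)
{
	return a + b;
}

/* --- main.c: would also start with #include "func.h" --- */
/* The caller sees the same prototype, so its arguments are checked
 * against the real signature. */
int call_func(void)
{
	return func(1, 2);
}
```

The check only works because the definition and the caller both see the
same declaration; drop either #include and you're back to trusting that
two hand-written signatures agree.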

This isn't the whole story, either. Even if you follow said conventions
to prevent function signature mismatches, problems can still occur. For
instance, I once had to debug a mysterious crash in an enterprise
project whose cause seemingly could not be found in the code.  It turned
out to be caused by two shared libraries that defined two different
functions under the same name. Since the conflicting functions were in
separately-compiled libraries, the compiler was oblivious to the
conflict. The linker didn't detect it either: because these were shared
libraries, all the linker knew was that it had found symbol X in
library1, so it didn't bother looking for symbol X again in library2,
which was processed afterward. An unrelated code change altered the
order in which the libraries were linked, and suddenly the linker found
symbol X in library2 first, so the function call got linked to the
wrong implementation.  So at runtime, kaboom.
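
The order dependence can be sketched with a toy first-match lookup. This
is a simplified model under assumed names (struct library, resolve,
order_before, order_after are all made up here), not the actual
algorithm of any particular linker or loader.

```c
#include <stddef.h>
#include <string.h>

typedef int (*impl_fn)(void);

/* Toy model only: a "library" is a name plus the symbols it exports. */
struct symbol  { const char *name; impl_fn fn; };
struct library { const char *name; const struct symbol *syms; size_t nsyms; };

/* Two unrelated implementations that happen to share the name "X". */
int x_from_library1(void) { return 1; }
int x_from_library2(void) { return 2; }

const struct symbol lib1_syms[] = { { "X", x_from_library1 } };
const struct symbol lib2_syms[] = { { "X", x_from_library2 } };

const struct library library1 = { "library1", lib1_syms, 1 };
const struct library library2 = { "library2", lib2_syms, 1 };

/* Resolve by name alone: scan the libraries in link order and stop at
 * the first match. A duplicate in a later library is never looked at. */
impl_fn resolve(const struct library *const *order, size_t n,
                const char *name)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < order[i]->nsyms; j++)
			if (strcmp(order[i]->syms[j].name, name) == 0)
				return order[i]->syms[j].fn;
	return NULL;
}

/* Same two libraries, two different link orders. */
const struct library *const order_before[] = { &library1, &library2 };
const struct library *const order_after[]  = { &library2, &library1 };
```

In this model, resolve(order_before, 2, "X") hands back library1's
function and resolve(order_after, 2, "X") hands back library2's, with no
diagnostic in either case.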

Name mangling singlehandedly solves all of the above problems.
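
For the signature-mismatch case, the effect can be imitated by hand in
C. This is only a sketch: the _ii / _d suffix scheme and the helper
use_func_ii are invented for illustration and are not the actual
mangling used by D or C++, where the compiler generates such names
automatically.

```c
/* Hand-rolled "mangling": the parameter types are spelled out in the
 * symbol name itself (ii = int, int; d = double). */

/* module1.c */
int func_ii(int a, int b)          /* func(int, int) */
{
	return a + b;
}

/* module2.c */
/* A stale declaration such as
 *     extern int func_d(double x);
 * would now name a symbol that no object file defines, so a call
 * through it fails at link time with an undefined-symbol error instead
 * of crashing at run time. */
extern int func_ii(int a, int b);

int use_func_ii(void)
{
	return func_ii(1, 2);
}
```

The linker still knows nothing about types; it just fails to match the
names, which is exactly the check that was missing.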


T

-- 
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth

