Fantastic exchange from DConf

Patrick Schluter via Digitalmars-d digitalmars-d at puremagic.com
Tue May 9 11:13:50 PDT 2017


On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
> On Tue, May 09, 2017 at 08:18:09AM +0000, Patrick Schluter via
[...]
> Ouch.  Haha, even I forgot about this particularly lovely 
> aspect of C. Hooray, freely call functions without declaring 
> them, and "obviously" they return int! Why not?
>
> There's an even more pernicious version of this, in that the 
> compiler blindly believes whatever you declare a symbol to be, 
> and the declaration doesn't even have to be in a .h file or 
> anything even remotely related to the real definition. Here's a 
> (greatly) reduced example (paraphrased from an actual bug I 
> discovered):
>
> 	module.c:
> 	-------
> 	int get_passwd(char *buf, int size);

yeah, this is a code smell. A not static declared function 
prototype in a C file. Raises the alarm bells automatically now. 
The same issue but much more frequent to observe, extern variable 
declaration in .c files. That one is really widespread and few 
see it as an anti-pattern. An extern global variable should 
always be put in the header file, never in the C file. Exactly 
for the same reason as your example with the wrong prototype 
below: non matching types the linker will join wrongly.

> 	int func() {
> 		char passwd[100];
> 		if (!get_passwd(buf, 100)) return -1;
> 		do_something(passwd);
> 	}
>
> 	passwd.c:
> 	---------
> 	void get_passwd(struct user_db *db, struct login_record *rec) {
> 		... // stuff
> 	}
>
> 	old_passwd.c:
> 	-------------
> 	/* Please don't use this code anymore, it's deprecated. */
> 	/* ^^^^ gratuitous useless comment */
> 	int get_passwd(char *buf, int size) { ... /* old code */ }
>
> Originally, in the makefile, module.o is linked with 
> libutil.so, which in turn is built from old_passwd.o and a 
> bunch of other stuff. Later on, passwd.o was added to 
> libotherutil.so, which was listed after libutil.so in the 
> linker command, so the symbol conflict was masked because the 
> linker found the libutil.so version of get_passwd first.
>
> Then one day, somebody changed the order of libraries in the 
> makefile, and suddenly func() mysteriously starts 
> malfunctioning because get_passwd now links to the wrong 
> version of the function!
>
> Worse yet, the makefile was written to be "smart", as in, it 
> uses wildcards to pick up .so files (y'know, us lazy 
> programmers don't wanna have to manually type out the name of 
> every library).

Yeah, we also had makefiles using wildcards. It took a long time 
but I managed to get rid of them.

> So when somebody tried to fix this bug by removing old_passwd.o 
> from libotherutil.so altogether, the bug was still happening in 
> other developers' machines, because a stale copy of the old 
> version of libotherutil.so was still left in their source tree, 
> so when *they* built the executable, it contains the bug, but 
> the bug vanishes when built from a fresh checkout. Who knows 
> how many hours were wasted chasing after this heisenbug.
>
>
> [...]
>> > 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
>> 
>> Worse, you didn't check the return of fclose() on writing FILE.
>> fclose() can fail if the disk was full. As the FILE is 
>> buffered, the
>> last fwrite might not have flushed it yet. So it is the 
>> fclose() that
>> will try to write the last block and that can fail, but the app
>> wouldn't be able to even report it.
> [...]
>
> Haha, you're right.  NONE of the code I've ever dealt with even 
> considers this case. None at all.  In fact, I don't even 
> remember the last time I've seen C code that bothers checking 
> the return value of fclose().  Maybe I've written it *once* in 
> my lifetime when I was young and naïve, and actually bothered 
> to notice the documentation that fclose() may sometimes fail. 
> Even the static analysis tool we're using doesn't report it!!

I discovered that one only a few month ago. I have now around 30 
places in our code base to fix. It's only important for writing 
FILE. Reading FILE can ignore the return values.

>
> So again Walter was spot on: fill up the disk to 99% full, and 
> 99% of C programs would start malfunctioning and showing all 
> kinds of odd behaviours, because they never check the return 
> code of printf, fprintf, or fclose, or any of a whole bunch of 
> other syscalls that are regularly *assumed* to just work, when 
> in reality they *can* fail.
>
> The worst part of all this is, this kind of C code is prevalent 
> everywhere in C projects, including those intended for 
> supposedly security-aware software.  Basically, the language 
> itself is just so unfriendly to safe coding practices that it's 
> nigh impossible to write safe code in it.  It's *theoretically* 
> possible, certainly, but in practice nobody writes C code that 
> way.  It is a scary thought indeeed, how much of our current 
> infrastructure relies on software running this kind of code.  
> Something's gotta give, eventually. And it ain't gonna be 
> pretty when it all starts crumbling down.

Agreed. That's why I'm learning D now, it's probably the only 
language that will be able to replace progressively our C code 
base in a skunk work fashion. I wanted to do it already 5 or 6 
years ago but couldn't as we were on Solaris/SPARC back then. Now 
that we have migrated to Linux-AMD64 there's not much holding us 
back. Oracle client is maybe still an issue though.



More information about the Digitalmars-d mailing list