Interesting memory safety topic

H. S. Teoh hsteoh at quickfur.ath.cx
Tue Feb 12 21:31:35 UTC 2019


On Tue, Feb 12, 2019 at 08:25:24PM +0000, Eduard Staniloiu via Digitalmars-d wrote:
> Something that caught my attention on Reddit’s r/cpp
> 
> “Microsoft: 70 percent of all security bugs are memory safety issues”
[...]

Walter is right on the money about memory safety becoming an
increasingly important problem.

Based on my experience working with C/C++ codebases, I'd say that one of
the key causes of memory safety problems is the decay of arrays into
pointers in C/C++, which Walter has rightly said will ultimately be the
downfall of C/C++.  Pairing the pointer with a length, as in D arrays,
is a major step in avoiding this problem.  The "extra baggage" of a
length field is well worth the cost -- besides, in most cases in
C/C++, you already need to pass the length with the pointer *anyway*, so
why not have the language handle it for you correctly, rather than rely
on fallible humans to do the job manually, which, as the history of
security problems proves, they do very poorly.
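
To make the pointer-plus-length point concrete, here is a minimal sketch
in C of what a D-style slice buys you.  Everything named here (IntSlice,
SLICE_OF, slice_get, slice_sum, bytes_visible_after_decay) is a
hypothetical illustration, not an API from any real library, and D's
built-in arrays do this for you without any of the boilerplate:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice type: the pointer and its length travel together. */
typedef struct {
    const int *ptr;
    size_t     len;
} IntSlice;

/* Wrap a real array while its size is still known, i.e. before it decays.
   Only valid on true arrays, not on pointers. */
#define SLICE_OF(arr) ((IntSlice){ (arr), sizeof(arr) / sizeof((arr)[0]) })

/* After decay the callee sees only a pointer: this returns the size of
   a pointer, whatever the size of the caller's array was. */
size_t bytes_visible_after_decay(const int *arr)
{
    return sizeof arr;
}

/* Bounds-checked read: stores the element in *out and returns 1, or
   returns 0 without touching *out when i is out of range. */
int slice_get(IntSlice s, size_t i, int *out)
{
    if (i >= s.len)
        return 0;
    *out = s.ptr[i];
    return 1;
}

/* Sum every element; the loop bound comes from the slice itself, so the
   caller cannot pass a mismatched length by accident. */
long slice_sum(IntSlice s)
{
    assert(s.ptr != NULL || s.len == 0);
    long total = 0;
    for (size_t i = 0; i < s.len; i++)
        total += s.ptr[i];
    return total;
}
```

Given `int data[4]`, `slice_sum(SLICE_OF(data))` needs no separate length
argument, and `slice_get` refuses index 4 instead of reading past the
end.  The check still has to be written and called by hand here, which
is exactly the job the post argues the language should be doing.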

The second biggest cause of memory safety problems IMO is not using a
GC. I.e., manual memory management.

Memory management is a very complicated task, and humans simply aren't
good at doing it.  I've been there and done that -- it *is* possible to
write memory-safe code with manual memory management, but it takes a lot
of time, a lot of effort, and a lot of experience, and *one* small
slip-up (among the millions conscientiously avoided by careful coding)
can cost you dearly.

It also constantly distracts the programmer from focusing on the problem
domain: everywhere you look in a non-trivial program, you need to
address memory management, and this becomes a tax that you pay at every
turn. APIs are uglified because you have to address memory management
somehow.  Libraries become gratuitously incompatible because they were
written with different memory management schemes in mind.  Your code and
design suffer because you're forced to direct so much mental effort
towards micro-managing your memory, rather than focusing on solving the
problem domain.

And the incentives are all wrong: because you have to pay memory
management tax at every turn, and because manual memory management is so
onerous, you end up preferring solutions that simplify or reduce memory
management, rather than solutions that better fit the problem domain.
For example, using strlen and copying on append / substring everywhere,
rather than a more efficient method like slicing, because keeping track
of when to free those slices will complicate your code so much (plus, it
would be incompatible with the pervasive char* interfaces of all those
libraries you depend on), that it's simply not worth the effort.  So
APIs end up being poorly designed in order to simplify memory
management, e.g., store an error message in a global (with the
associated messiness of subsequent calls overwriting previous error
messages, etc.), rather than allocating a message string, because doing
the latter would require facing tricky issues of ownership and who's
responsible for cleaning up.  Poorer algorithms end up being chosen
because they're quick and easy, memory-management-wise, whereas better
algorithms would make the memory management involved so complicated that
it would be a monumental effort to pull off.
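
The substring tradeoff mentioned above can be sketched in C as follows.
StrView, view_of, substr_view, substr_copy and view_equals are all
hypothetical names made up for this illustration.  The copying version
is easy to reason about but allocates every time; the slicing version
is cheap but leaves open the question of who keeps the underlying
buffer alive, and its result is not NUL-terminated, so it cannot be
handed to a char* interface directly:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string view: borrows bytes, owns nothing. */
typedef struct {
    const char *ptr;
    size_t      len;
} StrView;

/* View over a whole NUL-terminated string. */
StrView view_of(const char *s)
{
    return (StrView){ s, strlen(s) };
}

/* Copying substring [start, end): allocates a fresh NUL-terminated
   string.  The caller owns the result and must free() it.  Returns NULL
   if the allocation fails. */
char *substr_copy(const char *s, size_t start, size_t end)
{
    assert(start <= end && end <= strlen(s));
    size_t n = end - start;
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, s + start, n);
    out[n] = '\0';
    return out;
}

/* Slicing substring [start, end): no allocation and no copy, but the
   result is only valid while the underlying buffer is alive. */
StrView substr_view(StrView s, size_t start, size_t end)
{
    assert(start <= end && end <= s.len);
    return (StrView){ s.ptr + start, end - start };
}

/* Compare a view against a NUL-terminated string. */
int view_equals(StrView v, const char *lit)
{
    size_t n = strlen(lit);
    return v.len == n && memcmp(v.ptr, lit, n) == 0;
}
```

With a GC, the view would keep its buffer alive on its own; without
one, every holder of a StrView has to know when the owner frees the
buffer, which is the bookkeeping that pushes people back to
substr_copy.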

And in spite of all this effort and these compromises, memory safety
problems continue to plague C/C++ codebases on a regular basis.


Pairing length with a pointer to make an array/slice, and having a GC,
are big advances in increasing memory safety of software.  They address
what I consider to be two of the top causes of memory safety problems.
Unfortunately, many folks with a C/C++ background seem to be allergic to
the GC, and will undoubtedly hate me for saying that not using a GC is
one of the leading causes of their memory safety problems. But the
historical facts speak for themselves.


T

-- 
Gone Chopin. Bach in a minuet.

