Why Bloat Is Still Software’s Biggest Vulnerability

H. S. Teoh hsteoh at qfbox.info
Mon Feb 12 17:30:23 UTC 2024


On Mon, Feb 12, 2024 at 03:55:50PM +0000, Paolo Invernizzi via Digitalmars-d wrote:
> On Monday, 12 February 2024 at 15:03:01 UTC, tim wrote:
> > On Monday, 12 February 2024 at 14:49:02 UTC, tim wrote:
> > > I thought I would get a discussion started on software bloat.
> > > 
> > > Maybe D can be part of the solution to this problem?

No amount of D innovation is going to stop programmers infected with the
madness of dynamic remote dependencies that pull in an arbitrary number
of external modules -- potentially a different set of them every time
you build.  Tools like cargo or dub actively encourage this model of
software development.

Which is utterly crazy, if you think about it. Unless you pin every
dependency to an exact version (who even does that?!), every time you
build your code you're potentially getting a (subtly) different set of
dependencies. That means the program you were debugging five minutes
ago may not even be the same program you're debugging now.  Now of
course it's possible to turn off this behaviour while debugging, but
still, the fact that it's the default behaviour is just nuts.
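
To be fair, the mechanism for sanity at least exists: dub records the
exact version of every resolved dependency (transitive ones included)
in a dub.selections.json file, and committing that file to version
control freezes your builds. A minimal sketch of what it looks like
(the package names here are made up for illustration):

    {
        "fileVersion": 1,
        "versions": {
            "somehttplib": "1.4.2",
            "somejsonlib": "0.7.1"
        }
    }

Delete that file, though, and the next build happily re-resolves
everything to whatever is newest -- which brings you right back to the
default behaviour described above.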

Over the long term, this means that you cannot reliably reproduce older
versions of your software -- because the dependency versions that
version 1.0 was built against may not even exist anymore, now that your
program is at version 2.0.  If your customer reports a problem, you have
no way of debugging it; you can't even reproduce the exact image your
customer is running anymore, let alone make any fixes to it. The only
thing left to do is to tell them "just upgrade to the latest version".
Which is the kind of insanity that's familiar to every one of us these
days.  Never mind the fallacy that "newer == better" -- especially not
in the current atmosphere of software development, where so-called
"patch" releases are not patch releases at all, but full-featured new
releases complete with full-fledged new, untested features (because why
waste resources making a patch release plus a separate new-feature
release, when you can bundle the two together, save development costs,
and give Marketing all the more excuse to push new features onto
customers and thereby make more money).  The number of bugs introduced
with each "patch" release may well exceed the number of bugs fixed.

All this not even to mention the insanity that sometimes specifying just
*one* dependency will pull in tens or even hundreds of recursive
dependencies. A hello world program depends on a standard I/O package,
which in turn depends on a date-formatting package, which in turn
depends on the locales package, which in turn depends on the internet
timeserver client package, which depends on the cryptography package,
ad nauseam.  And so it takes a totally insane number of packages just
to print Hello World on the screen.
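
The tooling will at least let you inspect the damage. Something along
these lines (exact output shape varies by tool and version) shows the
full resolved set:

    $ cargo tree          # Rust: prints the transitive dependency graph
    $ dub describe        # D: dumps a JSON description of every
                          # resolved package in the build

Running either on a non-trivial project is a sobering exercise.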

Not to mention that the whole concept of depending on some 3rd party
code that lives on some remote server somewhere out there in the wild
wild west (www) of the 'net is just crazy.  The article linked below
alludes to obsolete NPM / Node packages being taken over by malicious
actors in order to inject malicious code into unwitting software.
There's also the problem that your code is not compilable if for
whatever reason you lose network connectivity. Which means if you
suddenly find yourself in an emergency and have to make a small fix to
your program, you won't be able to recompile it. Good luck.
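
The partial mitigation is to vendor everything locally so that a build
never needs the network. A sketch using dub (exact flags may vary by
version; the package name is hypothetical):

    # pre-fetch into the local cache while connectivity still exists
    $ dub fetch somehttplib

    # or register a copy checked into your own source tree
    $ dub add-local ./vendor/somehttplib

    # then build without consulting any registry at all
    $ dub build --skip-registry=all

But again: this is opt-in discipline, not the default.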


> > https://spectrum.ieee.org/lean-software-development
> 
> Agreed .. two days ago I needed to pull a 13GB docker image from
> Nvidia repository ... a totally out of control mess.
[...]

Reducing code size is, to paraphrase Walter, plugging just one hole in a
cheese grater. There are so many other things wrong with the present
state of software that code size doesn't even begin to address.

Today's web app scene is exemplary of the insanity in software
development. It takes GBs of memory and multicore GHz CPUs to run a
ridiculously complex web browser, just so it can run some bloated web
app with tons of external dependencies at the *same speed* at which an
equivalent lean native program in the '80s used to run on 64 KB of
memory and a 16 MHz single-core CPU.  What's wrong with this picture?

And don't even get me started on the IoT scene, which is a
mind-bogglingly insane concept in and of itself. Why does my toaster
need to run a million LoC operating system sporting an *internet
connection*?!  Or indeed, a *stuffed animal toy* that some well-meaning
parent gave my son as a "gift", one with a built-in internet interface
for downloading audio clips (it's cute: it downloaded a clip of my
son's name so that the toy could address him by name -- WHY OH WHY...
argh).  I betcha the OS running on this thing hasn't been updated in at
least 5 years (and isn't ever going to be), and carries who knows how
many unpatched security vulnerabilities. I wouldn't be surprised if a
good chunk of today's botnets consist of exploited household appliances
running far more software than they actually require for their primary
function. Perhaps this
internet-"enabled" stuffed animal is among the esteemed members of such
a botnet. (Thankfully the battery has run out since -- and I'm not
planning to replace it, ever. Sorry, botnet.)  These are just milder
examples of the IoT madness.  Don't get me started on internet-enabled
webcams that can be (and have been) used for far more nefarious purposes
than running some script kiddie's botnet.

Years ago, if somebody had told me that some random car driving by the
house could hack into my babycam and make it emit a scary noise to scare
the baby, I'd have laughed them out of the house as a delusional
paranoiac.  Unfortunately, today this is actual reality, thanks to
insecure, misconfigured WiFi routers whose OSes haven't been updated in
eons and household appliances with internet access that they have no
business having.

In principle, the same thing applies to Docker images that contain far
more stuff than they rightly should.  Thanks to these non-solutions to
security issues, nowadays it's no longer enough to keep up with your
OS's security patches, because patching the host OS does not patch the
OSes bundled with each Docker image. And for many applications, nobody's
gonna patch their Docker images (the whole reason they went the Docker
route is that they can't be bothered with actual, proper integration
with their host OS; they just want to target a static, known OS that
works for their broken code, and therefore have zero incentive to make
any changes at all now that their code works).  So your host OS may very
well be completely patched, but thanks to these needlessly bloated
Docker images your PC still has as many security holes as a cheese
grater.
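
The irony is that Docker itself is perfectly capable of producing
near-empty images, if anyone cared to build them that way. A sketch of
a multi-stage Dockerfile that ships nothing but a single binary (base
image and names are illustrative, and the build script is hypothetical;
static linking details depend on your toolchain):

    # build stage: full toolchain, discarded after the build
    FROM alpine:3.19 AS build
    WORKDIR /src
    COPY . .
    # assumes the project builds to a statically linked ./myapp
    RUN ./build-static.sh

    # runtime stage: no shell, no package manager, no OS to patch
    FROM scratch
    COPY --from=build /src/myapp /myapp
    ENTRYPOINT ["/myapp"]

An image like that has nothing in it to patch except your own code --
which is rather the point.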

//

And there's the totally insane concept of running arbitrary code from
unknown, untrusted online sources: JavaScript, ActiveX, scripting in
emails, in documents, etc.  Eye candy for the customer, completely
unnecessary functionally speaking, and an absolute catastrophe
security-wise. The entire concept is flawed to begin with, and things
like sandboxing, etc., are merely afterthoughts, bandages that don't
actually fix the festering wound underneath.  Sooner or later something
will give.  And the past 20 or so years of internet history proves this
over and over again, to this very day.  But in spite of the countless
arbitrary-code execution vulnerabilities, nobody is ready to tackle the
root of the problem: 3rd party code from unknown, untrusted online
sources has NO BUSINESS running on my PC. Yet almost every major
application these days is practically dying in its eagerness to run
such code -- by default. Your browser, your email reader, your word
processor, your spreadsheet app, just about everything, really, just
can't wait to get their hands on some fresh unknown 3rd party code in
order to run it at the user's expense.
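
Tellingly, about the strongest knob the web platform offers today is
itself opt-in: a site can declare, via a response header, that the
browser must refuse any script the site didn't serve itself, inline
scripts included. For example:

    Content-Security-Policy: script-src 'self'

Everything not carrying such a header -- i.e., nearly everything --
runs whatever it likes.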

And the usual anemic response when a major exploit happens shows that
what the security community is doing -- all they can do given the
circumstances, really -- is, to quote Walter again, merely plugging
individual holes in a cheese grater.

//

The underlying problem is that the incentives in software development
are all wrong these days. Instead of incentivising code quality,
security, and conservation of resources, the primary incentive is money.
I.e., ship software as early as possible in order to beat your
competitors, which in practice means do as little work as you can
possibly get away with in order to get the product out the door. Code
quality is a secondary concern (we're gonna throw it all out by next
release anyway), conservation of resources is a non-issue (resources are
cheap, just tell the customer to buy the latest and greatest hardware,
our hardware partners will give us a kick-back for the free promotion),
and security isn't even on the list.  Developing software the "right"
way is not profitable; questionable practices like importing millions of
LoC from dynamic remote dependencies get the job done faster and lead
to more profit, therefore that's what people will do.

And of course, this state of incentives is good for big companies that
are making huge profits off it, so they're not going to let things
change for the better as long as they have a say in it. And they're the
ones that are employing and paying programmers to produce this trash, so
anyone who doesn't agree with them won't last very long in this career.
Therefore guess what kind of code the majority of programmers are
producing every day.  Definitely not lean, security-conscious code.

As someone once joked, the most profitable software venture is a
business of two departments: virus writers and anti-virus development.
Welcome to software development hell.


T

-- 
Life is complex. It consists of real and imaginary parts. -- YHL

