We live in a world of software, and every day humanity becomes more dependent on it. This creates a never-ending problem: new versions of software introduce new features (you can do this now!), but new features introduce new bugs. The worst is when a bug breaks something that used to work, and you didn't even want the new feature anyway! Worse still, sometimes the bug is subtle, takes a while to find, and surreptitiously ruins a bunch of work in the meantime. So what can we do to fix this problem? It's time for a new paradigm.

Does old = stable?

My laptop's operating system is CentOS 7, a version of Linux based on Red Hat Enterprise Linux, which is itself a famously stable distribution designed for large-scale commercial use (think server room). The virtue of CentOS is stability. It is supposed to have all the bugs worked out, so unlike Fedora, my previous OS, it shouldn't randomly crash on me.

A downside to this stability is that many of the free programs distributed with CentOS are also meant to be extra stable, which means they are generally a few years old. So even though the GNU Compiler Collection (gcc) is up to version 7.2, I only have version 4.8. And that's fine; I don't need the new features available in gcc 7.2.

Another program I frequently use is a mathematical plotting program called gnuplot (not associated with GNU). gnuplot makes superb mathematical figures (line and scatter plots), and I use it for all of my physics papers. Because CentOS 7 is stable, it comes with gnuplot 4.6.2 even though gnuplot's official "stable" version is 5.2. Again, I'm willing to sacrifice features for stability, so I don't mind missing out on all the hot new stuff in 5.2.

But this model — using older versions because they're more stable — doesn't always work, because there's a bug in gnuplot 4.6.2. The details aren't important; it's a rather small bug, and it's already been fixed in later versions. Nonetheless, the bug is there, and it's annoying.

This brings me to my observation. Software comes in versions (v4), sub-versions (v4.6), and sub-sub-versions (v4.6.2). The idea is that big new features require a new version, lesser features a new sub-version, and bug fixes only a new sub-sub-version. But this doesn't always hold true, and it's a difficult system to manage.
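
For concreteness, here is a minimal sketch in Python of how this three-number scheme is supposed to work; the version strings and function names are my own illustration, not part of any real package manager.

```python
# A minimal sketch of the version / sub-version / sub-sub-version scheme
# (essentially what's now called semantic versioning). Version strings are illustrative.

def parse_version(s):
    """Split 'major.minor.patch' into a tuple of ints, e.g. '4.6.2' -> (4, 6, 2)."""
    return tuple(int(part) for part in s.split("."))

def is_bugfix_only_upgrade(current, candidate):
    """Under the scheme above, an upgrade is 'bug fixes only' when the version
    and sub-version match and only the sub-sub-version increases."""
    cur, cand = parse_version(current), parse_version(candidate)
    return cur[:2] == cand[:2] and cand[2] > cur[2]

print(is_bugfix_only_upgrade("4.6.2", "4.6.6"))  # True: patch-only, supposedly safe
print(is_bugfix_only_upgrade("4.6.2", "5.2.0"))  # False: new features, new bugs
```

The catch is that the numbers are only a promise: nothing in the scheme itself guarantees that a sub-sub-version really contains no new features, which is exactly the problem below.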

In this system, updates have to be manually added to the CentOS repositories. New versions can't be added automatically, because they might add new features. How do the CentOS developers know that 4.6.3 only fixes bugs and doesn't add any features? They have to read the release notes of every supported program and hand-pick the sub-sub-versions that only fix bugs. This is probably why CentOS is still using gnuplot 4.6.2 when it should be using 4.6.6 (the most recent patchlevel).

Furthermore, perhaps I as a user only want one new feature from 5.1, and couldn't care less about all the rest. And what if it's the features I don't want that introduce the bugs? In this scenario, I upgrade to 5.1, which is mostly a bunch of stuff I don't want, and get a bunch of bugs I definitely didn't want.

New Features and Bug Fixes

This brings me to the new paradigm I wish to endorse. The idea is not new to me, and I probably wasn't the first person to have it, but I don't have time to scour the internet to assign credit, which means you don't have to give me any.

Features should be kept separate from bug fixes. Since new features depend on old features (feature XYZ depends on feature XY), features will have to exist in a dependency tree, with a clear hierarchy ascending all the way back to the minimal working version of the program (the original feature). With the tree in place, it should be possible to associate every bug with exactly one feature — the feature whose implementation broke the code. So even if a bug is only discovered when you add feature XYZ, if the bug was introduced in feature XY (and has been silently wreaking havoc), you can fix it in feature XY and apply the fix to every version of the program that uses feature XY.
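
Here's a minimal sketch of what such a tree might look like in code; the Feature class and the feature names are hypothetical, just to make the bookkeeping concrete.

```python
# A hypothetical sketch of the proposed feature tree. Each feature points to the
# feature it builds on, and each bug fix attaches to the feature that introduced
# the bug, not to the feature that happened to expose it.

class Feature:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent    # None only for the root: the minimal working program
        self.bug_fixes = []     # patches for bugs this feature introduced

    def ancestry(self):
        """Everything this feature depends on, walking back up to the root."""
        node, chain = self, []
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain

x   = Feature("X")                # the minimal working program
xy  = Feature("XY", parent=x)     # builds on X
xyz = Feature("XYZ", parent=xy)   # builds on XY

# The bug surfaces while testing XYZ, but the offending code shipped with XY,
# so the fix lives on XY and reaches every build that includes XY.
xy.bug_fixes.append("fix off-by-one introduced with XY")

for feature in xyz.ancestry():
    print(feature.name, feature.bug_fixes)
```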

This paradigm is similar to the current model (versions, sub-versions), but notably different in that I can pick and choose which features I want. The dependency tree takes care of getting the necessary code (and making sure everything plays nice). With this paradigm, I only subject myself to the bugs introduced by the features I actually want. And when bugs are discovered in those features, I can automatically patch the bugs without worrying that the patch introduces new features which introduce their own subtle bugs. 
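
To illustrate the pick-and-choose part, here's a sketch of how a resolver might work out what my custom build needs; the feature graph is invented for the example.

```python
# An invented feature graph: each feature lists what it directly builds on.
DEPENDS_ON = {
    "X": [],         # the minimal working program
    "XY": ["X"],
    "XYZ": ["XY"],
    "XW": ["X"],     # a sibling feature I don't want
}

def closure(wanted):
    """The wanted features plus everything they transitively depend on."""
    needed, stack = set(), list(wanted)
    while stack:
        feature = stack.pop()
        if feature not in needed:
            needed.add(feature)
            stack.extend(DEPENDS_ON[feature])
    return needed

# Asking for XYZ pulls in X and XY but leaves XW, and any bugs XW
# introduced, out of my build entirely.
print(sorted(closure(["XYZ"])))   # ['X', 'XY', 'XYZ']
```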

Managing the dependency tree sounds like a bit of a nightmare, at least until someone extends git to follow this paradigm easily. At that point, I would say the whole thing could become pretty automatic. That way, if gnuplot pushes bug fixes to some old features, the CentOS repositories can automatically snatch them up for my next update. No manual intervention needed, and this annoying gnuplot bug doesn't persist for months or years.

Of course, in proposing this paradigm I am committing the ultimate sin: declaring that things should work a certain way, with no intention of implementing them myself. But what can I do? I'm a physicist, not a computer scientist. So the best I can do is yak and hope the right person is listening.
