The Economics of Perfect Software

Ask 100 CEOs of software companies if they want to ship software with bugs. What will they say? 50 won’t answer at all, saying something about how bugs are a huge problem in the industry that needs to be addressed; 40 will say “Of course not!” and promptly call their shark tank in preparation for a lawsuit; 9 will hang their heads and say “we can’t help it”; and that last 1 will look you straight in the eye and say “Absolutely.”

I have no idea what that last guy’s doing heading up a software company, because he studied economics.

Software can’t be written bug-free, so if you want to ship perfect software you have to fix the bugs that burrow their way into your code. (And just to head this one off at the pass: No, unit testing, agile processes, scrum, and whatever methodology du jour you may be thinking of won’t prevent all bugs from entering your code base. If I’m wrong, I’m sure you’ll tell me in the comments.)

As you’d expect, the more time and money you throw at fixing bugs, the more bugs you’ll fix. But, unfortunately, our old nemesis from economics, the Law of Diminishing Returns, applies to this process. Formally, the Law states that “the marginal production of a factor of production starts to progressively decrease as the factor is increased, in contrast to the increase that would otherwise be normally expected.” In regular-people English, that just means that how much you get out of a process isn’t the same as what you put in across the board. Instead, you end up with a quick ramp on output at the low end of input, and a long tail on output at the high end of input.

For example, imagine a program has 100 bugs, and we know it will take 100 units of effort to find and fix all 100 of those bugs. The Law of Diminishing Returns tells us that the first 40 units of effort would find the first 70 bugs, the next 30 units of effort would find the next 20 bugs, and the next 30 units of effort would find the last 10 bugs. This means that the first 70 bugs (the shallow bugs) are cheap to find and squash at only 40 / 70 = 0.571 units of effort per bug (on average). The next 20 bugs (the deep bugs) are significantly more expensive at 30 / 20 = 1.5 units of effort per bug, and the final 10 bugs (the really deep bugs) are astronomically expensive at 30 / 10 = 3 units of effort per bug. The last 10 bugs are more than 5 times more time- and capital-intensive to eliminate per bug than the first 70 bugs. In terms of effort, the difference between eliminating most bugs (say 70%-90%) and all bugs is huge, to the tune of a 2x difference in effort and cost.
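To make the arithmetic concrete, here is a minimal Python sketch that recomputes the per-bug cost of each tier. The tier figures come straight from the example; the names `TIERS`, `cost_per_bug`, and `cumulative` are just illustrative.

```python
# Each tier: (label, units of effort spent, bugs found), taken from the example.
TIERS = [
    ("shallow", 40, 70),
    ("deep", 30, 20),
    ("really deep", 30, 10),
]


def cost_per_bug(effort, bugs):
    """Average units of effort spent per bug within one tier."""
    return effort / bugs


def cumulative(tiers):
    """Yield (label, total effort so far, total bugs fixed so far)."""
    total_effort = 0
    total_bugs = 0
    for label, effort, bugs in tiers:
        total_effort += effort
        total_bugs += bugs
        yield label, total_effort, total_bugs


if __name__ == "__main__":
    for label, effort, bugs in TIERS:
        print(f"{label}: {cost_per_bug(effort, bugs):.3f} units per bug")
    for label, total_effort, total_bugs in cumulative(TIERS):
        print(f"after the {label} tier: {total_bugs} bugs fixed for {total_effort} units")
```

Read the cumulative lines as a menu of stopping points: 70 bugs for 40 units, 90 bugs for 70 units, or all 100 bugs for the full 100 units.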

And in real life it’s actually worse than that. Because you don’t know when you’ve killed the last bug — there’s no countdown sign, like we had in our example — you have to keep looking for more bugs even when they’re all dead just to make sure they’re all dead. If you really want to kill all the bugs, you have to plan for that cost too.

So killing all the bugs in a program is expensive. But let’s imagine for a minute that a software company decides to do it anyway. Software companies don’t set goals like “ship with no bugs” — they set goals like “ship on November 19th” instead — so this new goal would require changes to the company’s testing team and/or development schedule (either planned or unplanned), which in turn would imply an increase in their budget. Now, who do you imagine will pay the difference on their budget? The Company? (Heh.) If you haven’t worked in software, let me give you a hint: uh uh. The company will pass the cost on to the customer. So if you like software you can afford, I have news: you like buggy software. (And Open Source software is the same, by the way, except that instead of having to pay more and wait longer, you’d just have to wait longer. And possibly put up with more-ornery-than-normal developers.)

Now, to be clear, I’m not saying that companies should ship software with lots of big bugs. I’m saying they should ship it with a few little ones.

How do you know whether a bug is big or little? Think about who’s going to hit it, and how mad they’ll be when they do. If a user who goes through three levels of menus, opens an advanced configuration window, checks three checkboxes, and hits the ‘A’ key gets a weird error message for his trouble, that’s a little bug. It’s buried deep, and when the user hits it, he says “huh,” clicks a button, and then goes on his merry way. If your program crashes on launch for a common setup, though, that’s a big bug. Lots of people will hit it, and they will all be pissed.

Ergo, I propose the Golden Rules for Deciding When Your Software Is Ready for Prime Time. The Golden Rules state that you should keep testing your software and fixing bugs until the new bugs you find:

  1. Aren’t embarrassing to your company.
  2. Won’t tick off your customers.

The cost of fixing all the bugs in your program and then being sure you fixed them all is way too high compared to the cost of having a few users hit some bugs they won’t care about. The mindset here is not to use your customers as your testers — you’re bound to violate the golden rules if you do that — but rather to recognize that not all bugs are created equal, and some bugs justify not shipping a product while others don’t. Don’t be afraid to ship software with bugs. If you’ve got a good product that people want, a couple bugs won’t bother them at all, especially if updates to your product are easy to deploy, as they are with SaaS or a web application.

If your testing passes the Golden Rules, then your customers want your software more than they want you to fix the few little bugs that are left. So release already!

Oh, and don’t forget to ask that last CEO for stock tips. Economists always have the best portfolios.

Just leaving this comment to say you're right!

Although the Golden Rule is hardly a concept that exists in business. So here’s how I’d reframe the decision-making in terms of money:

The equilibrium point (time to ship) is here: Cost of finding all remaining bugs >= Lost money due to bugs (sales losses due to malfunctioning product and loss of confidence in company, lawsuits, etc)

Of course, these are unknowns that cannot be measured at ship-time, but you can wait until you are confident (90% for example) that you’ve reached the point. It’s all fairly subjective and dependent on the person estimating these values.
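A rough sketch of that decision rule in Python, for illustration only. The function name and the dollar figures are made up for this example, and passing both sides in as plain numbers glosses over the point just made: in practice both are subjective estimates.

```python
def should_ship(cost_to_find_remaining_bugs, expected_loss_from_bugs):
    """Ship once finding the remaining bugs costs at least as much as leaving them in."""
    return cost_to_find_remaining_bugs >= expected_loss_from_bugs


# Hypothetical estimates, in dollars, invented for this example.
print(should_ship(250_000, 40_000))   # True: more bug hunting costs more than the bugs would
print(should_ship(20_000, 500_000))   # False: the remaining bugs are too expensive to ship
```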

For an interesting case example where the second part of the equation is very large, look at Fishman’s ‘They Write the Right Stuff’ (1996).

Thanks! And Great Link.

I like the way you phrased that “point of equilibrium.” There is a cost associated with the bugs you leave in your program, and you know you’re done when the cost of all the bugs left in the program is just shy of the cost you would incur to fix them. I’m guessing you studied economics or business. :)

And great reference. I’m actually writing a blog about that very article as we speak. (Sort of a response to this article, actually.) It’s a great article on the Space Shuttle software, which is proof that perfect software exists, and how they wrote it.

As a sneak peek, the whole blog boils down to “List the goals for your project. Is writing perfect software your top goal? No? Then you won’t write perfect software. Is making money anywhere on that list? Yes? Then you won’t write perfect software.”

I’ll post another response here when I publish it. :)

"software can't be written bug-free"

Why not? This seems to be a tremendous myth, even though there are techniques for producing bug-free software (e.g. SPARK, model-checking) given correct requirements.

Bug-free software requires a substantial mind-shift by both technologists and managers, and I think changing managers is the bigger problem. (1) you can’t just throw bodies at the problem, you need to invest in tools, techniques and an understanding that not everyone is a good programmer/software engineer. (2) our current set of languages and tools, coupled with our “rush it out the door” culture, must change.

If we adopt bug-free implementation techniques, then we’ll need to shift the focus to better requirements/specification approaches.

Are you sure you know what a "little bug" is?

What you think is a “little bug” might be the only obvious symptom of a broader problem, such as a security hole, or part of a future maintenance nightmare. In my 20+ years of software development, I’ve yet to meet a manager who fails to see the value of prioritizing tasks, but I’ve met plenty who underestimate the cost of allowing software defects to accumulate.

The logic is not so simple in a hostile world...

Your Golden Rule assumes that the world stays still around your software. Which is unfortunately far from true when you are living in a world full of people who can benefit by making your software misbehave.

Just because it’s not a problem during testing doesn’t mean that it won’t be a problem after internet worms start sending maliciously crafted data at your software.

Worse yet, sometimes the flaws are built into the protocol by which multiple software applications operate so that it’s hard to ship an upgrade.

I think you need a more nuanced approach.

It depends on the software

If you are writing a piece of software from the ground up, which is very algorithmic, then I would say that it is possible to get near zero bugs. In this case you can unit test everything, because there are known paths. If your hardware stays the same for the next 30 years, then you can also be sure that new bugs won’t appear. It is also possible if there is a single governing philosophy that is enforced from design to implementation and then throughout the life span of the code.

The problems happen when things become more complex and have an increasing number of layers or external dependencies. Also, if any of the points in the previous paragraph are false, then you have added a factor that introduces unknown execution paths. On a PC, trying to get perfect software is going to be very hard: OS bugs, hardware bugs, variable setups, malicious users, unspecified behaviour in something you depend on. Worse, for something you depend on, the specification clearly says A, the implementation says B, the developer who left remembers doing C, and somewhere along the line you are left trying to work out how to join the dots, while still trying to meet the deadline and the budget.

None of this is an excuse for bugs, especially the ones that are obvious, but it does mean that you have to tailor your expectations to the situation.

The real goal here

To steal a quote from Boris Beizer “Testing is not an act. It is a mental discipline that results in low-risk software without much testing effort”. With a focus on making software more testable from its inception, you reduce the effort required to find each bug. I guess my point is that you didn’t cover the option of having better test practices (instead of extending deadlines or throwing more money at it).

My goal as a test developer is to discover where the software doesn’t work, and to reduce the risk of sending software out into the world that doesn’t perform to an acceptable level. You’re right that this sometimes means not spending the effort to fix the little bugs, but it’s better to have a list of bugs that you are not going to fix than not knowing the bugs are there at all.

Eric Sink

Eric Sink made a similar argument a few years ago:

http://www.ericsink.com/articles/Four_Questions.html

Taxonomies and research are advancing

I will disclose that I’d rather break rocks than write code, and am not a developer. But there are some initiatives which may be of interest and are definitely on topic:

https://buildsecurityin.us-cert.gov/swa/

http://cwe.mitre.org/

http://www.geekonomicsbook.com/

-Dan Arista

not familiar with that, thanks

interesting read. thanks for the pointer!

you're right, The Golden Rules are worded a little sloppily

good catch. The Golden Rules should probably say something more like this:

Ergo, I propose the Golden Rules for Deciding When Your Software Is Ready for Prime Time. The Golden Rules state that you should test your software and keep fixing bugs until the new bugs you find: 1. Aren’t embarrassing to your company. 2. Won’t tick off your customers.

…which is slightly different. i didn’t mean you should ever stop testing your software, just that you should stop fixing the bugs you find when the bugs you find don’t matter any more.

perfect software can be written, but...

…you won’t be able to afford it.

perfect software already exists, it’s just really rare. i’ve got another blog in the works that speaks to where perfect software comes from, and what it takes to write perfect software. as a sneak peek, it all boils down to: (1) if perfect software isn’t priority number one, your software won’t be perfect; (2) if making money (or promoting yourself, or…) is anywhere on your list of goals, your software won’t be perfect.

as long as software is written by people, it won’t be perfect. this is largely a matter of practice, as opposed to theory. the people writing the software will always ship it before it’s absolutely perfect because (1) they need or want to get paid, or (2) they want to move onto the next thing.

or, they won’t ship it at all.

The Economics of non-perfect software

If you’re developing a game or another kind of software that is not directly linked to material consequences, then your argument may be valid. However, try using this same argument in a room full of lawyers from a hospital, air carrier, or central bank that was using your software and, because of a known “minor” bug, incurred significant losses. You and your company would go out of business faster than you could finish explaining yourself.

obviously it depends on what you're writing

you’re right — some software is way too important for this kind of hand waving. i’ve actually got a blog in the works about how perfect software gets written, and why. you’ve given some good examples for that article. thanks!

but that there are exceptions doesn’t mean these rules aren’t useful. for most software, this kind of stuff is fine. especially if deploying changes is easy, as i say in the article.

remember, even Newtonian Mechanics don’t work in all cases. this doesn’t mean they’re not useful, though. (to be clear, i’m not claiming these rules are anywhere near as important as Newtonian Mechanics! just giving the first example that came to mind.)

Rules

I think that the golden rules, whilst simplistic, are supposed to be distilled down to that level. That isn’t to say that we can make everything black and white, off and on, true or false; it is to make sure that the tone isn’t altered by giving examples that are perhaps too broad and lose the essence of the point being made.

Software cannot be perfect - economics or no economics

The fundamental issue with perfect software does not lie in economics, though it might be the predominant reason why we ship imperfect software.

Theoretically, the only way to have 100% defect free software is to test all possible workflows and combinations. And that would be actually impossible!!

Just imagine testing windows calculator. One would need to check every imaginable addition, multiplication, division, etc. and even if there were an automated test script that ran a million operations per second, it would still take millions of years to completely test every input and determine the software is bug free. Don’t ask what that would do to the cost of windows calc!!
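A back-of-the-envelope check of that estimate, as a Python sketch. The assumptions are added for illustration and are not the commenter's: two 32-bit integer operands, four binary operations, and the million tests per second mentioned in the comment.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_test_exhaustively(operand_bits, operations, tests_per_second):
    """Years needed to try every pair of operands for every binary operation."""
    input_pairs = 2 ** (2 * operand_bits)  # every (x, y) combination
    total_tests = input_pairs * operations
    return total_tests / tests_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = years_to_test_exhaustively(
        operand_bits=32, operations=4, tests_per_second=1_000_000
    )
    print(f"{years:,.0f} years")
```

Under those assumptions the answer comes out at about 2.3 million years, which fits the "millions of years" claim. Moving to 64-bit operands multiplies the figure by another 2^64.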

That is why even NASA, which spends 100 or 1000 times more effort and money on testing their code, cannot guarantee that their code does not have any defects. It can only have no known defects! Not because they don’t have the money or the time, but because it is technically impossible to check the trillions (or more) of possible scenarios.

A combination of unit testing, code reviews (which BTW, catch the maximum defects), functional and non functional testing can minimize defects. This is true even for life or mission critical applications.

Same rules

Surely the same rules still apply. It’s just that when you’re coding software for an aircraft carrier or a hospital, your customers will become ticked off at smaller bugs than your typical transaction-processing business.

Thanks...

…for sending us back to the Stone Age of software development.

“No, unit testing, agile processes, scrum, and whatever methodology du jour you may be thinking of won’t prevent all bugs from entering your code base.”

I agree with that statement, but do so from almost 10 years of experience with Agile processes, and particularly the practice of automated unit testing. While I have yet to see systems that are defect free, I have seen one and even two orders of magnitude improvement when developers take responsibility for their code and start to write automated tests. That allows the people whose responsibility it is to test the functionality of a product to do just that - test the functionality. No, systems written this way aren’t defect-free, but they’re a hell of a lot better than most systems written without automated tests.

You also didn’t mention code that’s written Test First. Applying automated unit tests to code after the fact is indeed difficult and expensive. My experience with working Test First is that while you are writing more code (the tests), you’re saving huge amounts of time downstream debugging. Code written this way is also inherently more testable, and always much simpler - I can tell in a couple of seconds if code has been written with or without automated unit tests.

While I agree with you that it’s not economically prudent to attempt to drive 100% of the defects out of code, I would ask that you choose your words more carefully. Your statement above essentially gives developers who don’t use automated unit tests a license to continue shipping defect-riddled code.

Dave Rooney
Westboro Systems
http://www.westborosystems.com

Proving software free of defects is an intractable problem

“software can’t be written bug-free” Submitted by Anonymous on Sun, 03/28/2010 - 18:35. Why not? This seems to be a tremendous myth…

It’s not a myth, it is a fundamental property of computer science. It stems from Alan Turing’s demonstration in 1936 that the Halting problem is undecidable. http://en.wikipedia.org/wiki/Halting_problem

Note: Undecidable doesn’t mean it’s hard, it means it can’t be done. Even enumerating all possible paths through a piece of software is undecidable. So, proof of the kind: “I have tested all possible paths in my software” is impossible.

You can improve the situation by adopting formal methods and indeed by writing many tests. But the guarantee that a piece of software has no defects is, for CS, the equivalent of breaking the Speed of Light for Physics.

Interesting article

Another interesting article is “Why we all sell code with bugs”, at this link:

http://www.guardian.co.uk/technology/2006/may/25/insideit.guardianweeklytechnologysection

ciao

The article was about fixing bugs, not preventing them...

I didn’t see anywhere where the article suggests that developing software without using bug prevention practices was acceptable.

I think the point was that he thought people were going to respond to the article with comments such as, “If you use unit tests, agile processes, scrum… you won’t have bugs in your code in the first place.” So, he included the quote as a pre-emptive response that while those methods are extremely beneficial, they will not eliminate ALL bugs from being written in the first place as much as some people might like to claim they would.

At no point did I read that he indicated it was ‘okay to not use automated unit tests’. The article is simply discussing software after it has been written, whether or not unit tests were used.

re: The article was about fixing bugs, not preventing them...

Automated unit testing as a bug prevention practice? OK, I can certainly accept that.

“At no point did I read that he indicated it was ‘okay to not use automated unit tests’.”

I’ve been coaching teams in Agile practices for 8 years. Trying to write automated tests against existing code is hard, and I’ve watched developers simply give up because they don’t see that it’s an investment… a get rich slow scheme. If a developer hasn’t written automated tests before, do you think that if they read Doug’s article then they’re going to start now?

“Here’s a guy (Doug) who really knows his sh*t… and he’s saying that automated tests aren’t going to find all the bugs. So, to hell with spending time on automated tests - I’m swamped as it is!”

I’ve heard that last sentence a couple of dozen times over the years. Automated testing is the first baby step towards dealing with the steaming piles of software that are running this world. Let’s not give any more reasons for developers to avoid automated testing.

Dave Rooney
Westboro Systems
http://www.westborosystems.com

Point of View

The claims make sense when you view bug-fixing as a separate process, divorced from the process of creating software. As anonymous pointed out, significant gains are to be had by making testability part of the design process. For most software departments, it is a radical change in perspective to be thinking at the design stage “how the heck will we prove to ourselves this software works nearly perfectly?”.

The current fad of [insert favorite extreme testing abbreviation] makes software better — mainly because of the fact that otherwise those programmers were going to do little or no testing (also because some bugs are found earlier, avoiding the economic multiplier of late debugging costs). But extreme testing also plays into your thesis, since the costs are proportional to the size of the code, and the bugs that are left remain difficult to find.

But the really interesting part of your thesis to me is two points: why are those last bugs so hard to find, and how should bugs be prioritized. For me, the solution to both is to play The Assertion Game. The basic rules are simple: bugs are fixed only by adding assertions until one gets a failed assert instead of the bug. No debuggers allowed. And all assertions ship to the customer, along with a framework for reporting them back to you.

What happens when you do this? Probably I’m about the only person who’s ever done this, so I’ll testify. First, you end up with more assertions in your code than you have ever seen before, and local readability greatly increases. Second, since every execution of your code is performing an ever-growing number of “tests”, you catch more bugs sooner — and all “tests” are run on every computer configuration you ever run on, all the time. Third, you invert statistics in your favor. Normally, statistics works against us. If you have 100 functions that are each 99% likely to execute with no bugs, the odds that they’ll all execute with no bugs is about 37%. But with The Assertion Game, once you exceed an assertion density that most programmers have never seen, you’re putting those depressing statistics to good use, because you WANT things to fail — with an assertion. The final advantage is that the customers themselves can automatically dictate the prioritization of fixing shipped bugs, since you get back the assertion reports. Sufficient assertion density means that, instead of non-reproducible debugging nightmares, you usually get back a crisp indication of the point of failure that makes debugging trivial. Automated assertion reports are best, because customers often do not report bugs, leaving developers with an inflated sense of quality (and scratching their heads over why sales keep going down).
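The 37% figure is easy to check. A minimal Python sketch, assuming the 100 functions fail independently of each other (the comment doesn't say so, but the arithmetic implies it):

```python
def chance_all_succeed(per_function_success, function_count):
    """Probability that every function runs bug-free, assuming independent failures."""
    return per_function_success ** function_count


if __name__ == "__main__":
    print(f"{chance_all_succeed(0.99, 100):.1%}")  # prints 36.6%
```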

The psychological advantage of The Assertion Game is that the prevention of bugs is driven by the code itself. You don’t HAVE to add any assertions — so long as you never have any bugs. IME, once you achieve a critical density of assertions, such that it becomes a shocking event when something goes wrong without tripping an assertion, the benefits become obvious and further drive the process. So, where Stroustrup says it is usually unnecessary and wasteful to assert that a pointer is not NULL, I end up saying “Boy, I’ve seen so many bugs end up with a NULL pointer, I think I’ll just assert it right here from the get-go.” The fact that you can only debug by adding assertions also causes a significant mental shift. Most programmers think that their code is readable. But once you have to stare at that code and decide where to put an assertion to trap a particular bug, your standards for readability transform. “What can I say for certain is true at this point in the code?” is very different from a vague and untested “Sure, I know what that code does.”
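For readers who want a picture of what an assertion that ships and reports back might look like, here is a small Python sketch. It is not the commenter's framework, which isn't shown anywhere in the comment. `checked`, `REPORTS`, and `average_length` are hypothetical names, and the "report" is just an in-memory list standing in for whatever would send the failure back to the developer.

```python
import functools
import traceback

# Stand-in for a real reporting channel; a shipped product would send these home.
REPORTS = []


def checked(func):
    """Record any failed assertion raised inside func, then re-raise it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except AssertionError as error:
            REPORTS.append({
                "function": func.__name__,
                "message": str(error),
                "trace": traceback.format_exc(),
            })
            raise
    return wrapper


@checked
def average_length(words):
    # Python's counterpart to asserting that a pointer is not NULL.
    assert words is not None, "words must not be None"
    assert len(words) > 0, "words must not be empty"
    return sum(len(word) for word in words) / len(words)
```

One caveat if you try this in Python: running the interpreter with `-O` strips `assert` statements, so assertions only ship if the program is run without that flag.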

You've misunderstood the nature of the Halting problem.

The Halting Problem is indeed undecidable, but you seem to be misunderstanding what is meant by the term “undecidable”. It doesn’t mean we can never answer this question, it means we can never algorithmically solve this question for all possible inputs.

You can in fact write provably correct (which implies bug-free) software, but it is extremely difficult to do so for non-trivial programs. For example, a trivial program that computes x+y for bounded values of x and y on a very simple microcontroller can be proven to be correct. The problems start appearing when you realize just how low the bar for “triviality” is set. For example, a typical “hello world” program is probably not trivially provable, because the underlying platform that you generally take for granted (which separately includes the OS, drivers, seemingly unrelated user programs running “concurrently”, and various components of the hardware itself that make up a modern computer, which is why I specified a “simple microcontroller” for the previous example) has not been proven to be correct, and neither has the compiler you use. All of these components must be provably correct, which means the overhead of a “perfect program” lies not just in YOUR application, but also in the entire hardware/software stack that it is based on.

So, practically speaking, you cannot create perfect software, but with a massive amount of engineering effort, you could theoretically build a system with well-defined constraints that does deliver provable software features.

Alternately, you just redefine the term “perfect software” to your desired level of rigor, and call it a day. You don’t think any other area of human endeavor is provably correct, do you? If anything, programming is one area where it is more likely than anything else.

i'd like to subscribe to your newsletter

do you blog? i’d be interested in seeing you write about this with a little more detail. it’s an interesting idea, and i’d like to see what you do with it. post back a comment here with a link, and i’ll read it.

if you don’t, i just might. :D
