One of the questions programmers often ask (and are often asked) is, “Are we done yet?” It’s hard to know: unlike spaghetti, you can’t just throw software against the wall to see if it sticks. “When we run out of time” and “when we run out of money” are common answers in the real world, but there are better ways, and this page shows one of them. Each row represents a module in Firefox; the numbers show what percentage of the lines and functions in that module are exercised by tests. The goal definitely isn’t to get 100% coverage of every module: that’s usually impossible, almost always uneconomic, and still doesn’t guarantee that the program will work [1]. It’s also misleading to think that high coverage automatically means high reliability [2]. What coverage numbers do is draw attention to places where we just don’t know what the quality of the code is. As a bonus, if coverage from past test runs is stored somewhere, tables like these can tell us if the percentage of tested code in a module has suddenly gone down, for instance because someone has added several hundred lines of new functionality without adding corresponding tests.
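As a rough illustration of that last point, here is a minimal sketch of how stored coverage numbers from two test runs could be compared. The module names, the percentages, and the five-point threshold are all made up for the example; they are not taken from the Firefox table, and a real setup would read the numbers from whatever the coverage tool writes out.

```python
def coverage_drops(previous, current, threshold=5.0):
    """Return modules whose coverage fell by more than `threshold` percentage points.

    `previous` and `current` map module names to percent of lines covered.
    Modules that appear in only one of the two runs are ignored.
    """
    drops = {}
    for module, now in current.items():
        before = previous.get(module)
        if before is not None and before - now > threshold:
            drops[module] = (before, now)
    return drops


# Hypothetical coverage percentages from two consecutive test runs.
previous = {"layout": 71.2, "network": 64.0, "parser": 80.5}
current = {"layout": 70.9, "network": 52.3, "parser": 81.0}

for module, (before, now) in sorted(coverage_drops(previous, current).items()):
    print(f"{module}: {before:.1f}% -> {now:.1f}%")
```

With these invented numbers only `network` is reported, since it is the only module that lost more than five points between runs.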

So why aren’t coverage stats like these taught in first-year programming courses and expected afterward? Setting up a server to run tests automatically every hour and refresh a page like the example is a bit of work, but IDEs like Eclipse can equally well measure and report coverage on the developer’s desktop. As with many other working practices, I think there are two reasons:

  1. Inertia: we didn’t do it twenty years ago, so today’s teachers don’t think it’s a must-have, so they don’t teach it to today’s students, and around and around we go.
  2. Time pressure: every undergraduate curriculum (not just that of computer science) is already overcrowded. As simple as this idea is, adding it would mean less time for something else, and no one can agree on what “something else” to take out.

[1] See Wikipedia’s discussion of code coverage and path coverage.
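
To make the gap between line coverage and path coverage concrete, here is a small sketch. The function `scale` and its two tests are hypothetical, written only for this illustration. Between them the two tests run every line and take every branch in both directions, yet the one combination of branches they never try together still crashes.

```python
def scale(value, halve, invert):
    """Divide `value` by a divisor chosen by two independent flags (deliberately buggy)."""
    divisor = 1
    if halve:
        divisor = 2
    if invert:
        divisor = divisor - 2  # turns 1 into -1, but turns 2 into 0
    return value / divisor


# These two tests together execute every line of scale().
assert scale(10, True, False) == 5
assert scale(10, False, True) == -10

# The untested path (both flags set) divides by zero.
try:
    scale(10, True, True)
except ZeroDivisionError:
    print("100% line coverage, and still a crash")
```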

[2] Particularly for concurrent code—even if every individual path works correctly, the interactions between multiple processes or threads can cause race conditions, deadlocks, and partial failures.