I get really frustrated with posts that claim your unit tests lie to you or 100% code coverage is fallacious or there are flaws in coverage measurement. These are sensationalist headlines that encourage bad behavior, by confusing new or inexperienced or argumentative or lazy developers: "well, we all know test coverage isn't that important..."
There is a fundamental confusion here between necessary and sufficient. In plain English, things that are necessary are, well, necessary for a process; things that are sufficient are, collectively, sufficient.
One does not imply the other. Tires are necessary but not sufficient for a car.
Why do I bring this up? Because near-100% code coverage is basically necessary for a complete automated test suite, but not sufficient.
If you do not have near-100% code coverage, then some portion of your code is not being executed by your tests. That means that code is untested. It can't possibly be tested if you aren't executing it!
(As a corollary, no test that adds to code coverage is useless. It may not be as useful as it could be, but it is still running code that is otherwise untested. Room for improvement does not equal uselessness!)
OTOH, there are many reasons that 100% code coverage isn't sufficient to make your test suite good. For one, code coverage isn't branch coverage. For another, code coverage can't tell you if your feature set meets your client's needs or is in fact functional -- all it can do is tell you that your feature set meets your expectations at the moment.
So, let me suggest this simple pair of statements:
- High (statement) code coverage in your tests is important.
- High code coverage doesn't guarantee that your tests are complete.
I feel like high code coverage is actually a pretty low bar for a project. Whether or not you're using TDD, if you don't have high code coverage numbers, you're not really testing very thoroughly; it's generally pretty easy to amp up your code coverage numbers quickly, although writing the test fixtures can be a problem. I'll talk a bit about this in my PyCon talk.
Regardless of your tests suite, your project will probably fail in a boring and project-specific way.
Posted by Paddy3118 on 2009-02-19 at 06:56.
Is their ever a reason to **defend** non 100% statement coverage? The only psudo defence I can think of is that those missing statements are tested in a way that does not give results that can be collated in the same way. - Paddy.
Posted by Titus Brown on 2009-02-19 at 10:11.
Paddy, in practice I've found one or two situations. Multi- platform coverage reports can be difficult to integrate -- e.g. one request I've had from Python-Dev is to provide an integrated view of code coverage from Windows, Mac OS X, and Linux. But that's just a tools question. The more real concern is that sometimes there are things that are **very** hard to test any way but manually. Again, this can be thought of as a limitation of the test framework technology... but (for example) an aviation company writing code to send data to a flight box might not be able to test that interaction easily without hacking hardware. Yes, you can mock the interaction -- but then you're not doing proper integration testing that includes the hardware, are you... So in practice I usually suggest > 90% as a good baseline target. Past that I find that increasing code coverage yields lower returns than regression tests or other such things. YMMV; I mostly work on FLOSS projects, so I (frankly) have a slightly lower commitment to 100% quality than I think a company should have ;) --titus
Posted by manuelg on 2009-02-19 at 13:53.
Titus: > ... So in practice I usually suggest > 90% as a good baseline target. Past that I find that increasing code coverage yields lower returns than regression tests or other such things. This is an excellent point. This comment, plus your post make an excellent starting point for understanding where test coverage fits into sound software engineering.
Posted by Michael Foord on 2009-02-19 at 16:31.
Too right. Too often "testing is no silver bullet" is used as an excuse not to test.
Posted by Andrew Dalke on 2009-02-20 at 06:10.
I had a discussion about this recently with Glyph. He posted that 100% coverage was essential. My response and various followups at <a href="http://glyph.twistedmatrix.com/2009/02/joel-un- test.html">http://glyph.twistedmatrix.com/2009/02/joel-un- test.html</a> . Part of it is my assertion that coverage is not part of TDD, because of how refactoring works. Refactored code only needs to pass the suite, and code coverage isn't typically discussed as part of this step. @Paddy: my defense for non-100% coverage by the test suite was that manual testing covered the 1% of the cases I couldn't easily test without a day or two of extra programming. Manual testing takes about 15 minutes. As it was, 99% coverage took 1 day over a project lifetime of 9 days. (It's ongoing, so my numbers are larger than I mentioned in Glyph's comments.) I also give an example of auto-generated code, like a decision tree implemented in C. Failure in the code generation are likely correlated with failures in the test code generation. Of course you could hide things in a state table, but you still won't cover all of state space, which is the real goal. SQLite strives for 100% coverage. That task, and the story behind it, is something I would like to hear more about. Even with tool development specifically meant to support that, they haven't yet achieved it. They are at 99.37% statement coverage and 95.16% branch coverage, says <a href="http://www.sqlite.org/testing.html">http://www .sqlite.org/testing.html</a> . Their consortium agreement at <a hre f="http://www.sqlite.org/consortium_agreement-20071201.html">http://ww w.sqlite.org/consortium_agreement-20071201.html</a> calls for 95% coverage. Try that clause in your next collaboration agreement. ;)
Posted by Chris Perkins on 2009-02-20 at 13:05.
I could not agree with your sentiments more. I have had the argument with other developers so many times whether or not coverage is important. I can tell you that I have worked on projects that had no notion of coverage, and projects that did. The projects with coverage were easier to maintain, were easier to refactor and were more robust in their functionality. I can't count the number of times I find bugs while finishing that last 10% of coverage. Here are my recommendations for coverage percentages: 85% bottom bar for heavy- handed development (read: sprinting) 90% for alpha-quality code. 95% beta release is practical. 100% formal release. Let's face it, TDD is only practical to some extent, so you are going to spend a certain amount of time writing tests after you have something "working". If you look at this time as part of the maintenance of the project, and set goals for release that include coverage gateways, your product will be more stable. I use this principle for www.sprox.org, and it has served me well, very few bug reports have come in. I agree absolutely that branch coverage is more important than coverage, but that is not to say that coverage should be ignored. Coverage provides a nice metric and a good starting point. One question I have is: Does Python provide a good branch coverage tool? cheers. -chris
Posted by Andrew Dalke on 2009-02-20 at 14:13.
Can anyone point me to a non-trivial package (10K SLOC or more) which has 100% code coverage in the unit tests? And which counts as having a "formal release"? I hear the advocacy for it, but I again point to SQLite which has put a lot of work getting to **nearly** 100%, and hopes to get the rest of the way some time this year. I want to know how I should test for malloc/new failures, how to test for network hiccups, how to handle failures with the file system. The only thing I can come up with is having mock objects for everything, or a shared library shim, which can insert failures in the right locations. But writing full isolation layer for anything involving the outside world seems like a lot of work, and fragile in the face of code changes. I want to see what people have done for this. @Chris? Is any of your code available for me to look at?
Posted by Ned Batchelder on 2009-02-20 at 20:57.
Hey! Why did I get lumped in with the Eeyores? I started my post with "Coverage testing is a great way to find out what parts of your code are not tested by your test suite." I was just trying to point out the same thing you are: even with 100% coverage, you have a lot to learn about what your code does.
Posted by Titus Brown on 2009-02-20 at 23:46.
Andrew, beautiful discussion with Glyph. Thanks for the pointer! The sqlite stuff is also awesome and something I hadn't seen before. Ned, I object to your headline, not your article. I **know** you're into coverage ;) --titus