I am a strong believer in automated tests. They can take longer to build the first time, but you can run them over and over again with no manual effort after that. In this modern age I don’t think I need to make the case for the importance of tests.
But what makes a test a “good” test? A common mistake I see is that people put a lot of thought and design into the program being tested, but not much into the test case itself. Here are some aspects of what I think makes a test case “good”.
- Test cases should test one thing, and MUST NOT test other things that the test case is not for. It is very common for a single test case to test lots of aspects at the same time, and this is my #1 issue with many tests. Why?
- A goal of test cases is to allow parts of the code base to be changed with confidence: if you break something, a test case will catch it. We want changing the system safely to be faster, not slower. If lots of test cases test the same thing, then every change to that part of the system means lots of test cases to update. This slows down development and turns people off automated testing. It is a very serious problem.
- A common test approach is to feed in “golden set data” (data that never changes) and check the output. This approach is useful, but you need to put in the effort to check ONLY the parts of the output relevant to the test case.
- When you develop a test you have a clear vision of what you are testing for. Document this with the test case code, or else over time the test case may be modified and the original purpose or edge case lost.
- Test cases should try to make it clear why the test failed. The more context available the better. This could be as simple as making sure a JUnit test case throws an assertion with lots of state in local variables, so the debugger can easily display that state if the test does fail. Or it could mean finer-grained messages in specific tests, saying things like “the value X was Y but expected to be Z. This may indicate…”. It is that last “this may indicate” that is often missed. If someone other than the test case author triggers the failure, they need to be able to work out what went wrong.
- Test cases should always clean up, leaving everything in the state it was in before the test run. This includes things like deleting temporary files.
- Test cases should be layered so you can run “fast” test cases without having to run slower ones. The definitions of unit, integration, and system tests may in practice be less important than having multiple layers of tests that take different durations to run. For example, the layers might be “run on every compile”, “run before committing source code changes”, and “run before deployment”.
- Test cases should not depend on the order in which test cases are run. This allows tests to be run in any order, individually or in groups, etc. The only exception is when the test environment takes so long to set up that you cannot afford to obey this rule.
- Test cases should be able to run in parallel whenever possible. The cost of supporting this can get prohibitive in some cases, but it is a goal nonetheless. For example, when dealing with sockets, you need a way to keep two concurrent tests from using the same socket, or else make sure those tests never run in parallel.
- Treat test case code like production code. Run style checks over it. Run CPD (copy paste detection) reports over it. Refactor it. Just like you design the real code to be clean and easy to restructure, make your test cases the same.
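To make a few of these points concrete, here is a minimal sketch using Python's built-in `unittest` rather than JUnit. Everything in it is made up for illustration: `summarize_orders` stands in for the code under test and `GOLDEN_ORDERS` for the golden data set. The test documents its purpose, checks only the one field it cares about, explains what a failure may indicate, and removes its temporary file afterwards.

```python
import json
import os
import tempfile
import unittest


def summarize_orders(orders):
    """Hypothetical code under test: per-customer totals plus bookkeeping fields."""
    totals = {}
    for order in orders:
        name = order["customer"]
        totals[name] = totals.get(name, 0) + order["amount"]
    return {"totals": totals, "order_count": len(orders), "format_version": 3}


# Hypothetical golden data: fixed input that never changes.
GOLDEN_ORDERS = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 5},
    {"customer": "alice", "amount": 12},
]


class TotalsPerCustomerTest(unittest.TestCase):
    """Purpose: amounts from a repeat customer are summed, not overwritten."""

    def setUp(self):
        # Create a temporary input file and register its removal immediately,
        # so it is deleted whether the test passes, fails, or errors.
        fd, self.path = tempfile.mkstemp(suffix=".json")
        self.addCleanup(os.remove, self.path)
        with os.fdopen(fd, "w") as handle:
            json.dump(GOLDEN_ORDERS, handle)

    def test_repeat_customer_amounts_are_summed(self):
        with open(self.path) as handle:
            result = summarize_orders(json.load(handle))

        actual = result["totals"].get("alice")
        expected = 42
        # Deliberately ignore order_count and format_version: other tests own
        # those fields, so changing them should not break this test.
        self.assertEqual(
            actual,
            expected,
            f"total for alice was {actual} but expected {expected}. "
            "This may indicate a later order is replacing the running total "
            "instead of being added to it.",
        )
```

Saved in a test module, this can be run with `python -m unittest`. Comparing the whole result dictionary against a stored copy would be less work to write, but then any change to an unrelated field would break this test too.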
A web search will turn up lots of other articles on this topic. I found a very short one here: http://www.softwaretestingmentor.com/tptc/writing-good-test-cases.php. I include this link because the site has lots of useful terminology for the different test types. Nothing amazing, but it seemed to have good coverage of the concepts.
Note that my list of “goodness” is applicable to the different types of tests.
- Unit tests test a very small unit of code such as a class. If a class A has to call other classes B and C, the test case for A should do its best NOT to check the correctness of B or C. The test cases for B and C should have checked that those classes behave correctly. There are gray areas here of course. The important thing to keep in mind is that the test case for A is for A, not for A, B, and C. It does include the interaction between A and B, just not the logic in B.
- Integration tests worry about multiple units working together. What is the difference between unit and integration tests? Frankly, there are really multiple levels going on here. A unit test tries to be very specific. If a unit test fails you should have a laser beam pointing exactly at what broke and why. An integration test deals with several units. But in a large system you could also have component tests that test, say, interactions between executables possibly running on different machines.
- System tests worry about the whole system you are building and making sure it does what the user expects. Don’t get hung up on terminology here however. If you automate your system tests then really they are just integration tests at a larger scale. But even system level tests should aim to test specific aspects independently. Where the rules start to bend is that setting up an environment for a system test can cost significant time, so you may break the “I can run tests in any order” and “allow them to run in parallel” rules. It is common to trade these off and run tests in a particular order, avoiding the need to reset the environment before each test case.
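As a rough sketch of the unit test point about A and B, here is one way to test a class without testing its collaborator, using Python's `unittest.mock`. The class names are hypothetical: `InvoiceCalculator` plays the role of A and `TaxTable` the role of B. The test checks A's own arithmetic and the call A makes to B, but none of B's logic.

```python
import unittest
from unittest import mock


class TaxTable:
    """Hypothetical class B: its lookup logic is covered by its own tests."""

    def rate_for(self, region):
        raise NotImplementedError("real lookup not needed for this sketch")


class InvoiceCalculator:
    """Hypothetical class A: the unit under test."""

    def __init__(self, tax_table):
        self.tax_table = tax_table

    def total(self, net, region):
        return round(net * (1 + self.tax_table.rate_for(region)), 2)


class InvoiceCalculatorTest(unittest.TestCase):
    """Purpose: A applies whatever rate B returns for the requested region."""

    def test_total_applies_rate_from_tax_table(self):
        # Replace B with a stand-in that has the same method signatures,
        # so this test cannot fail because of a bug inside TaxTable.
        tax_table = mock.create_autospec(TaxTable, instance=True)
        tax_table.rate_for.return_value = 0.25

        calculator = InvoiceCalculator(tax_table)

        # A's own logic: 80.0 plus 25% tax.
        self.assertEqual(calculator.total(80.0, "north"), 100.0)
        # The interaction between A and B: A asked for the right region.
        tax_table.rate_for.assert_called_once_with("north")
```

An integration test for the same pair would pass in a real `TaxTable` instead of the stand-in, which is exactly the point where the test stops being only about A.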
Note I have left “acceptance tests” off the above list. To me terms like “acceptance” and “regression” tests talk about the reason for a test case, not the way it is built.
Final thoughts: worry about your test code quality, keep test cases focused so that a minimal number of tests break when one part of the system breaks, and think about what suites of tests you can run when. Terminology like unit, integration, system, acceptance, regression, etc. can be useful for thinking about things, but when and how tests run may actually be more important.