Saturday, 4 February 2017

All our tests should always pass

Continuous Integration and automation enforce that our tests should be of a very high quality and always pass.  I don't think I always agree.  Tests that regularly pass could be concealing quality problems that could damage your product.

In any CI pipeline new builds are tested with a set of automated testing.  Tools such as chef or jenkins are very good at orchestrating this process.  However they do rely on a set of tests that are almost 'guaranteed' to run.  However if the same tests are always used and they always pass you must ask yourself the following questions.

  1. Are the tests doing what you think that they are.  Are they testing the code you think they are, are they just exercising the code and not testing it?  Beware tests that don't test what you think they do
  2. Are the tests and the newly delivered code in different areas of the product.  If your tests are regression testing one part of the product which is logically separated from the code you are delivering then you might not be testing as fully as possible.
  3. Beware the pesticide paradox.  If the same tests have been used for a long time then it is likely that the code has become 'immune' to the tests.  Areas of the code that are actively tested by the tests are likely to have become so well written over time that a developer is unlikely to introduce an error into this area.
  4. The tests have become stale.  The tests although they exercise the code that is being developed they may not do it in ways that you would expect your customers to use the product.  Maybe they use a legacy way of invoking the product or they use methods that have long fallen out of favour.
Tests that fall into any of the above categories might be lulling you into a false sense of security.  Sure they pass but do they do what you think that they do.   To help keep your pipeline of high quality I have the following recommendations:
  1. Rotate your CI tests.  At different stages of your testing you will use different tests.  Rotating the tests to different stages in the pipeline helps to drive the code differently at different stages of integration and could find some easy to spot defects
  2. Calculate a test efficiency rating by dividing the number of regressions the test has found by the number of times it has been executed.  Tests that are inefficient might be incorrectly testing the code or be testing an area so stabilised that it is unlikely to be regressed.  It might be worth only running this test occasionally
  3. Constantly add new tests into the pipeline.  Writing a new test for an existing part of the product can be a great learning exercise and pull out new defects.
In summary - although you need code to progress through your CI pipeline it is good and almost healthy to find regressions as a regular activity.  If your tests always pass don't become smug, they could be just not telling you that there is a problem

Beware tests that don't test what you think they do

Migrating tests from one framework to another we came across a test suite that taught us a valuable lesson, treat your automation with a level of distrust, unless you have valid reasons or proof that it is doing what you expect.

This test suite was seemingly perfect, it was reliable, ran against all the supported releases and had a run time of 5 minutes.  It's name was simply the name of the technology that it tested.  The only problem was that if it was going to be used as part of our CI pipeline then it would need to migrated to our new test framework.

Most of the components of the test were easily reusable, all that needed to be migrated was the code that provisioned the test system and invoked the test programs.  This took a few days of effort and I was left to review the new migrated test suite.  Everything looked fine.  The only issue that I had was that the test method names were a little un descriptive.  I spoke to the engineers and requested that they ran the test in a debug mode to understand exactly what the test did.  I didn't want them to do any static analysis of the source, but to see what the test did at runtime.

A day later I met with the engineers to see how they were progressing,  they were hitting problems.  As fair as they could see the test never invoked any of the APIs that they would expect given the name of the test suite.  I asked them to show me and I had to concur.  It did appear that the test didn't use the technology we were expecting.

I gave the engineers a list of diagnostics and further tests to double check this finding.  After this was done it was clear the test was simply not testing the function that we thought it did.

The problem with this test is clear.  There had been an assumption that the test 'did what it said on the cover' and that since it always passed it was considered a great test asset.  In fact this was probably the worst test we had.  It built a false level of confidence in the team and could have let regressions into the field.

Naturally we have deleted all evidence of this test suite from all source code repositories and test archives.  However even this action was not without complaints.  A lot of the team felt that even though it was clear the test didn't do what we expected it did do 'something' and so we should keep it. This is the wrong thing to do!

I agree that the test did execute some of the SuT function.  However the code it did execute it only exercised and did not test it.  If the test found a regression would it report the problem or not?

Because the test always passed it wasn't until we decided to migrate it that this problem was uncovered.  Tests that fail regularly (either due to a regression or a test failure) at least get eyeballs on them.  These checks implicitly validate that the test does something of valid use.

So what did we learn from this:

  • Code coverage is great at ensuring that a test is at least executing the code you think that it is.
  • If the test is executing the code you expect and it regularly passes it is worth checking that it is testing the code and not just exercising it.
  • We needed more tests in this area - we have now done this and ensured that they actually do what we expect.

Simple Algebra - solution

First - get rid of the images and substitute some characters that are easier to manipulate:
  1. a + b - c = 4
  2. a - 2b + 3c = -6
  3. 2a + 3b + c = 7

Add 1 + 3 together:
 a + b - c + 2a + 3b + c = 4 + 7

        4) 3a + 4b = 11

Multiply 1 by 3

3a + 3b -3c = 12
add equation 2

3a + 3b -3c +a -2b + 3c = 12 - 6
       5) 4a + b = 6

solve 4 & 5

3a + 4b = 11
4a +  b = 6

b = 6 - 4a

3a + 4(6 - 4a) = 11
3a + 24 - 16a = 11
-13a = -13
a = 1

3a + 4b = 11
3 + 4b = 11
4b = 8
b = 2

Substitute into 3) to get c
2a + 3b + c = 7
2 + 6 + c = 7
8 + c = 7

c = -1