Journey of a Contemplative Architect: The pyramid of automated testing - or why manual testing as a starting point is a bad idea

The ideal versus the reality

Automated testing is one of those topics where, when it is mentioned, everyone nods in concert to the tune of "yeah, we should do that". And in similar unison, the response to "should we do it now?" is "we don't have the time for that".

Why is that?

One common reason is that we often approach automated testing entirely incorrectly and try to apply it as a magical remedy to legacy software (in this scenario meaning software that isn't designed to be automatically tested - which covers a large swathe of the contemporary software landscape).

The cone of test automation from hell

And when we do attempt automated testing, it's often of the variety "let's build end-to-end UI-based tests against this legacy UI with billionty services behind it, and cover all the error cases". Developers have perhaps heard about JUnit and are proud to have added a test suite of three unit tests to the code base. Someone also suggested that perhaps we could create a few tests which work against a specific system's SOAP web service API, and seven tests like this were hand-written over a two-week period.

And then the all-encompassing UI end-to-end tests were created to handle all the rest.

Sound familiar?

If not, count yourself amongst the lucky. This is a common approach to automated testing, often seen in the corporate wilds.

It is known as the cone of test automation from hell (as characterized by Venkat Subramaniam, again, https://www.youtube.com/watch?v=uQ75fI1tqoM).

Why is it the cone of test automation from hell?

Because the tests that are the easiest to write, fastest to run and easiest to maintain are the fewest in number, while the automated end-to-end and manual tests, which take the most resources to maintain and execute, are the most numerous. This is short-term thinking in play.

So what's the alternative?

Pyramid of test automation

The right way

The pyramid of test automation is structured as follows:

The base layer - creating the foundation everything else rests upon and of the highest volume - is constructed with unit testing at the code level

These are the tests which are very fast to both write and execute, and also easy to maintain considering the refactoring tooling in contemporary IDEs
Always run in their entirety as part of both developer builds and the CI/CD pipeline build phase
Can cover close to or even fully 100% of cases in the code
Can cover non-happy paths extensively
Can ensure that various entirely unexpected input is also handled in a consistent manner providing high resiliency and fault tolerance
What they do not do is ensure that different components work happily together; that is left to the next level
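
A minimal sketch of what base-layer tests could look like, using Python's standard `unittest` module. The function `parse_quantity` and its rules are hypothetical, invented purely for illustration; the point is only that one small unit gets a happy-path test plus several non-happy and unexpected-input tests, all of which run in milliseconds.

```python
import unittest


def parse_quantity(raw):
    """Hypothetical unit under test: parse an order quantity from text input."""
    if not isinstance(raw, str):
        raise TypeError("quantity must be a string")
    text = raw.strip()
    if not text.isdigit():
        raise ValueError(f"not a whole number: {raw!r}")
    value = int(text)
    if value == 0:
        raise ValueError("quantity must be at least 1")
    return value


class ParseQuantityTest(unittest.TestCase):
    def test_happy_path(self):
        # Surrounding whitespace is tolerated.
        self.assertEqual(parse_quantity(" 3 "), 3)

    def test_rejects_zero_and_negative(self):
        for raw in ("0", "-1"):
            with self.subTest(raw=raw):
                with self.assertRaises(ValueError):
                    parse_quantity(raw)

    def test_rejects_unexpected_text(self):
        # Unexpected input is handled in one consistent way: ValueError.
        for raw in ("", "abc", "1.5", "1e3"):
            with self.subTest(raw=raw):
                with self.assertRaises(ValueError):
                    parse_quantity(raw)

    def test_rejects_non_string(self):
        with self.assertRaises(TypeError):
            parse_quantity(None)
```

Saved in a file, tests like these can be run with `python -m unittest`, both locally and in the pipeline build phase.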

The middle layer - service-level tests

These are the tests written against the API of a single service (often in a microservice architecture).
They ensure that the service behaves according to its public contract (the API; see my post on the API First principle: https://contemplative-architect-journey.blogspot.com/2019/12/the-value-of-internal-api-ecosystem-and.html)
The database and upstream service integrations probably need to be mocked so that the service level tests can be run in isolation
There can also be variants of the service level tests: one variant tests the individual service together with upstream mocks, and another tests it against upstream services and a database that return pre-configured mock data. The details depend on context and implementation
High coverage of happy paths but only covers the major functional non-happy paths
These tests are not as fast to execute as the unit level but are still often incorporated in the CI/CD pipeline build phase

Optionally some portion can be part of a later stage of acceptance testing in case they take longer to execute
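
As a rough sketch of the mocked variant, the following uses `unittest.mock.Mock` from the Python standard library to stand in for an upstream integration. `PriceService`, `UpstreamUnavailable`, `get_rate` and the response shape are all hypothetical names made up for this example; a real service-level test would typically go through the service's actual HTTP API rather than a method call.

```python
import unittest
from unittest.mock import Mock


class UpstreamUnavailable(Exception):
    """Hypothetical error raised by the upstream rates client."""


class PriceService:
    """Hypothetical service; quote() stands in for its public API contract."""

    def __init__(self, rates_client):
        self._rates = rates_client

    def quote(self, amount_eur, currency):
        if amount_eur < 0:
            return {"status": 400, "error": "amount must be non-negative"}
        try:
            rate = self._rates.get_rate("EUR", currency)
        except UpstreamUnavailable:
            return {"status": 503, "error": "rates unavailable"}
        return {"status": 200, "amount": round(amount_eur * rate, 2),
                "currency": currency}


class PriceServiceContractTest(unittest.TestCase):
    def setUp(self):
        # The upstream integration is mocked so the test runs in isolation.
        self.rates = Mock()
        self.service = PriceService(self.rates)

    def test_happy_path_returns_converted_amount(self):
        self.rates.get_rate.return_value = 1.25
        response = self.service.quote(10, "USD")
        self.assertEqual(
            response, {"status": 200, "amount": 12.5, "currency": "USD"})
        self.rates.get_rate.assert_called_once_with("EUR", "USD")

    def test_upstream_outage_maps_to_503(self):
        # A major functional non-happy path: the upstream is down.
        self.rates.get_rate.side_effect = UpstreamUnavailable()
        self.assertEqual(self.service.quote(10, "USD")["status"], 503)

    def test_invalid_amount_never_reaches_upstream(self):
        self.assertEqual(self.service.quote(-1, "USD")["status"], 400)
        self.rates.get_rate.assert_not_called()
```

Note that the tests only assert on what the contract promises (status codes and payload), not on how the service computes it internally.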

The top layer - the fewest number of tests

Test full end-to-end chains starting from the current service's context and covering upstream (obviously not downstream - that's the downstream services' / applications' job)
Either API or UI based depending on the type of service or application being tested
Usually executed in an environment which is production-like in that the connections between services correspond to production level interactions and nothing is mocked (the data is often dummy data, but that shouldn't affect the overall interactions at the API traffic level)
Expected to take significant time to execute
Ideally these don't cover individual features but rather the major customer use cases and flows

When new features are implemented, a new feature usually doesn't add new top layer tests; instead it updates the use case test flows that it affects. This depends on the case

Covers non-happy paths only when they're relevant for business scenarios and significantly differ functionally from the happy path

Tip of the iceberg - manual testing

Ideally not for verification at all (or only at a very light-weight level)
Instead for exploratory testing and to create insight about the customer experience and how to further improve the end-to-end chain in its various parts

Orthogonal testing aspects

There are some specific types of testing that are not covered by the above-mentioned perspective. This is a list of some of them (and these often need to be handled by specialized personnel in high quality environments):

Penetration testing

Security is a nasty business in that at worst a single mistake anywhere in the chain of calls could potentially expose you to data theft, content defacing or even backend admin access
There is no panacea for security testing. This is a topic which requires a separate post to get even started on the fundamentals
Don't try to include this in the standard test suite
Instead, follow secure-by-design coding practices and design principles and defense-in-depth or security-in-layer architectural principles
For most enterprises it's not feasible to build in-house penetration testing capability, so if you work with important customer data or other data assets that need safeguarding, consult firms specialized in this

Performance testing

This should be done in-house and potentially by the same team but it could be a more centralized capability as well
Use some of the same API level or UI level test suites but tooling might be different
Aim to test some kind of peak load (which depends on usage scenario), scaling performance, etc.
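
To make the idea of a peak load test a bit more concrete, here is a small sketch using only the Python standard library. `call_service` is a hypothetical stand-in that just sleeps; in practice it would be one request from an existing API level test, and dedicated load testing tooling would normally replace this hand-rolled loop. The request count and concurrency are arbitrary illustration values.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def call_service():
    """Hypothetical stand-in for one request to the system under test."""
    time.sleep(0.01)


def timed_call(_):
    """Run one request and return its latency in seconds."""
    start = time.perf_counter()
    call_service()
    return time.perf_counter() - start


def run_load(total_requests, concurrency):
    """Fire total_requests calls with a fixed number of parallel workers."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, range(total_requests)))
    # 99 cut points; index 94 is the 95th percentile. The inclusive method
    # interpolates within the observed data and never exceeds the maximum.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "requests": len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": cuts[94],
        "max_s": max(latencies),
    }


if __name__ == "__main__":
    print(run_load(total_requests=200, concurrency=20))
```

Running the same function at increasing concurrency levels gives a first, rough picture of how latency scales with load.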

Failover testing

This is also challenging to include in the functional tests but it potentially could be
An important aspect, especially for highly available services, is ensuring that failover happens smoothly
Less important in a modern Kubernetes context - there the more relevant testing is the next category
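
One way failover could be brought into functional tests is to simulate the outage of a primary endpoint and check that a secondary takes over. The sketch below is only that, a simulation: `FailoverClient`, `EndpointDown` and the endpoint functions are hypothetical, and a real failover test would take down an actual instance in a test environment.

```python
import unittest


class EndpointDown(Exception):
    """Hypothetical error signalling that an endpoint cannot be reached."""


class FailoverClient:
    """Hypothetical client that tries endpoints in priority order."""

    def __init__(self, endpoints):
        self._endpoints = list(endpoints)

    def fetch(self, key):
        last_error = None
        for endpoint in self._endpoints:
            try:
                return endpoint(key)
            except EndpointDown as exc:
                last_error = exc  # remember the failure, try the next one
        raise EndpointDown("all endpoints failed") from last_error


def down(key):
    """Simulated endpoint that is out of service."""
    raise EndpointDown("endpoint is down")


def healthy(key):
    """Simulated endpoint that answers normally."""
    return f"value-for-{key}"


class FailoverTest(unittest.TestCase):
    def test_secondary_takes_over_when_primary_is_down(self):
        client = FailoverClient([down, healthy])
        self.assertEqual(client.fetch("a"), "value-for-a")

    def test_secondary_is_untouched_while_primary_is_healthy(self):
        calls = []

        def recording_secondary(key):
            calls.append(key)
            return healthy(key)

        client = FailoverClient([healthy, recording_secondary])
        self.assertEqual(client.fetch("a"), "value-for-a")
        self.assertEqual(calls, [])

    def test_error_surfaces_when_everything_is_down(self):
        with self.assertRaises(EndpointDown):
            FailoverClient([down, down]).fetch("a")
```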

Chaos engineering / chaos testing

In an advanced context you should definitely start employing chaos engineering which is an extreme form of testing - randomly every now and then introducing various kinds of faults, delays, disruptions and even entire data centers going down - to your production systems!

A less extreme version is doing this in a test environment

Championed by Netflix as a method of ensuring that they're up and running even in the most extreme of circumstances
Has plenty of tooling available nowadays. Basically requires a highly functional Kubernetes-based environment
Wiki page is a good starting point https://en.wikipedia.org/wiki/Chaos_engineering
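
The core idea of injecting random faults and checking that the system copes can be shown with a toy, in-process sketch. This is nowhere near real chaos engineering against production infrastructure, and it does not represent any particular chaos tool; `with_chaos`, `call_with_retry` and `run_experiment` are hypothetical helpers written for this illustration, with a bounded retry standing in for whatever resilience measure is being verified.

```python
import random


class InjectedFault(Exception):
    """Raised by the chaos wrapper instead of calling the real dependency."""


def with_chaos(func, failure_rate, rng):
    """Wrap func so that roughly failure_rate of the calls fail."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault("fault injected by chaos wrapper")
        return func(*args, **kwargs)
    return wrapper


def call_with_retry(func, attempts):
    """Resilience measure under test: retry a bounded number of times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except InjectedFault:
            if attempt == attempts:
                raise  # out of attempts, let the fault surface


def run_experiment(failure_rate, calls, attempts, seed):
    """Return how many of the calls succeeded despite injected faults."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    flaky_dependency = with_chaos(lambda: "ok", failure_rate, rng)
    survived = 0
    for _ in range(calls):
        try:
            call_with_retry(flaky_dependency, attempts)
            survived += 1
        except InjectedFault:
            pass
    return survived


if __name__ == "__main__":
    total = 1000
    ok = run_experiment(failure_rate=0.3, calls=total, attempts=3, seed=1)
    print(f"{ok}/{total} calls survived a 30% injected fault rate")
```

The experiment states a hypothesis (almost all calls should survive with three attempts), injects the faults, and measures whether the hypothesis holds.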

Other rare non-service related failure scenarios testing

E.g. disaster recovery, a very important one that is often neglected - sometimes to great detriment


Sunday, 1 December 2019
