A cornerstone of modern software development is developer testing. That means that developers are the primary authors of automated test code. In theory, that is a good thing and might suggest that the quality assurance department will soon be out of work. In practice, we as a profession have tried for nearly twenty years to establish a culture of developer testing in our work and still end up with software projects that feature no automated tests at all (side note: JUnit 1.0 was released in February of 1998).
What we know about automated tests
One piece of common understanding about developer testing is the test pyramid. Let’s quickly recap what we know about it. There are different kinds of automated tests, and the test pyramid differentiates three of them:
- Acceptance tests or UI tests are the heaviest type of automated test. They operate on the software from the outside, with the means of a real user, and try to assert that real use cases can be accomplished.
- Integration tests often use several parts of the system in a test scenario that asserts the correct collaboration of the parts. Integration tests may take some time to come to a conclusion and utilize real hardware like network or disks.
- Unit tests tend to be small and quick and focus on a particular aspect of a “unit” like a class or entity aggregate. Their reach into the system should be short and might be forcefully restricted by employing mocks.
These three types, the A, I and U of automated tests, should come in different numbers. A good rule of thumb is that for every acceptance test, there might be up to one thousand unit tests. If you draw the quantities as areas, they take the form of a pyramid. A small top of acceptance tests rests on a broader layer of integration tests, which in turn relies on a groundwork of many unit tests. A healthy test pyramid looks like this:
Take this picture as an orientation, not as an absolute scale. But be sure to count your different test types from time to time.
Outlining the tests
This is actually one of the first things I do when I get introduced to a new and unknown code base. This happens quite often when I do consulting work for existing development teams. Have a look at the automated tests, determine their type and count their numbers. If it resembles anything close to the test pyramid, you’ve got a chance. If the resulting shape looks different, you might find this blog entry useful:
The Tower
If you have a hard time finding any tests (because there are none) or you find only some half-assed attempts to produce a meaningful automated test suite, you are looking at a tower project. The tower is rather small in diameter; in the case of absent tests, it is nothing more than a thin vertical line (the “stick”). If you find a solid number of tests for every type, you’ve found a “block” project. Block projects usually don’t have a problem, but they do have a history of test effort migration, either from unit to acceptance tests or, more commonly, in the other direction. If you find a block, you are fine.
The tower, though, is a case of neglect. The project team might have started serious efforts to automate their tests, but got demotivated by intrinsic or extrinsic influences and abandoned the tests soon after their creation. Nobody has looked after them since, and the only reason they still pass green is that they didn’t really test anything to begin with or only cover an area of the system that is as finished as it is boring. Topics like user management or utility classes are usually the first and only things that got tests in a tower scenario.
Don’t get me wrong, the tower indicates the absence of tests, but not the absence of willingness to write automated tests, unless the tower is really a stick. A team willing to invest in automated tests may only lack knowledge and coaching about the topic. Be sure to lead them bottom-up (unit tests first), though.
The Egg
If you’ve categorized and counted the tests and couldn’t find many acceptance or unit tests, you’ve found an egg. The egg consists mostly of integration tests that may lean into unit testing territory by asserting the smallest bits of functionality here and there (often embedded in an overarching test storyline) or dip their toes into GUI-based testing by asserting presentation-specific properties of widget objects. While they provide ample test coverage for the system, they also tie application logic and presentation details together and don’t help to separate domain code from the use cases.
The project team is probably proud of their test coverage and doesn’t see any value in differentiating the automated test types, because “every test improves the situation”. The blindness to test types is the core problem that may be cured with training and coaching (I’ve found the A-TRIP rules to be particularly effective for telling integration and unit tests apart), but the symptoms, especially the lack of separation of concerns, have to be mitigated soon, too.
One way to start there is to break the tests down into their integration and their unit test parts. You can work from assertion to assertion and ask: is this necessary to ensure the current use case? If not, extract a new unit test focussed on only this one assertion.
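To make the extraction step more concrete, here is a small sketch using Python’s `unittest`. Everything in it is hypothetical and invented for illustration: the `gross_price` function, the `InMemoryOrderStore` (standing in for a real, slower store) and the order use case. The point is only the move itself: the storyline test keeps the assertion the use case needs, and the rounding detail gets its own focused unit test.

```python
import unittest


def gross_price(net_cents, tax_percent):
    """Hypothetical domain function: add tax, rounding half up to whole cents."""
    return (net_cents * (100 + tax_percent) + 50) // 100


class InMemoryOrderStore:
    """Stand-in for a real database-backed store (an assumption of this sketch)."""

    def __init__(self):
        self._orders = {}

    def save(self, order_id, total_cents):
        self._orders[order_id] = total_cents

    def load(self, order_id):
        return self._orders[order_id]


def place_order(store, order_id, net_prices, tax_percent):
    """Hypothetical use case: price all items and persist the order total."""
    total = sum(gross_price(price, tax_percent) for price in net_prices)
    store.save(order_id, total)
    return total


class PlaceOrderStoryTest(unittest.TestCase):
    """Integration-style storyline: keeps only what the use case needs."""

    def test_placed_order_can_be_loaded_again(self):
        store = InMemoryOrderStore()
        place_order(store, "A-1", [1000, 250], tax_percent=19)
        self.assertEqual(store.load("A-1"), 1488)
        # The assertion about rounding a single price used to live here.
        # It is not necessary for this use case, so it moved out.


class GrossPriceTest(unittest.TestCase):
    """Extracted unit test, focused on exactly one assertion."""

    def test_half_a_cent_is_rounded_up(self):
        # 250 * 1.19 = 297.5 cents, which should round up to 298.
        self.assertEqual(gross_price(250, 19), 298)
```

Both test classes can be run with `python -m unittest`. The unit test no longer needs the store at all, which is exactly the separation the egg is missing.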
As soon as you add a pedestal consisting of unit tests to your egg, you are well on your way to a healthy test pyramid.
The Ice Cream Cone
This is the most fearsome automated test outline in existence, even more dramatic than the stick. Usually, the project team is really enthusiastic about writing tests or at least follows orders to do so, but they cannot test parts of the application in isolation. A really tragic case was a complex system that was so entangled with its database, through countless stored procedures that contributed to the application logic, that it was hopeless to think about tests without the database. And because every automated test had to start the whole system including the database, there was really no need to differentiate between application logic and presentation logic. It all became a Gordian knot of dependencies that enforced the habit of writing elaborate automated GUI-based tests to test the smallest logic bits deep inside the core. It felt like eating single rice grains with overly long, flimsy wooden chopsticks that would break often.
The ice cream cone is problematic because the project team needs to realize that their effort was misled and the tests are all telling the bitter truth: the system’s architecture isn’t fit for proper automated tests. It’s not the tests, it’s you (or your architecture)! Nobody wants to hear that and, even more so, nobody wants to untangle the mess (without the help of a proper safety net consisting of automated tests). Pinning tests are probably helpful in this scenario.
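A pinning test (also known as a characterization test) does not judge whether the code is right. It records what the code does today and fails as soon as that changes. The following is a minimal sketch of the idea, not a recommendation for a specific tool. The `legacy_shipping_fee` function, the sample inputs and the JSON pin file are all hypothetical placeholders for whatever tangled logic you need to protect.

```python
import json
from pathlib import Path


def legacy_shipping_fee(weight_kg, express):
    """Hypothetical stand-in for legacy logic nobody dares to touch yet."""
    fee = 4.9 if weight_kg <= 2 else 4.9 + (weight_kg - 2) * 1.5
    return round(fee * (2 if express else 1), 2)


# Inputs chosen to cover the branches we know about (an assumption).
SAMPLE_INPUTS = [(0.5, False), (2, True), (10, False), (10, True)]


def snapshot(func, inputs):
    """Call func with every input and collect arguments and results."""
    return [{"args": list(args), "result": func(*args)} for args in inputs]


def check_pinned(func, inputs, pin_file):
    """Record the behaviour on the first run; afterwards report any deviation.

    Returns True if the current behaviour equals the pinned behaviour
    (or if the pin file was just created), False otherwise.
    """
    pin_file = Path(pin_file)
    current = snapshot(func, inputs)
    if not pin_file.exists():
        pin_file.write_text(json.dumps(current, indent=2), encoding="utf-8")
        return True
    pinned = json.loads(pin_file.read_text(encoding="utf-8"))
    return pinned == current
```

The first call writes the pin file, which you would commit alongside the code. Every later call compares against it, so a refactoring that accidentally changes a fee makes `check_pinned` return `False`. Once proper unit tests exist for an area, its pins can be retired.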
But you need to turn the test pyramid around, or the project team will suffocate under the overly costly test tax while technical debt keeps increasing.
Please keep in mind that it’s not a problem in itself that your project doesn’t have a normal test pyramid. It’s great that you have automated tests at all! But your current test type distribution might not be as effective as possible, might be more expensive than necessary and might not be the right automated test setup for your development goals.
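If you like, the outlines from this post can be condensed into a rough classifier that takes the three test counts and names the shape. Treat it as a sketch only: the function name, the `few` threshold and the exact comparison rules are assumptions made for illustration, not numbers taken from any real project, and every count that fits no other rule is simply reported as a block.

```python
def diagnose_shape(acceptance, integration, unit, few=10):
    """Name the outline of a test suite from its three test counts.

    The threshold "few" and the ordering rules are illustrative assumptions.
    """
    total = acceptance + integration + unit
    if total == 0:
        return "stick"  # no tests at all: a tower without any diameter
    if total < few:
        return "tower"  # a handful of neglected tests
    if integration > acceptance and integration > unit:
        return "egg"  # integration tests dominate
    if acceptance >= integration >= unit and acceptance > unit:
        return "ice cream cone"  # the pyramid turned upside down
    if unit >= integration >= acceptance and unit > acceptance:
        return "pyramid"  # the healthy outline
    return "block"  # solid numbers everywhere, no clear slope
```

For example, `diagnose_shape(1, 20, 1000)` reports a pyramid, while `diagnose_shape(300, 50, 5)` reports an ice cream cone. The result is a conversation starter, not a verdict.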
What are your stories with automated test setups? Care to share them with us in the comments?
6 thoughts on “Look at the automated tests to diagnose the project ailments”
Always fighting with the ice cream cone.
In a project driven company there is nearly no chance to change the architecture. And if we try to make changes, it takes forever to improve something because there are only one or two developers who are allowed to change it.
Thank you for your comment. You mentioned that in a “project driven company”, it’s not possible to change the architecture. May I ask what you mean? Because we are a project-driven company, too, and it’s easy to change the architecture for the better for the next project. Or do you mean changing the architecture of previous projects? Those are mostly inactive, so why change anything? Or is it the dreaded half-zombie project? Funded just enough to keep going (at a slow pace), but without enough resources to be considered worthy of an overhaul? That’s a sales fail that resurfaces in development, IMO. If you need to make changes to a project, request enough budget to make it right. A complete rewrite of the project a few years down the line will be more expensive, require more time and probably have no smooth transition phase.
Anyway, I’m interested in your company’s policy towards changes in projects. That “being allowed” part will be my next question.
Looking forward to hearing from you again.
I see your point. It really depends on the company and product portfolio.
In my company (~500 employees) we have one massive software product consisting of thousands of features. Usually not all of those features are given to each of our customers, as for most features they have to buy a separate software license.
The result is hundreds of customers, each paying several hundred thousand € to receive software with a unique feature combination.
As they want to keep the software running for at least 10 years, we sell a service contract to keep the system running and to be available for change/feature requests.
This means for us that it is very rare that a project is really “finished”. Our software will never be finished as long as we don’t start again from scratch (which is really not planned).
This is good to earn some money now, but as all development departments are busy fulfilling CRs there is no check and improvement for the whole architecture.
Even the so-called “architects” do not get the time to analyze and improve the current architectural failures of the software. They are all stuck in project work (customer, docs, development, …)
To get back to your point:
This is not a general problem in project driven companies, but it is one in our specific case.
It is mainly caused by the missing company structure for development (no test team, no dedicated architects, no system assembly team) and the highly active customers who want unique software for their plentiful money.
My personal view on that company strategy:
For the software this is really bad, as each developer can just do what he wants (nearly no code reviews, coding guidelines, etc.). In fact there are no “real developers” anymore. Everyone is a dev-consultant creating system design documents, filling out cross-reference matrices, fixing bugs in the code, doing 3rd level support, testing, installing software at the customer site and compiling new features together with the customer.
I like the variety but in my opinion it is not very efficient to do it that way.
Thank you again for your reply. I understand you much better now. Let me rephrase your situation in my own words: Your product is a framework that got old and bloated. Every customer seems to buy some licenses, but in reality buys an underfunded project on top of the framework. Unlike 3rd-party frameworks, all the project work is merged into the framework and left there to rot (or more specifically: to not rot! It must not break in the next 10 years! Be careful with your next underfunded project to not knock it over!). Because you tackle each project on its own, you always play on the tactical level and never on the strategic level (with things like architecture). So you get all the problems of short-term projects and the problems of long-term frameworks in combination. I can see why this might be challenging and inefficient after some time. And I understand why it provides “variety” for the developers.
Good luck to you! You’ve identified the problem space very clearly. I’m sure you can think of some solutions or improvements. If you want to discuss the matter further, I’m interested in your thoughts.
Thanks for the article, looking at the test architecture and inferring ailments is very interesting and helpful. Of course I immediately applied it to my own company: We are often developing hooks for LLVM libraries (e.g. AST-Visitors), which we test through LLVM and with (IR-)code snippets as test input.
So would you consider these to be unit tests, or integration tests and correspondingly an egg as test structure? These tests are A-TRIP, but since they are relatively slow and many units and calls are involved, I consider them integration tests. How do you use A-TRIP to distinguish unit tests from acceptance tests?
We did develop our own fluent IR builder just for testing, but since the LLVM IR is quite complex and (IR-)code snippets are simpler to produce and read, we fell back to not using the IR builder. What is your opinion on the following hypothesis: The test structure also depends on the domain? For instance for compiler construction, focusing on business value (production code and thorough tests) while accepting an egg test structure might be a better approach than producing a complex test library and less readable tests for a clean test pyramid.
Good to hear from you.
First: The A-TRIP rules primarily apply to unit tests, but cannot distinguish between test types. They might differentiate between better and worse unit tests, but say little about unit vs. other tests.
I also think that you have a lot of integration tests and that is what suits your needs, because your application has a strong coupling with another product (LLVM in your case).
But I’m not sure about your hypothesis. It might be true, but I do not have enough experience in the field of compilers to answer your question. I still think that small, uncoupled tests are the best way to get fast feedback, and immediate feedback is what makes automated tests powerful. Whether this goal is achievable in your domain is beyond my knowledge.
I agree that a pragmatic approach (accepting the egg) is better than abandoning automated testing altogether (because it’s “not possible”) or writing tests that are “write-only” and hard to change. But it seems that each of your tests depends on a whole bunch of prerequisites and unchecked assumptions that might weave a web of coupling that’s hard to untangle if necessary in the future. I would consider that an investment risk (and take it only if there is no other choice and the reward is high).