META: This is also published on Medium.
“Put all your eggs in the one basket and –WATCH THAT BASKET.”
Mark Twain
Conventional wisdom in software engineering prescribes Unit Testing as a principal testing methodology.
Compared to unit tests, a test harness is a [usually] simple standalone application that incorporates the system under test, and presents a user interface. Since it is for testing, it may expose internal metrics and allow the user to access parts of the code that are usually hidden.
I prefer using test harnesses. Here’s why.
This Is Not a Diatribe Against Unit Tests
Unit tests are great. I use them frequently. They are a terrific way to ensure that certain stable criteria are met and do not change when the code underneath them changes. They are an almost required tool for refactoring.
Now that THAT’S Out of the Way…
Unit Tests Are Concrete Galoshes
I write about that here.
Simply put, unit tests, especially when used in TDD, add a great deal of “rigidity” to a project. They are “anchors,” meant to keep it solidly in place, and to prevent it from changing in unexpected ways, or generating atavisms (re-animated “zombie bugs”). That’s their job, and they do it well.
The way I write software tends to be “ultra-flexible.” I call it “paving the bare spots.” I usually defer a lot of the “pre-project” structure that is so common in software engineering.
I write about that here, here, and here.
I tend to “discover” my design as I progress through a project. There’s no way that I can map out an entire project design beforehand, as is necessary with TDD.
Unit Tests Work Best for Predictable Issues
By definition, unit tests test predictable bugs, or bugs that have already been encountered. They are often reactive, as opposed to proactive.
A good planner can foresee many types of issues, and write unit tests in advance, to validate the code that addresses them, but, by definition, this is “low-hanging fruit.”
No matter how good I am, I WILL miss bugs, and sometimes really bad ones. I have been constantly surprised by the bugs that crop up in projects that I would have SWORN were “perfect.”
Libraries Are Easy to Test
“Engine” code (like math or data processing libraries) is, by definition, predictable and scope-constrained. There’s almost always a strict API, and a well-defined functional specification. That is pretty much the ideal climate for effective unit tests. There are only so many ways the internal code can be exercised, and, with good planning, I could, conceivably, write tests that cover the entire spectrum of the library.
But User Interface Code is Not So Easy
End-user code, especially with a rich, high-usability, localized user interface, is an entirely different matter.
A good UI doesn’t have many “fences.” Users can go pretty much anywhere they want. I like to encourage this, but it does make maintaining quality and consistency a challenge.
There are definitely ways to test UI automatically. Apple has a great system. Most of these systems run scripts, though, so it’s difficult to do true, random testing of UI in an automated fashion.
The Same Goes for Device Driver Code
I have had bad luck in designing unit tests for device interface code. They can be good for things like parsers, but can be worse than useless when it comes to dealing with data provided from an unpredictable source.
I wrote about an example in my experience here. That was a harsh lesson in the risks of getting too caught up in orthodoxy.
My Unit Tests Tend to be After the Fact
As I tend to take an “evolutionary” approach to development, I will sometimes not have a stable structure until near the end of a project, and can’t write predictable unit tests until then.
Nevertheless, unit tests are almost perfect for maintaining quality over the long run, so it’s not a bad idea to add unit tests where we can, and maybe add those tests to some sort of automated CI/CD system.
Enough About Unit Tests. On to Test Harnesses.
I work in a layered manner, with an expectation that things will be changing as I go. I start off by assuming that I can’t write a useful design specification, so I don’t even try. That makes an approach like TDD quite difficult for me.
My Tests and Code “Grow Up Together.”
When I begin work on a new section of code, I generally start off by writing an empty (not “failing,” but EMPTY) test. This test usually expects an integrated (not partial) set of code. In other words, if I’m writing a driver with a parser, I don’t write a test that circumvents the input handlers and injects test data into the parser. I write a test that simply examines the system output, and expects that the input will come through “the proper channels.” In some cases, I may have the system emit some debug data, but the tests always work on the integrated, working system, not a mocked or partial one.
Then, as I write code, I fill out the test to immediately start testing that code; often stepping through the code before I’ve even finished writing it. I frequently change course while writing code, and have to adapt the test to my new direction.
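To make the “proper channels” idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the `Driver` class, its line-based `key=value` format, and the `run_harness` function are assumptions, not code from any real project. The point is only that the harness feeds raw input through the front door and looks at the output (plus optional debug data), rather than injecting data into the parser.

```python
from dataclasses import dataclass, field


@dataclass
class Driver:
    """Hypothetical integrated system: an input handler feeding a parser."""

    debug: bool = False
    debug_log: list = field(default_factory=list)
    _buffer: bytes = b""

    def receive(self, chunk: bytes) -> list:
        """The 'proper channel': raw bytes arrive in arbitrary chunks."""
        self._buffer += chunk
        # Everything before the last newline is complete; the rest waits.
        *complete, self._buffer = self._buffer.split(b"\n")
        if self.debug:
            self.debug_log.append(
                f"received {len(chunk)} bytes, {len(complete)} complete lines"
            )
        return [self._parse(line) for line in complete if line]

    def _parse(self, line: bytes) -> dict:
        """Parse one 'key=value' line (an assumed format for this sketch)."""
        key, _, value = line.decode("ascii").partition("=")
        return {key.strip(): value.strip()}


def run_harness(chunks):
    """Feed input through the front door; examine output and debug data."""
    driver = Driver(debug=True)
    output = []
    for chunk in chunks:
        output.extend(driver.receive(chunk))
    return output, driver.debug_log


if __name__ == "__main__":
    # Records deliberately split across chunk boundaries.
    records, log = run_harness([b"temp=2", b"1\nmode", b" = auto\n"])
    print(records)
    for entry in log:
        print(entry)
```

Because the input is split at awkward places, the input handler and the parser are exercised together, which a parser-only test would skip.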
But That Can’t be Automated!
Yup.
Note that in the banner image, the dirt bike has a rider. That was not accidental.
Test Harnesses Are Reusable and Flexible.
A nice side effect of this kind of test harness code is that it can stay on as an integral part of a debugging and analysis suite. It isn’t restricted to a small domain. It can be repurposed or modified to fit new, unanticipated workflows or issues. Which brings me to…
Test Harnesses Are Great Debugging Aids.
I have found that a test harness gives me a great context for debugging code as I develop it. I can use it to set up conditions that stress the system, or “bring the beast out” of it.
For example, when testing a parser, I may ask the system to establish a large cache, and that’s not something the parser will do, but the post-parser “scrubber” will. The large cache may exacerbate the stress on the parser, so it could be a good tool for testing “edge” cases.
Another nice thing shows up when I get a bug report. Unit tests are almost worthless there. However, I can modify my test harness to reproduce the conditions in the report, and bingo. I have a controlled environment to analyze the bug.
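One way to picture this is a harness with a few “knobs” that can be turned to stress the system or to match a bug report. The sketch below is an assumption-laden illustration in Python: `HarnessConfig`, `Scrubber`, and `run` are made-up names, and the bounded cache merely stands in for whatever the real post-parser stage does.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class HarnessConfig:
    """Hypothetical knobs a harness might expose."""

    cache_entries: int = 16  # size of the post-parser scrubber cache
    chunk_size: int = 64     # bytes delivered to the system per read


class Scrubber:
    """Stand-in post-parser stage that keeps a bounded cache of records."""

    def __init__(self, cache_entries: int):
        self.cache = deque(maxlen=cache_entries)

    def accept(self, record: str) -> None:
        self.cache.append(record)


def run(config: HarnessConfig, payload: bytes) -> dict:
    """Push payload through the whole pipeline under the given conditions."""
    scrubber = Scrubber(config.cache_entries)
    pending = b""
    parsed = 0
    for start in range(0, len(payload), config.chunk_size):
        pending += payload[start:start + config.chunk_size]
        # Complete lines go on to the scrubber; a partial line waits.
        *lines, pending = pending.split(b"\n")
        for line in lines:
            scrubber.accept(line.decode("ascii"))
            parsed += 1
    return {
        "parsed": parsed,
        "cached": len(scrubber.cache),
        "leftover": len(pending),
    }


if __name__ == "__main__":
    # "Large cache, tiny reads": conditions the parser alone never sets up.
    stress = HarnessConfig(cache_entries=100_000, chunk_size=3)
    print(run(stress, b"a=1\n" * 50_000))
```

Reproducing a report then becomes a matter of choosing the configuration and payload that match it, and watching the integrated system under those conditions.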
Test Harnesses Start Integration Testing Early.
It’s been my experience, working on large systems, that integration is where all hell tends to break loose. The sooner we start testing that, the better.
I’m a huge proponent of early integration testing. I will try to establish an integrated solution as quickly as possible. If it is layered, each layer will usually be tested as an integrated whole, with the goal of establishing integration testing of the entire system, as soon as is reasonable.
That may mean that significant portions of the system may be “stubbed,” or “pass-through,” and subject to change, but the system structure will be in place as early as possible.
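A small Python sketch of what “stubbed” or “pass-through” layers can look like follows. The layer names and the `build_system` helper are hypothetical, chosen only to show the shape: the full chain exists from day one, and individual layers get real behavior later.

```python
from typing import Callable, List

# A layer takes the data flowing through the system and returns it, changed or not.
Layer = Callable[[str], str]


def pass_through(data: str) -> str:
    """Stub layer: holds its place in the structure, does nothing yet."""
    return data


def normalize(data: str) -> str:
    """The one layer with real behavior so far (an assumed example)."""
    return data.strip().lower()


def build_system(layers: List[Layer]) -> Layer:
    """Chain the layers so the whole system can be exercised end to end."""

    def system(data: str) -> str:
        for layer in layers:
            data = layer(data)
        return data

    return system


# Early in the project: the structure is complete, most of it is stubbed.
early_system = build_system([normalize, pass_through, pass_through])

if __name__ == "__main__":
    print(early_system("  HELLO, Harness  "))
```

Swapping a stub for a real layer later does not change how the harness drives the system, so integration testing can start before most of the code exists.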
Test Harnesses Are Excellent Sample Code.
One strong argument for unit tests is that they are useful as sample code for implementors of a system.
My experience is that only a couple of unit tests may be suitable for this, and their structure is often not particularly well-suited to the way that a user might actually implement a system.
Test harnesses (at least, the ones I write) are much better for sample code. I tend to write test harnesses as fully-qualified applications, complete with error management and localization. They are usually good enough to be used as the “seed” for a user implementation.
Let’s Do A Quick Comparison.
The first thing that we might think, when looking at the above image, is that unit tests test a lot more of the system than test harnesses.
That is –mostly– correct. Well-done unit tests can give tremendous code coverage. There’s a reason that they are so ubiquitous.
If I use a test harness, then the onus is on me to use it carefully, and make sure that I exercise the codebase. This is quite possible, with a well-designed harness, a well-designed system, and a disciplined approach.
My test harnesses tend to be gigantic. The vast majority of code in my projects is often testing code. Here’s an example. Almost all of that code is there to test a single, 300-line file. Take note that there are a significant number of unit tests involved, as well as multiple test harnesses.
Note the Device and Operating System Are Tested by the Harness.
This is an important qualifier. If we are designing widgets, and the “Device” layer is one of our widgets, then the test harness also becomes a widget test, which is very useful.
It is possible to write automated unit tests that work with devices (maybe controlling them through a communications interface). I was writing tests that did exactly that, in the 1980s.
However, it’s a big job, and VERY hard-set concrete.
Even though we need to rely on the operating system to be of extremely high quality, having it in the test loop can make a big difference. For example, a system call that we use may invoke our closure on a thread other than the main one, triggering a bug. A unit test that replaces that call may never do this.
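The threading point can be illustrated with a short Python sketch. Here `fetch_async` is a hypothetical stand-in for a system call that completes on a worker thread; it is not a real OS or library API. The harness simply records which thread the callback ran on.

```python
import threading


def fetch_async(callback):
    """Hypothetical stand-in for a system call that completes on a worker thread."""
    worker = threading.Thread(target=callback, args=("payload",))
    worker.start()
    return worker


def observe_callback_thread() -> dict:
    """Run the call for real and record where the callback executed."""
    seen = {}

    def on_complete(data):
        seen["data"] = data
        seen["on_main"] = threading.current_thread() is threading.main_thread()

    # Wait for the worker so the result is complete before we read it.
    fetch_async(on_complete).join()
    return seen


if __name__ == "__main__":
    print(observe_callback_thread())
```

A unit test that swaps `fetch_async` for a synchronous fake would run the callback on the caller’s thread, and any code that quietly assumes the main thread would pass there and fail in the field.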
It’s VERY Important to be Disciplined.
A lot of modern project management is built around treating engineers as “interchangeable cogs in the system.” If the process is perfected, then the quality of the engineering staff can vary greatly, and the results will still be consistent.
This is an extremely valid approach. It actually works pretty well, depending on the process. Unit tests are an important part of this kind of structure.
In my case, however, I need a lot more flexibility, and I already know the engineering staff (Yours Troolie) is good. I don’t have to design a perfected process.
I am incredibly disciplined. I worked for decades at Nikon, and was inculcated into a culture of habitual quality. I’m also a halfway decent engineer, with a fair bit of creativity, so I can pull this off. YMMV.
A Human is An Important Ingredient
There’s a very important (and, in my opinion, correct) theory that we should automate as much as possible in our development and delivery structure, and remove the human element.
But that is not always the best way to deal with stuff, in the way that I work.
First of all, the human involved is me. I’m an engineer with over thirty years of experience. I am an outstanding debugger and designer, and tend to find bugs very, very quickly. Some day, an AI will, I’m sure, be able to outpace me, but we’re not there, yet. Even so, when that day arrives, I’ll lay odds that small shops like mine won’t be able to afford it.
Second, as soon as we set up a CI/CD pipeline, we have established a “concrete galosh.” That is not necessarily a bad thing, but we shouldn’t (in my opinion) rush to get to that point.
In my opinion, there’s just no substitute for experienced, capable, and, above all, adaptive engineers. I think that it’s important for an organization to open its wallet and fork over the dosh for good talent and good training. I don’t think that there are too many cheap engineers who will provide a “magic bullet” for quality engineering.
Test Harnesses Are Not the Only Tool
As stated above, I tend to use a hybrid approach to all of my engineering.
Test harnesses are but one tool in my utility belt, albeit a very important one.