Automatic Test Code Generation

By gmkayaker

Recently, I was asked by a visitor to my website http://xunitpatterns.com:

What are your thoughts on using automatic test code generators for new projects? My team is just starting with TDD. We are currently debating the value of automatically generated tests by JTest (which we would enhance) versus home grown tests using EasyMock & Eclipse.

I’ve seen tools like JTest demonstrated at various Agile conferences, and they can certainly provide value in helping get a mass of legacy software under control. (Note that many people in the agile community consider any software whose maintainers are not protected by automated unit tests to be “legacy software”.) But this question specifically relates to the use of test code generators for new projects doing Test-Driven Development (TDD). This got me thinking about my position on the whole topic of test generation.

But first, for those of you who are not familiar with these test generation tools, here’s a quick, over-simplified summary of their value proposition: you fire up the tool, point it at a chunk of existing code such as a JAR or a DLL, and it goes to work. It examines the public interfaces of the classes, does static code analysis on the structure of each method, and generates tests that exercise the code. You can learn more by going to the vendors’ web sites (e.g. http://www.parasoft.com/jsp/products/home.jsp?product=Jtest).

So let’s examine these two cases separately:

  1. Legacy code base needing some enhancements
  2. Building new code using TDD

I can definitely see some potential benefit for teams that have a legacy code base and are afraid to make changes to it. Generating a suite of regression tests for the code as it currently exists will allow them to do experiments such as “What happens if I change the return value of this method? How far and wide does that impact the code base?” The regression tests help “lock down” the functionality already in the code. But how will that help us actually refactor the code? If we go ahead and change the existing code, we should expect some existing tests to break, and we’ll have to decide what to do with them. We could just regenerate the test code after we are done, but then we really haven’t taken much advantage of the tests while we were refactoring. At a minimum, we should be looking at the tests broken in each step of the refactoring and/or enhancement to see if there is anything that should alarm us. How easy this will be depends very much on how easy it is to understand the tests, and that depends on how “intent revealing” the generated test code is.
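To make the “lock down” idea and the readability concern concrete, here is a minimal sketch in plain Java. Everything in it is hypothetical: the `shippingCostCents` method stands in for some piece of legacy code, and the first test is written in a deliberately opaque style purely for contrast. It is not a claim about what JTest or any other generator actually emits. A small `check` helper is used instead of a test framework so the sketch runs on its own.

```java
// Hypothetical legacy code and tests; none of these names come from JTest or any real tool.
class CharacterizationSketch {

    // Stand-in for existing legacy code that nobody dares to change.
    static int shippingCostCents(int weightGrams) {
        if (weightGrams <= 0) {
            return 0;
        }
        if (weightGrams <= 1000) {
            return 500;
        }
        // Each full 500 g above the first kilogram adds 150 cents.
        return 500 + ((weightGrams - 1000) / 500) * 150;
    }

    // Minimal check helper so the sketch runs without a test framework.
    static void check(boolean condition, String testName) {
        if (!condition) {
            throw new AssertionError(testName + " failed");
        }
    }

    // Opaque style: pins an observed value but gives no hint why it matters.
    static void testShippingCostCents1() {
        check(shippingCostCents(1499) == 500, "testShippingCostCents1");
    }

    // Intent-revealing style: the same assertion, named after the rule it protects.
    static void testPartialHalfKiloAboveFirstKiloIsFree() {
        check(shippingCostCents(1499) == 500, "testPartialHalfKiloAboveFirstKiloIsFree");
    }

    public static void main(String[] args) {
        testShippingCostCents1();
        testPartialHalfKiloAboveFirstKiloIsFree();
        System.out.println("Current behavior is locked down.");
    }
}
```

Both tests pin exactly the same behavior. If a later change makes them fail, only the second name tells the maintainer which rule was affected, which is the difference that matters when deciding whether a broken test is alarming.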

Disclaimer: I haven’t looked at any of the generated test code beyond the trivial examples shown in demos, so I’ll leave this decision to you.

One strategy might be to determine whether the broken tests are about the internals or the interface contract (as in Bertrand Meyer’s “Design by Contract”) of the software we are changing. Ideally, we won’t break any of the interface contract tests during our refactoring; if we do, then we’ll certainly have to think about the implications for the clients.
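The distinction between contract tests and internals tests can be sketched as follows. The `Account` class and all three tests are hypothetical, invented only to illustrate the idea; the sketch assumes an account that happens to store its deposits in a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class and tests, invented for illustration only.
class ContractVsInternalsSketch {

    static class Account {
        // Internal representation: clients are never promised this list.
        final List<Integer> ledger = new ArrayList<>();

        void deposit(int cents) {
            // Precondition of the contract: deposits must be positive.
            if (cents <= 0) {
                throw new IllegalArgumentException("deposit must be positive");
            }
            ledger.add(cents);
        }

        int balance() {
            int total = 0;
            for (int entry : ledger) {
                total += entry;
            }
            return total;
        }
    }

    // Minimal check helper so the sketch runs without a test framework.
    static void check(boolean condition, String testName) {
        if (!condition) {
            throw new AssertionError(testName + " failed");
        }
    }

    // Contract test: depends only on behavior that clients rely on.
    static void testDepositsAddUpToBalance() {
        Account account = new Account();
        account.deposit(100);
        account.deposit(50);
        check(account.balance() == 150, "testDepositsAddUpToBalance");
    }

    // Contract test: the precondition is enforced.
    static void testNonPositiveDepositIsRejected() {
        Account account = new Account();
        boolean rejected = false;
        try {
            account.deposit(0);
        } catch (IllegalArgumentException expected) {
            rejected = true;
        }
        check(rejected, "testNonPositiveDepositIsRejected");
    }

    // Internals test: depends on the ledger representation, not on the contract.
    static void testEachDepositAddsLedgerEntry() {
        Account account = new Account();
        account.deposit(100);
        account.deposit(50);
        check(account.ledger.size() == 2, "testEachDepositAddsLedgerEntry");
    }

    public static void main(String[] args) {
        testDepositsAddUpToBalance();
        testNonPositiveDepositIsRejected();
        testEachDepositAddsLedgerEntry();
        System.out.println("All three tests pass against the current implementation.");
    }
}
```

If a refactoring replaced the ledger with a single running total, `testEachDepositAddsLedgerEntry` would break while the two contract tests would keep passing. A failure of the first kind can usually be discarded or regenerated; a failure of the second kind means clients may be affected.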

Now let’s look at the second scenario, building new software using TDD. Test generation software works by examining existing software. How will that work when we haven’t written the software yet? From a development perspective, TDD is about expressing our intentions of what the software should do when it’s done. This definition of “what done looks like” is what allows us to avoid writing a lot of unnecessary software: we write only enough code to pass the tests we have already written. To work properly in TDD, the test generator would have to read my mind to determine my intent for the software I’m about to write. I might be able to use the test generator to generate the templates for the Test Methods once I’ve defined the interface, but even then, will it be able to determine all the test conditions for which it needs to generate Test Methods? I’m skeptical on this point, but I would certainly like to hear from anyone who has tried these tools in conjunction with TDD.
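The gap between Test Method templates and test conditions can be shown with a toy sketch. It assumes a hypothetical `DiscountPolicy` interface with no implementation yet, and uses standard Java reflection to derive one template name per method. This is a deliberately naive stand-in, not a description of how JTest or any real generator works, and the list of intended tests is likewise made up for illustration.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical names throughout; not how JTest or any real tool works internally.
class TestTemplateSketch {

    // The interface is defined, but no implementation has been written yet.
    interface DiscountPolicy {
        int discountPercent(int orderTotalCents);
    }

    // Derives one empty Test Method name per public method of the given type.
    static List<String> testMethodTemplates(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getMethods()) {
            String name = method.getName();
            names.add("test" + Character.toUpperCase(name.charAt(0)) + name.substring(1));
        }
        // Reflection does not guarantee an order, so sort for stable output.
        Collections.sort(names);
        return names;
    }

    // Test conditions that come from the developer's intent, not from the signature.
    static List<String> intendedTestMethods() {
        return Arrays.asList(
            "testOrderBelowOneHundredDollarsGetsNoDiscount",
            "testOrderOfExactlyOneHundredDollarsGetsTenPercent",
            "testNegativeOrderTotalIsRejected");
    }

    public static void main(String[] args) {
        System.out.println("Derivable from the interface: "
            + testMethodTemplates(DiscountPolicy.class));
        System.out.println("Needed to express intent:     "
            + intendedTestMethods());
    }
}
```

The signature yields a single template, while the rules the developer has in mind (a threshold, a boundary case, an invalid input) appear nowhere in the interface. In TDD those rules are written down as tests first, which is exactly the information a tool examining existing code does not have.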

One final comment I’d like to make is that there is no substitute for hard work and discipline. I’m a great fan of trying to work more effectively, and I encourage all the teams I work with to hold regular retrospectives. Retrospectives are just one disciplined way in which teams can reflect on how they develop software and plan experiments that may yield improvements. Trying code generation tools might be just such an experiment. It is important, however, to frame the experiment properly before diving in so that we don’t get into a revolving tool-of-the-month situation that detracts from productivity in the long term as we flit from tool to tool; that certainly wouldn’t help improve the predictability of the team’s output as measured by its velocity.
