Foreword: This article was written when I was working at Davidson. Thanks to Jimmy RUNDSTADLER and Luc DEHAND for their review. On another note, I am a victim of my own limitations and biases. I am part of the Software Craftsmanship movement and an Agile coach. My vision is influenced by these concepts. I do not claim to present the only approach here, nor even the best one. It is simply mine.
If you ask developers what legacy software is, I’m willing to bet you’ll get as many definitions as there are people. Yet, if you put those same people in front of any source code, the question is quickly answered. It’s that confusing, spaghetti code that dozens of generations of developers have worked on.
As you might have noticed, I have already mentioned source code. Software isn’t legacy because of its features or performance. There is highly performant legacy software, and there is non-legacy software that is released in a dismal state (thoughts go out to the Paradox teams during the release of Cities: Skylines 2).
From now on, we will use the term legacy code. Personally, I use Michael Feathers’ definition:
“To me, legacy code is simply code without tests.”
Tests are what give us confidence that the code still behaves as expected. Without them, it is hard to modify the code without introducing regressions.
Taking over legacy code is therefore a real challenge. Yet, it is a critical business issue. Major applications are often legacy. These are the projects that usually hold the most value, and therefore generate the most revenue. It’s impossible to get rid of them, impossible to unplug them, impossible not to maintain them, or to resist adding new features…
So, what should we do? It’s time to change the narrative. Taking over legacy code is not a punishment. It’s an opportunity.
The Urgency of Doing Nothing
When arriving on a legacy project, let’s start by not scattering our focus. It can be tempting to rewrite everything, but a large-scale rewrite is a costly and risky operation. It is better to observe first and focus on your scope.
First, what is the perimeter?
- Do we need to overhaul the entire software’s functionality?
- Do we need to fix regressions within a limited scope?
- Do we need to add a new feature for a restricted number of users?
The first step is to restrict your effort by defining the perimeter.
Then, we move on to code analysis:
- Does it compile?
- Are there any tests? (unit, integration, functional…)
From there, you should have an idea of what tests need to be added to deploy a safety net.
I No Longer Doubt, Because I Test
Before anything else, you must stay calm. The magnitude of the task can sometimes be terrifying. Since the code is poorly understood, any mistake can have disproportionate consequences. However, it is possible to protect yourself. For this, nothing beats covering the code with tests.
The authors of Software Craft describe two possible strategies:
- Understanding the business logic as the foundation for the tests.
- Capturing the code’s behavior in a Golden Master around which to build the tests.
I align more with the second strategy. There are cases where business knowledge is too scattered or even lost. In such cases, I think of what Arnaud Lemaire says in his talk:
“What matters is not what the code is supposed to do, but what it actually does.”
The Golden Master Technique
A Golden Master is a “recording” of all the software’s behaviors within the scope of the intervention. For example, if I am working on adding a payment line in accounting software, the scenario might be:
- Input of payment parameters (date, time, location, person, amount…);
- Verification of input parameters;
- Saving to the payment table;
- Reading the new entries in the table;
- Displaying a confirmation message for the added payment;
- Displaying all payments.
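As a rough illustration, the recording idea can be sketched in Python. Everything here is hypothetical: `legacy_add_payment` is a stand-in for the real legacy code, and `SCENARIOS`, `capture_behavior` and `check_against_golden_master` are names invented for this example. The first run records the observed behavior in a file, and later runs compare against that recording.

```python
import json
import tempfile
from pathlib import Path


def legacy_add_payment(table, date, person, amount):
    """Hypothetical stand-in for the legacy code we do not fully understand."""
    if amount <= 0:
        return "ERROR: invalid amount"
    table.append({"date": date, "person": person, "amount": amount})
    return f"Payment of {amount} added for {person} ({len(table)} total)"


# Fixed inputs (assumed values) that exercise the scenario end to end.
SCENARIOS = [
    ("2024-01-15", "Alice", 120.0),
    ("2024-01-16", "Bob", 0),
    ("2024-01-17", "Carol", 42.5),
]


def capture_behavior():
    """Run every scenario and collect what the code actually does."""
    table = []
    messages = [legacy_add_payment(table, *scenario) for scenario in SCENARIOS]
    return {"messages": messages, "table": table}


def check_against_golden_master(path):
    """Record the behavior on the first run, compare against it afterwards."""
    current = capture_behavior()
    if not path.exists():
        path.write_text(json.dumps(current, indent=2, sort_keys=True))
        return True
    recorded = json.loads(path.read_text())
    return recorded == current


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        master = Path(tmp) / "payments.golden.json"
        check_against_golden_master(master)         # first run: records
        print(check_against_golden_master(master))  # later runs: compare
```

In a real project the recording would live next to the tests and be versioned with the code, so that any refactoring that changes the output is caught immediately.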
We then break down each step and cover it with automated unit tests that can be run on demand. When making code testable, you will run into dependencies such as the database. These must be isolated with test doubles (mocks, stubs).
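Here is a minimal sketch of isolating a dependency, using `unittest.mock` from the Python standard library. The `PaymentService` class and its `repository` collaborator (with `save` and `find_all` methods) are assumptions made up for this example. The point is only that the dependency is injected, so a test can replace it with a double.

```python
from unittest.mock import Mock


class PaymentService:
    """Hypothetical service: the repository is injected so tests can swap it."""

    def __init__(self, repository):
        self.repository = repository

    def add_payment(self, person, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.repository.save({"person": person, "amount": amount})
        return len(self.repository.find_all())


def test_add_payment_saves_and_counts():
    # The mock replaces the real payment table: no database is needed.
    repository = Mock()
    repository.find_all.return_value = [{"person": "Alice", "amount": 120.0}]
    service = PaymentService(repository)

    count = service.add_payment("Alice", 120.0)

    assert count == 1
    repository.save.assert_called_once_with({"person": "Alice", "amount": 120.0})


def test_invalid_amount_is_never_saved():
    repository = Mock()
    service = PaymentService(repository)
    try:
        service.add_payment("Bob", 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    repository.save.assert_not_called()


if __name__ == "__main__":
    test_add_payment_saves_and_counts()
    test_invalid_amount_is_never_saved()
```

Legacy code rarely offers such an injection point from the start. Creating one is often the first small change to make under the protection of the Golden Master.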
Tip: Do not try to reach 100% coverage. It’s an illusion considering the total cost. Aim for 80% overall coverage, with an emphasis on all critical functions.
Modernizing the Code
Deploying a safety net is only the beginning. Keep in mind that this does not immediately make refactoring easier, but it allows you to work safely.
Thus, we move on to refactoring. From experience, I believe it is necessary to:
- Tune the instruments: I would advise starting from Clean Code and adapting as needed. Rules are made to evolve to gain everyone’s buy-in. However, if you have to read multi-page documentation to understand what a certain variable does, it’s time to adopt explicit naming conventions. The code must be understandable without requiring unnecessary cognitive effort.
- Reduce complexity: Avoid deep nesting, nested loops, and too many public methods. Here, I apply the SOLID principles.
- Perfect is the enemy of good: Trying to optimize every piece of code is expensive and doesn’t always yield a significant gain. Are you looking for a global or local optimum? Are you improving code comprehension?
- Undo to redo: It can be useful to temporarily degrade a piece of code to highlight hidden problems (e.g., renaming variables to reveal duplicated code).
- Confront ideas: Work in groups (pair-programming, mob-programming). There is strength in numbers.
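To make the "reduce complexity" point concrete, here is a small sketch of flattening deep nesting with guard clauses. The shipping-fee rule, the threshold of 50 and both function names are invented for the example. The old version is kept side by side so the two can be compared on the same inputs, in the spirit of the safety net described earlier.

```python
def shipping_fee_nested(order):
    """Hypothetical legacy version: three levels of nesting."""
    if order is not None:
        if order["items"]:
            if order["total"] < 50:
                return 5.0
            else:
                return 0.0
        else:
            return 0.0
    else:
        return 0.0


def shipping_fee(order):
    """Refactored version: each special case exits early."""
    if order is None:
        return 0.0
    if not order["items"]:
        return 0.0
    if order["total"] >= 50:
        return 0.0
    return 5.0


# Assumed sample inputs covering every branch of the legacy version.
CASES = [
    None,
    {"items": [], "total": 10},
    {"items": ["book"], "total": 10},
    {"items": ["book"], "total": 50},
    {"items": ["book", "pen"], "total": 80},
]

if __name__ == "__main__":
    for case in CASES:
        # The refactoring must not change the observed behavior.
        assert shipping_fee(case) == shipping_fee_nested(case)
```

Once the comparison passes on every branch, the nested version can be deleted.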
When Does It Stop?
Once started, it’s tempting to keep going and imagine rewriting everything. However, let’s keep in mind where we started and what our perimeter is.
It is futile to seek perfect code. The notion itself is hollow. Instead, let’s keep a balance between cost and gain.
Your minimal goals should be:
- Readability: We understand what each variable, function, and class does. The flow can be followed easily.
- Maintainability: A modification does not cause unforeseen side effects.
- Comprehension: Tests cover the code and document how it works.
To illustrate this notion, I will borrow a quote from Kent Beck:
“For each desired change, make the change easy (warning: this may be hard), then make the easy change.”
The effort put into legacy code can be infinite, so you must frame your scope and not seek perfection. Congratulations, you have made life better for the next generations of developers.
References and Inspirations
- Michael FEATHERS: Working Effectively With Legacy Code, Addison-Wesley, 2004
- Martin FOWLER: Refactoring, Improving the Design of Existing Code, Addison-Wesley, 2018
- Cyrille MARTRAIRE et al.: Software Craft, TDD, Clean Code et autres pratiques essentielles, Dunod, 2022
- Arnaud LEMAIRE: Le projet Legacy, quelles stratégies pour s’en sortir ?, YouTube Conference, 2021
- Kent BECK: Twitter/X Status, 2012
- The Refactoring Guru website
- The Software Craftsmanship Manifesto