Legacy code

Notes from Working Effectively with Legacy Code by Michael Feathers

The four reasons to change software

Adding a feature
Fixing a bug
Improving the design
Optimizing resource usage

The common thread for all of these is preserving existing behavior. Only a small subset of behavior actually changes – for the third and fourth reasons (improving the design and optimizing resource usage), no outward behavior changes at all.

Preserving existing behavior is one of the largest challenges in software development. Even when we are changing primary features, we often have very large areas of behavior that we have to preserve.

Bad risk mitigation practices

The author notes that many developers faced with legacy code try to minimize risk by avoiding necessary refactoring of the system. Code is added to existing methods, and existing classes are extended, in order to touch as little of the existing code as possible. The net effect is to reduce the maintainability of the system.

Simply minimizing the quantity of change required is not a sustainable way to minimize the risk of change.

How do you know if your changes broke useful behavior?

Edit and pray - Analyze thoroughly beforehand and minimize changes. Rely on functional testing: test, test, and retest afterward, including the cross-cutting parts of the system. This gives the impression of being a very cautious technique, but is it effective?
Cover and modify - Give yourself a safety net by adding quality unit tests around the section of code being modified. We will know instantly if we have broken known good behavior. ("code that bites back")
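
As a minimal sketch of cover and modify, the tests below pin down what a function does today, before anyone edits it. The function shipping_cost and its pricing rules are hypothetical, invented only for illustration; the point is that each test records current behavior, quirks included.

```python
import unittest


def shipping_cost(weight_kg, express=False):
    """Hypothetical legacy function that we are about to modify."""
    cost = 5.0 + 1.5 * weight_kg
    if express:
        cost *= 2
    if weight_kg > 20:
        cost += 10  # oversize surcharge, added after the express doubling
    return round(cost, 2)


class ShippingCostCoverTest(unittest.TestCase):
    """Tests that record current behavior, whether or not it looks right."""

    def test_standard_parcel(self):
        self.assertEqual(shipping_cost(2), 8.0)

    def test_express_doubles_the_cost(self):
        self.assertEqual(shipping_cost(2, express=True), 16.0)

    def test_oversize_surcharge_is_not_doubled(self):
        # Pins the existing order of operations: (5 + 45) * 2 + 10
        self.assertEqual(shipping_cost(30, express=True), 110.0)
```

Run with python -m unittest. If a later edit reorders the surcharge and the doubling, the third test fails immediately instead of the change slipping through.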

Software Vise

When we have tests that detect change, it is like having a vise around our code. The behavior of the code is fixed in place. When we make changes, we can know that we are changing only one piece of behavior at a time. In short, we're in control of our work.

True unit tests

Run fast
Help to localize problems.

A unit test that takes 1/10th of a second to run is a slow unit test.

True unit tests do not:

talk to databases
talk over a network
touch the file system
require changes to the environment (e.g. editing a config file) to allow the test to run
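
One way to keep a test off the file system, sketched under the assumption that the logic can accept any iterable of lines rather than a path: the test passes an in-memory io.StringIO instead of opening a file. The function count_error_lines is a hypothetical example, not something from the book.

```python
import io


def count_error_lines(lines):
    """Count lines containing 'ERROR'. Accepts any iterable of strings,
    so production code can pass an open file and tests can pass a buffer."""
    return sum(1 for line in lines if "ERROR" in line)


def test_count_error_lines():
    # In-memory stand-in for a log file: no disk access, no cleanup needed.
    fake_log = io.StringIO("INFO start\nERROR disk full\nERROR timeout\n")
    assert count_error_lines(fake_log) == 2
```

In production the caller would open the real file and pass the file object in; the counting logic itself never needs to know where the lines came from.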

Don't forget the importance of less frequently run higher-level tests for the system as a whole:

Unit tests are great, but there is a place for higher-level tests, tests that cover scenarios and interactions in an application. Higher-level tests can be used to pin down behavior for a set of classes at a time. When you are able to do that, often you can write tests for the individual classes more easily.

Developing tests for legacy code

Several problems can arise when trying to put legacy code into a test harness. We are trying to create tests that run fast and don't have side effects, but:

Dependencies on external systems, data models, etc. are baked into the code we need to modify - the logic we need to test hasn't been sufficiently abstracted.
We can't physically instantiate a particular class in our testing harness:
- It tries to pull in external libraries and APIs that can't run in the testing harness.
- Constructing the class requires passing objects we can't easily create (an "irritating parameter"). Examples: DB connection, network socket, etc.
- The code we need to test is tied directly to event handlers in GUI or other UI code that cannot be executed independent of user action.
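
A common way around an irritating parameter is to pass a fake object that offers only the methods the class actually calls. The sketch below assumes a hypothetical CustomerReport class whose constructor takes a database connection; FakeConnection, the execute method, and the SQL text are all illustrative assumptions, not a real driver's API.

```python
class FakeConnection:
    """Stands in for a real DB connection: returns canned rows and
    records each query instead of executing it."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = []

    def execute(self, sql, params=()):
        self.queries.append((sql, params))
        return list(self.rows)


class CustomerReport:
    """Hypothetical legacy class; the connection is its irritating parameter."""

    def __init__(self, connection):
        self.connection = connection

    def overdue_names(self, days):
        rows = self.connection.execute(
            "SELECT name, days_late FROM invoices WHERE days_late > ?", (days,)
        )
        # The logic under test: de-duplicate and sort the customer names.
        return sorted({name for name, _ in rows})


def test_overdue_names_are_unique_and_sorted():
    fake = FakeConnection([("Zed", 40), ("Amy", 35), ("Zed", 60)])
    report = CustomerReport(fake)
    assert report.overdue_names(30) == ["Amy", "Zed"]
    assert fake.queries[0][1] == (30,)
```

The fake does not check the SQL itself; that is left to the slower, higher-level tests that run against a real database.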

Dependency is one of the most critical problems in software development. Much legacy code work involves breaking dependencies so that change can be easier.

The Legacy Code Dilemma

When we change code, we should have tests in place. To put tests in place, we often have to change code.

A structured way to change legacy code

The stepwise procedure:

Identify change points.
Find test points.
Break dependencies.
Write tests.
Make changes and refactor.
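
The five steps can be sketched on a small example. Everything here is hypothetical: the function is_trial_expired, the 30-day rule, and the assumption that the original code called date.today() internally. Making "today" an argument corresponds to the dependency-breaking technique the book calls Parameterize Method.

```python
from datetime import date

TRIAL_DAYS = 30


def is_trial_expired(start, today=None, trial_days=TRIAL_DAYS):
    # Steps 1-2: the change point is the expiry rule below; the return
    # value is the test point where we can observe its effect.
    # Step 3: the hidden call to date.today() became an optional
    # parameter, so existing callers are unaffected.
    if today is None:
        today = date.today()
    # Step 5: trial_days is the actual change, made once tests are in place.
    return (today - start).days > trial_days


# Step 4: tests that pin the existing 30-day behavior.
def test_not_expired_on_day_30():
    assert not is_trial_expired(date(2024, 1, 1), today=date(2024, 1, 31))


def test_expired_on_day_31():
    assert is_trial_expired(date(2024, 1, 1), today=date(2024, 2, 1))


# Test for the new behavior introduced in step 5.
def test_custom_trial_length():
    assert is_trial_expired(date(2024, 1, 1), today=date(2024, 1, 10), trial_days=7)
```

Note that the dependency-breaking edit in step 3 is itself made without tests, which is the legacy code dilemma in miniature; the edit is kept as small and mechanical as possible for that reason.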
