This post outlines a basic approach for piecemeal legacy code replacement. The basic steps in the approach are as follows. (I will describe each step in more detail below.)
(1) Determine the partitioning of the new application at the subsystem level.
(2) Select a subsystem from the new application to implement.
(3) Implement and test the new subsystem using good OO practices.
(4) Locate the corresponding functionality in the old application.
(5) Insert the interface to the new subsystem into the old subsystem to isolate the existing functionality.
(6) Test the original application with the new interface isolating the functionality to be removed.
(7) When testing is successful, remove the old code and insert the new subsystem.
(8) Test the old application with the new subsystem inserted.
(9) If there are more new subsystems to implement, go to (2).
Partition new application. This determines the overall structure of where one wants to be with the new application when all of the old software is fully replaced. If one applies good OO practice to this partitioning (see the blog category on application partitioning), one will have a stable skeleton throughout the replacement process, however long it takes.
Select a subsystem for replacement. The key here is to select a subsystem from the new software rather than from the original. In effect, with this approach one is building a new application incrementally, one subsystem at a time, rather than replacing an old application piecemeal. That mindset is important because it ensures that one stays focused on where one wants to be rather than where one is now. Typically one would select a new subsystem based upon which functionality has the highest priority for replacement in the old application. The mapping will not be 1:1 due to lack of cohesion in the original application, but it should be close enough to honor the priorities. Other criteria, such as effort and technology, can also be applied.
Implement and test the new code. Using good OO practice will ensure a disciplined subsystem interface so that the new subsystem can be fully tested functionally in complete isolation. Note that one is building the new application's software; the original software is irrelevant at this point. This also allows the requirements to be acquired from domain experts incrementally because the subsystem being implemented will have a narrowly defined subject matter and, consequently, a narrowly defined functionality. The incremental nature of the process ensures that the requirements are timely. [If domain experts are not available, one may have to resort to reverse engineering. However, that only determines what the requirements for the subject matter are, not how it is implemented.]
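As an illustration of this step (a hypothetical sketch in Python; the names and the "Billing" subject matter are invented here, not from the original), a new subsystem with a disciplined interface can be functionally tested in complete isolation, with no legacy code involved:

```python
# Hypothetical sketch: a new subsystem with a narrowly defined subject
# matter and a correspondingly narrow, disciplined interface.

class BillingSubsystem:
    """New subsystem: implements one narrowly defined subject matter."""

    def __init__(self, tax_rate: float) -> None:
        self._tax_rate = tax_rate

    # The subsystem's entire public interface is this one method.
    def invoice_total(self, line_items: list[float]) -> float:
        subtotal = sum(line_items)
        return round(subtotal * (1.0 + self._tax_rate), 2)

# Functional test in complete isolation -- the original software is
# irrelevant at this point.
billing = BillingSubsystem(tax_rate=0.10)
assert billing.invoice_total([10.0, 20.0]) == 33.0
```

Because the interface is disciplined, nothing in this test depends on the old application, which is exactly what allows the subsystem to be built and verified before any legacy code is touched.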
Locate the new functionality in the old code. Conceptually one simply goes through the old code with a HiLiter and "color codes" those old lines that are somehow encoded in the new subsystem. These are the lines that will ultimately be replaced by the new subsystem. It is important to separate this and the next step. That's because in an ideal situation the lines identified in this step will not be touched in the next step. Inevitably, though, things will sometimes blur and one needs this step's color coding as a baseline when backtracking. (This task is one situation where reverse engineering may actually be useful, especially at the Interaction Diagram level.)
Insert the new subsystem's interface into the old code. This step is where most of the project risk lies. Up to here we basically have routine new development. Now, though, we have to get dirty by making changes to the old code. That will not be easy because the old code will not be as modular as the new code, so the new interface will not map conveniently into the old code. For example, a common situation occurs when a single line in an old program unit is all that is "in" the new subject matter. Somehow one needs to put a wrapper around that line so that it can be accessed by the new interface. Inevitably some old code will have to be moved around, which is why we need to keep the color coding of Step (4) as a baseline when tests fail and we need to debug. However, moving code should be minimized in this step whenever possible through the use of "smart" Facade patterns, etc.
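The single-line situation can be made concrete with a small sketch (hypothetical names; Python used only for illustration). The one "color coded" line is wrapped so the new interface can reach it, while the surrounding old code is left alone:

```python
# Hypothetical sketch: a single line inside a legacy routine belongs to
# the new subject matter. We wrap just that line so the new subsystem's
# interface can access it without restructuring the old code.

def legacy_discount(price):          # old code fragment, to be replaced later
    return price * 0.75              # <-- the one "color coded" line

class NewPricingIntf:
    """The new subsystem's interface, inserted into the old application.
    For now it delegates to the wrapped legacy line; in Step (7) the real
    new subsystem is linked in behind this same interface."""

    def discounted(self, price: float) -> float:
        return legacy_discount(price)

intf = NewPricingIntf()
assert intf.discounted(100.0) == 75.0
```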
A caution. This step tends to be messy the first time one tries it. However, with experience one learns a number of tricks for inserting the interface. One such trick is to think in terms of multiple interfaces. For example,
<-- Remaining code -->    Converter     New     Dispatcher    <-- Removed code -->
Fragment1    Fragment2     Facade       Intf      Facade      Fragment3   Fragment4
    |            |            |           |          |            |           |
    |------------------------>|           |          |            |           |
    |            |----------->|           |          |            |           |
    |            |            |---------->|          |            |           |
    |            |            |           |--------->|            |           |
    |            |            |           |          |----------->|           |
    |            |            |           |          |------------------------>|
    |            |            |           |          |            |           |
where the new subsystem interface, New Intf, is bracketed by local Facade interfaces that actually talk to the code fragments in the original application. The Converter Facade is a necessary modification of the original code that will remain in the application. It is needed to coordinate functionality fragments that individually talked to fragments in the code that will be replaced. The Dispatcher Facade does the same thing for fragments within the code to be removed, but when the new subsystem is inserted it will go away. In the example, Fragment3 and Fragment4 are elements of functionality that will be accessed via a single interface method in the new subsystem. In the original subsystem, however, they are in different locations, so the Dispatcher Facade is used to coordinate that in the old code so that testing in the next step can be executed. The New Intf dispatches the single message it would dispatch in the new subsystem to the Dispatcher Facade, which splits it into two messages to accommodate the old code. Meanwhile Fragment1, which originally talked to Fragment3, and Fragment2, which originally talked to Fragment4, now individually talk to the Converter Facade. The Converter Facade then invokes some New Intf method in response.
Usually the Converter Facade and Dispatcher Facade will be created quite locally (i.e., around particular old code fragments) to deal with individual syntactic mismatches with the new subsystem interface. In effect they are "glue" code. Note that the indirection overhead of the Dispatcher Facade is only relevant for Steps (5) and (6) because it goes away when the new subsystem is inserted. The overhead of the Converter Facade indirection will remain, though, because it is the price of "fixing" the lack of cohesion of the old code without rewriting it. (When all the subsystems have been replaced, all of the Converter Facades will be gone because the new subsystems' interfaces will be talking directly, as designed, in the new application.)
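A minimal sketch of that glue code, using the fragment names from the example above (the concrete operations are invented here purely for illustration):

```python
# Illustrative glue code for the two facades. Fragment3 and Fragment4 live
# in the code to be removed; Fragment1 and Fragment2 in the code that remains.

def fragment3(x):                    # old code, to be removed
    return x + 1

def fragment4(x):                    # old code, to be removed
    return x * 2

class DispatcherFacade:
    """Temporary glue: splits the single new-interface message into the
    separate calls the old code expects. Goes away in Step (7) when the
    new subsystem is inserted."""
    def do_both(self, x):
        return fragment4(fragment3(x))

class NewIntf:
    """The new subsystem's interface. During Steps (5) and (6) it forwards
    to the Dispatcher Facade instead of the not-yet-inserted subsystem."""
    def __init__(self, backend):
        self._backend = backend
    def process(self, x):
        return self._backend.do_both(x)

class ConverterFacade:
    """Remaining-side glue: lets Fragment1 and Fragment2, which used to
    call Fragment3 and Fragment4 directly, talk to the single New Intf
    method instead."""
    def __init__(self, new_intf):
        self._new_intf = new_intf
    def request(self, x):
        return self._new_intf.process(x)

converter = ConverterFacade(NewIntf(DispatcherFacade()))
# Fragment1 and Fragment2 now call converter.request(...) rather than
# fragment3/fragment4 directly.
assert converter.request(3) == 8     # (3 + 1) * 2
```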
Test the old system with the new interface. The idea here is to ensure that we have properly identified the code to excise from the old system. If the code was properly identified in Step (4) and the new interface was properly inserted in Step (5), the existing regression tests should Just Work. Of course they won't, because the original code is a mess in which functionality is hard to identify and modify. However, at this point the scope of change is very limited, so when we test at this level we should be able to diagnose the problems fairly quickly. And since we are only inserting a buffer interface into the code, any necessary changes should be fairly easy. That is, we are migrating incrementally from the old code to the new code, and that should make the process much more manageable.
Excise the old code and insert the new subsystem. Excising the old code should be easy at this point because it is all now behind the interface that we inserted in Step (5) and we have our color coding from Step (4) as well. Similarly, once the old code has been excised inserting the new subsystem should be just a matter of linking because the old system already has the new subsystem's interface. Being able to do this easily was one of the goals of isolating well defined tasks in the preceding steps.
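To illustrate why the swap is just a matter of linking (a hypothetical Python sketch; the names and arithmetic are invented for the example), the old code already sits behind the inserted interface, so inserting the new subsystem amounts to rebinding that same interface:

```python
# Hypothetical illustration: because the excision candidates were isolated
# behind an interface in Step (5), swapping in the new subsystem does not
# touch any calling code -- only the binding changes.

class OldCodeBehindInterface:        # isolated in Step (5); to be excised
    def compute(self, x):
        return (x + 1) * 2           # scattered legacy logic, now wrapped

class NewSubsystem:                  # built and tested in Step (3)
    def compute(self, x):
        return 2 * x + 2             # same behavior, clean implementation

class AppInterface:
    """The buffer interface inserted into the old application. Swapping
    implementations behind it is effectively a link-time decision."""
    def __init__(self, impl):
        self._impl = impl
    def compute(self, x):
        return self._impl.compute(x)

# Before the swap the old code answers; after, the new subsystem does.
before = AppInterface(OldCodeBehindInterface())
after = AppInterface(NewSubsystem())
assert before.compute(5) == after.compute(5) == 12
```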
Test the old application with the new subsystem. Again, this should be quite simple. One employs the original regression suite and it should mostly Just Work because we already know that the new subsystem is functionally correct from Step (3) and we tested the interface that integrates it into the old application in Step (6). Again, it probably won't work completely because we probably added some requirements when defining the new subsystem. However, whatever test problems show up should be relatively easy to resolve.
This approach has a number of advantages that will tend to make it a predictable process. Of course the primary advantage is the incremental migration of change. The well-defined steps isolate almost all of the development (schedule) risk into Step (5). That is where the lack of maintainability of the old code rears its ugly head. That is why that step has been carefully limited to simply putting a buffer interface between the code identified in Step (4) and the rest of the application. That ensures that we can minimize the changes to the old code. Ideally we strive to avoid changing any of the old code. Though holding to the ideal will probably be impossible in some cases, those will be clearly identifiable and can be controlled. Minimizing the risk by reducing the number of balls in the air at any one time allows a predictable process for the whole replacement. In my experience this sort of project can be estimated and managed with very close to the same accuracy as original development -- something that is almost unheard of when using traditional techniques for legacy replacement.
The rest of the advantages are related to keeping one's eye on the goal of building a new system properly. Much of this approach is designed to downplay what the old system does and how it does it. That's why this approach is really about migrating the new application's code to the old system rather than "upgrading" the old application. The main weakness of the traditional approaches to piecemeal replacement is that they tend to encourage migrating the mistakes of the past into the future. This approach pretty much eliminates that possibility because the replacement code is constructed exactly as if it were new development and only after the fact is it migrated into the old system.
The final point I would make concerns flexibility. In a properly partitioned OO application each subsystem usually has fewer than 30 classes at the OOA level. That provides pretty fine granularity for the replacement increments, so there is a lot of flexibility in trading off developer resources against delivery schedule. Better yet, if the resources are available it is usually possible for small teams to work in parallel on individual subsystems. That's because the only place where they even have an opportunity to step on each other's toes is in Steps (5) through (7), and most of that problem is in Step (5).
This flexibility allows some interesting variations. As an extreme example, suppose one decides to commit to a full replacement all at once. It is unlikely that there will be sufficient resources to work on all of the new subsystems at once. So one could apply exactly this approach for the full replacement, doing a few subsystems at a time just as one would normally do in an original development. Steps (5) through (7) would be unnecessary but that effort could be justified as good insurance for two reasons. If the original requirements are not readily available, this approach would serve as a sanity check because each new subsystem would be tested against the original regression suite in situ in the existing system. Probably more important, the incremental nature of the development allows Management to shift priorities on resources if the schedule takes a hit. That is, shifting from full replacement to piecemeal replacement with fewer resources would only require changing the schedule. Without this approach the entire project would likely be canceled, wasting all the effort to date.