The Code Handyman

If it ain't broke, well, you might try refactoring to make sure your systems remain that way.


One of the great joys of software development is that you get to build things out of thin air and imagination (with a little help from a compiler). You start with a whiteboard, design some elaborate data structures and then code away, watching your masterwork form.

But much as we'd all like to spend our days writing exciting new code, the reality is a bit different. Most coding is maintenance programming. One brilliant programmer puts together the core of the system and then 500 of us spend the rest of our careers maintaining it. But don't worry, there's still room for a lot of craft in the maintenance programming world. One interesting development in recent years is the rise of refactoring as an accepted part of that craft.

It Doesn't Rust, But…
Joel Spolsky, a developer and the author of the Joel on Software Weblog, has pointed out the general insanity involved in throwing away working code and starting from scratch when you need to change something. He argues that this is the main reason why Microsoft managed to eat Netscape's lunch in the browser wars—Netscape wasted years on replacing their Navigator code with a whole new version. (See "Netscape Goes Bonkers," at http://www.joelonsoftware.com/articles/fog0000000027.html.) Yeah, Mozilla is shaping up to be a nice product in some ways, but there aren't a whole lot of people left who care any more. Joel points out that source code doesn't rust.

But although source code doesn't rust (in the sense that code which works today is still going to work tomorrow), requirements do. We could debate for a while whether most requirement changes are the result of business changes or just the equivalent of bigger tail fins on this year's cars (is the Windows XP look-and-feel technology or marketing?), but the fact is that developers are constantly called upon to fix, tweak, and otherwise change working code.

It's precisely because code doesn't rust that these changes take up so much of our development energies. Having paid us tons of money to develop software in the first place, the people who own it understandably would like to maximize their investments. So they ask not for a rewrite but for more characters in the country name field (who knew they were going to start doing business in Bashkortostan?) or a different color on the Web page (because the new CEO doesn't like pastels).

Surely there must be a way to respond to these demands and still keep from going mad with boredom.

Enter Refactoring
That's where refactoring comes in. Martin Fowler, who wrote the standard reference on the subject (titled simply Refactoring), defines the term this way:

Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence when you refactor you are improving the design of the code after it has been written.

If that sounds intriguing, you really ought to read Fowler's book (or check out his Web site http://www.refactoring.com for an introduction and some links). The book explains and justifies refactoring in some depth and then presents a catalog of refactorings. Historically, refactoring originated with Smalltalk. These days the concepts are most advanced in Java, and Fowler uses Java for his examples. That can make the book a bit hard to read in spots if you don't know the language, but the principles apply to any object-oriented language.

To give you some flavor for the book, here's one of the simplest of refactorings: If you find a public field, make it private and provide accessors. In Java, that's a change from:

public String _name;

to

private String _name;
public String getName() {return _name;}
public void setName(String arg) {_name = arg;}

The real benefit to the catalog of refactorings is not in identifying them but in recording knowledge about them. Fowler justifies each one as it's introduced (in this case, discussing the benefits of private data to modularity) and then provides a "mechanics" section that discusses how to safely make the change. In the case of taking a field private, this includes finding and replacing all uses of the public field, then making the field private—and doing a full compile-and-test cycle after each change (which points out, by the way, one of the ways in which refactoring fits naturally into an Extreme Programming environment). For all but the simplest refactorings he also provides fully worked examples.
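
To make the mechanics concrete, here is a minimal sketch of what calling code looks like once the field has gone private. The Person class and the demo class are hypothetical names invented for illustration; only the accessor pattern itself comes from the example above.

```java
// Hypothetical caller, shown after the refactoring has been applied.
class EncapsulateFieldDemo {
    public static void main(String[] args) {
        Person p = new Person();

        // Before the refactoring this line would have read: p._name = "Ada";
        p.setName("Ada");

        // Before the refactoring this line would have read: String n = p._name;
        String n = p.getName();

        System.out.println(n);
    }
}

// Hypothetical class that used to expose _name as a public field.
class Person {
    private String _name;

    public String getName() {return _name;}
    public void setName(String arg) {_name = arg;}
}
```

Each caller is switched over to the accessors first, with a compile-and-test cycle along the way, and only then does the field's visibility change. If a caller was missed, the compiler complains about it rather than a customer.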

Explorers Need Not Apply
One of the key points to refactoring is that the changes you make when refactoring should be provably correct. That's where the part about not changing the external behavior of the code comes in. Remember, you're starting with code that works. You want to end up with code that works, regardless of the changes that you're making. This means that it's OK to make very structured changes, one at a time. Things like extracting a superclass from the common features of two similar classes don't change the interface of the existing classes at all; they just rearrange the internal implementation.

Refactoring isn't exploration. Though there is a place in software development for experiments along the lines of "what would happen if we ripped out this section here and replaced it with new objects written in C# and then changed the interfaces to match?", that place is not in the refactoring process. (In fact, the place for such radical changes is in a new branch of your source code control system, one that you can throw away if you realize that you've reached the point where the map says "Here There Be Tygers.") Refactoring is more like maintaining a garden: you take what's already there and make it neater, while still preserving the original design.

Key to this process is testing. You absolutely need a good set of unit tests that exercises any code that you intend to refactor. Even if you can prove to your own satisfaction that the refactoring changes don't change the behavior of the code, run the tests! Otherwise, the time will come when you're awakened at three in the morning because you neglected to think about some special case or other feature that you refactored out of existence.
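
As a rough illustration of that safety net, the sketch below runs the same inputs through a "before" and an "after" version of a small method and fails loudly if they ever disagree. Everything here (the Legacy and Refactored classes, the label method, the sample data) is made up for the example, and a real project would more likely use a unit testing framework than a hand-rolled loop.

```java
// Hand-rolled check that a refactoring preserved behavior, special cases included.
class RefactoringSafetyNet {
    public static void main(String[] args) {
        // The second case (no phone number) is the kind of special case that is easy to forget.
        String[][] cases = { {"Ada", "555-0100"}, {"Ada", null}, {"", "555-0100"} };
        for (String[] c : cases) {
            String before = Legacy.label(c[0], c[1]);
            String after = Refactored.label(c[0], c[1]);
            if (!before.equals(after)) {
                throw new AssertionError("Behavior changed: " + before + " vs " + after);
            }
        }
        System.out.println("all cases match");
    }
}

// Hypothetical original code.
class Legacy {
    static String label(String name, String phone) {
        if (phone == null) {
            return name + " (no phone)";
        } else {
            return name + " (" + phone + ")";
        }
    }
}

// Hypothetical refactored code: the null handling was extracted into its own method.
class Refactored {
    static String label(String name, String phone) {
        return name + " (" + describe(phone) + ")";
    }

    private static String describe(String phone) {
        return phone == null ? "no phone" : phone;
    }
}
```

The point is not the particular method but the habit: the expected behavior is pinned down before the change, and the same checks run again afterward.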

Bait and Switch?
But wait a minute! I started out by talking about maintenance programming in response to changing requirements, and now I'm talking about refactoring that doesn't change anything. Why do all this work if it's not going to change anything? The answer is that refactoring doesn't change anything externally, but it certainly changes things internally. Take the example of deriving a superclass from two existing classes. Perhaps you have Customers and Suppliers who share many pieces of information, such as address, phone number, and fax number. So in refactoring you might create an Entity class containing the common fields, and derive the two existing classes from Entity. That's an internal change that makes the code significantly cleaner.
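
In Java, the result of that refactoring might look something like the sketch below. The field names, the creditLimit and leadTimeDays members, and the demo class are assumptions added for illustration; the article only specifies an Entity class holding the shared address, phone, and fax information.

```java
// Callers still talk to Customer and Supplier exactly as they did before.
class ExtractSuperclassDemo {
    public static void main(String[] args) {
        Customer c = new Customer();
        c.setPhone("555-0100");
        Supplier s = new Supplier();
        s.setPhone("555-0199");
        System.out.println(c.getPhone() + " " + s.getPhone());
    }
}

// Extracted superclass holding what the two classes have in common.
class Entity {
    private String address;
    private String phone;
    private String fax;

    public String getAddress() {return address;}
    public void setAddress(String arg) {address = arg;}
    public String getPhone() {return phone;}
    public void setPhone(String arg) {phone = arg;}
    public String getFax() {return fax;}
    public void setFax(String arg) {fax = arg;}
}

// Each subclass keeps only what is specific to it (hypothetical members).
class Customer extends Entity {
    private int creditLimit;

    public int getCreditLimit() {return creditLimit;}
    public void setCreditLimit(int arg) {creditLimit = arg;}
}

class Supplier extends Entity {
    private int leadTimeDays;

    public int getLeadTimeDays() {return leadTimeDays;}
    public void setLeadTimeDays(int arg) {leadTimeDays = arg;}
}
```

Code that called getPhone() on a Customer before the change still compiles and behaves the same way afterward, which is what makes this a refactoring rather than a redesign. Any new piece of shared information now has exactly one home, in Entity.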

Of course, clean code doesn't excite the business folks the way it does the development folks. But think about the effect of this refactoring on future requirements changes: The next time the business folks come to you with a new piece of common information ("oh, can we add a Web site address to Customers? And to Suppliers, while we're at it?"), your job is much easier as a result of the refactoring. Now what looks like two changes to your customers is one to your code, thanks to clever refactoring.

And that's where the benefits of this approach come in. If you're maintaining code, think about refactoring it at the same time. As you find ways to improve the internal structure so that it's more maintainable, take the time to make the improvements. If the code is in bad shape (perhaps your predecessor was not as brilliant as you are), you might need to keep a list of refactorings that you'd like to make. Then you can pull things off the list as you find time to work on them. Each refactoring is a tiny investment in future maintainability for your code. Just like making tiny investments in a money market fund, these can really add up over time.

State of the Art
You might have already made this leap, but just in case: refactoring is an obvious candidate for tool support. After all, if there's a provably correct transformation (like turning public fields into private ones with accessors) and a recipe for executing the transformation, why can't the whole process be automated? The answer is that it can, and that there are many tools (almost all in the Java or Smalltalk arena) that can perform specified refactorings on your code.
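
To give a feel for why this transformation lends itself to automation, here is a toy sketch that uses Java's standard java.lang.reflect package to find public fields and print the accessors a tool would generate for them. The AccessorSketch and Sample classes are hypothetical, and this is nowhere near a real refactoring tool: it only emits text, and it does nothing about the harder half of the job, which is rewriting every caller that touches the field.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class AccessorSketch {
    // Returns getter and setter source text for each public, non-static, non-final field.
    static List<String> accessorsFor(Class<?> type) {
        List<String> out = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            int m = f.getModifiers();
            if (!Modifier.isPublic(m) || Modifier.isStatic(m) || Modifier.isFinal(m)) {
                continue; // not a candidate for this refactoring
            }
            String field = f.getName();
            // Drop a leading underscore so _name becomes getName/setName.
            String bare = field.startsWith("_") ? field.substring(1) : field;
            if (bare.isEmpty()) {
                continue; // nothing sensible to name the accessors after
            }
            String prop = Character.toUpperCase(bare.charAt(0)) + bare.substring(1);
            String t = f.getType().getSimpleName();
            out.add("public " + t + " get" + prop + "() {return " + field + ";}");
            out.add("public void set" + prop + "(" + t + " arg) {" + field + " = arg;}");
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : accessorsFor(Sample.class)) {
            System.out.println(line);
        }
    }
}

// Hypothetical class with a public field waiting to be encapsulated.
class Sample {
    public String _name;
}
```

Run against Sample, the sketch prints the same two accessor lines shown in the earlier Java example. A real tool works from the source code rather than from compiled classes, precisely so that it can update the callers too.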

But why should the Java developers have all the fun? Visual Studio .NET has all of the elements that it could possibly need to support a good refactoring tool: complete code introspection and creation via System.Reflection, an add-in model that lets third parties integrate their work, and object-oriented languages to edit. So where are the .NET refactoring tools? Are you going to write the first one? If so, write and tell me about it!
