A software developer’s job is never done. Developers don’t always get their code right the first time, and it doesn’t always look pretty at first glance. They have to constantly modify, improve, and clean up their codebase to make it more readable and maintainable.
“Software development is like writing a novel,” said Geoffrey Grosenbach, vice president of innovation at Pluralsight, a provider of online professional software development training. “You write a first draft and you can see where it is going, but it is not cleaned up yet so you go back and you edit that novel a little more, and a few more times until you have a completed book. Refactoring is this idea of treating software as a draft, going back through the code, rewriting it to do the same thing it was always doing, but improving it so it is more useful to oneself and to other people who are going to work on it later.”
Developers want code to be easy to work with and easy to understand, but as code ages, they are forced to revisit and update it. Software is becoming increasingly important to businesses regardless of industry, and developers need to find ways to take advantage of their existing apps and codebases, according to Jay Schmelzer, director of product management for Visual Studio at Microsoft.
Those who embark on a refactoring journey will have to understand the business goal they are trying to achieve, be able to detect when refactoring is necessary, and have the skillsets to successfully complete a code transformation.
When should you refactor?
Refactoring helps reduce inessential complexity and get down to the essence of your code, but how do you know when your code includes that unnecessary complexity? If you are reading your code and it takes you more than four seconds to understand exactly what the code does, then you need to refactor it, according to Arlo Belshee, senior legacy code consultant for Industrial Logic, a software development training provider.
“It is a read-time thing,” he said. “If you ever find yourself reading code instead of scanning it, it needs refactoring because you should always be able to scan it.”
According to Belshee, a developer’s largest source of frustration comes from not being able to get a computer to do what they want it to. Often, developers get overwhelmed with this frustration, and there is an appeal to starting from scratch. Grosenbach said developers should resist this urge. “Software takes a lot of work and a lot of coordination with a lot of people; if we just throw it out, we lose all the knowledge we gathered when we wrote that software,” he said.
“Starting from scratch costs a huge amount of money, it is very risky, and you are going to end up with huge amounts of bugs that you have to work through. If you take existing software, refine it, improve it, make it more secure and more user friendly, then it is actually a lot cheaper to do that.”
Developers shouldn’t refactor just to refactor, according to Schmelzer. “It is really a matter of when is refactoring a tool for accomplishing the business goal or satisfying the business need that you are being asked to address,” he said. “It is really a matter of ensuring what the value you are trying to get out of doing that work. If you don’t know the value you are trying to achieve, then I would question doing it.”
More often, businesses turn to refactoring as a way to modernize their existing applications. According to Schmelzer, businesses across all industries are being asked to extend the reach of their software to new scenarios and new use cases to provide new value. “To do that you need to restructure the code to expose things in new ways, and that is where techniques like refactoring start to come in and become very beneficial because it is a more structured way of thinking about how you reorganize and change the code without impacting the behavior of the application,” he said.
An example of this is the travel industry. In the past, customers had to directly call travel agents to get help with booking. With all the new technologies and devices today, customers can book directly through a reservation system. Businesses in the travel industry had to take their existing systems and modernize them to meet customers’ demand for new interactions. “You want the same logic to be there. You want to ensure all the logic around calculating fare, connecting flights, and flight availability is there and consistent,” said Schmelzer.
“Most likely the system wasn’t originally designed in a way that you could easily plug in new entry points into that logic, so developers have to go and start a refactoring operation to expose that core central logic in a way that is now accessed from multiple different experiences.”
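The restructuring Schmelzer describes can be sketched roughly as follows. This is a minimal illustration, not any real reservation system: the fare rule, the function names (`calculate_fare_cents`, `agent_quote`, `web_booking_quote`) and the request shape are all assumptions made up for the example. The point is only that the core calculation lives in one place and each new experience calls into it.

```python
# Hypothetical fare rule, invented purely for illustration.
FARE_CENTS_PER_KM = 12


def calculate_fare_cents(distance_km, passengers):
    """Core fare logic, kept in one place so every entry point agrees."""
    return distance_km * FARE_CENTS_PER_KM * passengers


def agent_quote(distance_km, passengers):
    """Entry point for a travel agent's tool: returns a printable quote."""
    cents = calculate_fare_cents(distance_km, passengers)
    return f"Quote: ${cents / 100:.2f}"


def web_booking_quote(request):
    """Entry point for a self-service site: request is a plain dict."""
    cents = calculate_fare_cents(request["distance_km"], request["passengers"])
    return {"fare_cents": cents}
```

Both entry points format the result differently, but neither reimplements the fare calculation, so a change to the rule is made once.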
These legacy apps also accumulate a tremendous amount of technical debt over time, leading to code breaks and unnecessary hurdles in your workflow. Like financial debt, technical debt incurs interest, but instead of money, the interest comes in the form of added work for developers. Technical debt often occurs when developers are trying to cut corners or aren’t trained properly, which causes code to become sloppy and hard to work with.
Refactoring can help eliminate technical debt by making a dirty design “clean,” which in turn helps speed up the process of future development. “Refactoring allows me to explicitly explore code, reach into code, hear what it is trying to say about the problem and business domain, reflect from what is there, and then find ways to make that simpler, more expressive and as a result dramatically reduce errors, and reduce the complexity of implementing work,” said Industrial Logic’s Belshee.
In general, refactoring should happen when the opportunity calls for it, according to Hadi Hariri, developer advocacy team lead for JetBrains. “If I’m working on some code, and I see something can be simplified and is called for, then refactor. The notion of delaying refactoring for some magical moment in time doesn’t work that well,” he said.
How often should you refactor?
There are two ways to refactor your code: at a micro level and at a larger level, according to Microsoft’s Schmelzer.
At a micro level, developers refactor their code on a daily basis, whether it is changing a variable name or method name, or moving the location of a functionality to a more logical location as the codebase evolves. “At this level, developers are really motivated primarily around understanding and making it easier for themselves and other engineers on the team to see what the code is doing and be able to walk into it and contribute quickly,” said Schmelzer.
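A micro-level refactoring of the kind described here might look something like the sketch below. The function names and the discount constant are hypothetical, chosen only to show a rename and an introduced variable; the behavior before and after is meant to be identical.

```python
# Before: terse names force the reader to decode the intent.
def calc(d, r):
    return d * r * 0.9


# After: same behavior, but the names now say what the code does.
LOYALTY_DISCOUNT = 0.9  # hypothetical business rule, for illustration only


def discounted_fare(distance_km, rate_per_km):
    base_fare = distance_km * rate_per_km  # introduced variable names the idea
    return base_fare * LOYALTY_DISCOUNT
```

Nothing about the result changes; the only gain is that the next engineer can scan the function instead of reading it.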
At a larger level, the entire project needs to be rethought, as it was in the travel industry. This type of architectural refactoring tends to happen more on a product cadence, according to Schmelzer.
Large-scale refactoring is also being seen more with the recent movement toward microservices, forcing developers to change their thinking (and code) from one monolithic application to a dozen or more apps that are focused on separate tasks, according to Pluralsight’s Grosenbach.
He added that the best way to refactor is to do it in smaller increments, and to start as early as possible to make it easier for developers going forward. “The biggest thing to note is that refactoring gets harder the longer you put it off, so it is better to do it often and better to have it part of your regular workflow,” he said.
What are some best practices?
Once businesses and teams understand the value they are going to get from refactoring, there are some best practices experts recommend they follow.
Know the basics: Understand what refactoring is and how to do it before you even attempt it. Also, understand exactly what the code is doing before you start changing it. “It is important to step back and spend time reading the code, thinking about it, understanding it, and making sure you have a good grip on what is going on before you charge in and start modifying things,” said Grosenbach.
Have empathy: It might sound like a strange skill for software development, but Grosenbach says having empathy allows the developer to get out of his or her own head and look at the code as if he or she weren’t the person who wrote it. “Can I think about how these statements of code are going to be read and understood by other people?” he said.
Be curious: Developers should have the desire to poke around and figure out how a codebase works. “It’s an unwillingness to do something the dumb way just because you already know that way,” said Industrial Logic’s Belshee.
Be familiar with the language: Developers should know the basic idioms and programming constructs of their language. “This doesn’t need to be extensive; reflective design will teach it to you as you go,” said Belshee.
Do it manually: Refactoring is a human endeavor that requires a lot of thought, creativity and a subjective approach, according to Grosenbach. “You can’t ask a computer to rewrite an English sentence so that it is more poetic, readable and better-sounding,” he said. “It definitely involves humans.”
Ideally, Grosenbach says the person who wrote the code in the first place should be the one refactoring because it is the fastest path from the intent of code to the concise, clear communication of what the code does. “Often this is why big rewrites happen because developers who did not write the code originally get in there and try to figure out what it does and are completely lost. They can’t figure it out because it hasn’t been refactored very well or hasn’t been written very well in the first place, so people throw up their hands and decide to abandon it or just start from scratch,” he said.
According to Belshee, people often think of design as a write-time activity where they write new code and refactor it to make it clean. “Read time is just when it is a no-brainer and makes sense. If you have the right tools and approach, any time you are trying to read code and figure out what it does, the cheapest way to read it is to refactor it to make it legible and then read the legible thing,” he said.
“To read code, you do a simple loop: Read something, have an insight, write it down, repeat. As you understand small things then you can build understanding of large things.”
Having tests from the beginning ensures you are starting from a good known state and that your application is functioning correctly. “As you start doing your refactoring, you can run those tests, and as long as everything is turning green, you have the confidence that you haven’t adversely affected the application,” said Microsoft’s Schmelzer.
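As a rough sketch of what that safety net can look like, the example below pins down the current behavior of a small function with two tests before anything is changed. The function `total_price` and its data shape are assumptions for illustration; the tests are plain assertions that a runner such as pytest would collect, or that can simply be called directly.

```python
def total_price(items):
    """Sum price * quantity for each (price, quantity) pair."""
    return sum(price * quantity for price, quantity in items)


# Tests written before refactoring: they record the known-good behavior.
def test_empty_order_costs_nothing():
    assert total_price([]) == 0


def test_multiple_items():
    assert total_price([(10, 2), (5, 1)]) == 25
```

If the body of `total_price` is later restructured and both tests still pass, that is some evidence the behavior covered by the tests has not changed; it says nothing about cases the tests do not cover.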
If you aren’t the developer who wrote the code, or if you are unsure if the code will make sense to other people, get a second opinion. “It’s good to sometimes have a peer review code if you don’t understand it or feel it’s too complicated, before embarking on refactoring,” said JetBrains’ Hariri. “Chances are, if you don’t fully comprehend the code, refactoring won’t make things better.”
Going beyond code reviews, developers can actually work in a pair-programming environment, according to Belshee. “It is possible to get sidetracked, especially in solo work. Pairs are less prone to get sidetracked because they keep each other on topic. But when you are working by yourself and trying to understand code, it is pretty easy to slip from a starting state of ‘I am trying to understand this code so that I can make this change,’” he said.
It is important to keep it as simple as you can. According to Belshee, developers often think skills such as test-driven development, advanced design skills, architectural awareness and knowledge of mocking are necessary for refactoring, but they are not needed and developers shouldn’t overcomplicate things for themselves with skills they don’t need.
How do tools come into play?
While many experts echo the same sentiment that tests should be in place to help developers refactor, Belshee believes there is a different approach.
“The predominant method is, we will write a whole bunch of tests around the code to guarantee all of its behavior, and then we are free to edit the code however we want because those tests will assure we didn’t change the behavior,” he said. “The problem with that method is it is pretty extensive to write all those tests first, and it is still prone to error because the only mistakes you will catch are the ones you thought to look for, the ones you wrote the test for, and all the unknowns will bite you.”
The manual way is also extremely susceptible to mistakes because it involves a large amount of copying and pasting, moving lines of code up and down and back and forth between different files. “When that is being done manually it is easy to paste something in the wrong place, forget you copied something, and you miss it and it fails to appear in the other place where it was supposed to be,” said Pluralsight’s Grosenbach.
Belshee believes an automated transformation approach that involves automated and assisted code transformation tools is the best and fastest way to do code refactoring. “If you do the automated transformation, and you use a set that only allows true refactoring, then when you are executing a transformation, your IDE is statically analyzing the code and guaranteeing that it is only going to give you the transformations that have exactly the same behavior and you know you are bug-to-bug compatible,” he said. “Those transformations take about a quarter of a second to do where doing that much transformation by hand would be a two-minute task once you are an expert at it, a five to 10 minute task when you aren’t an expert.”
To achieve this type of automated transformation, Belshee says developers should be looking for a fully automated tool in an IDE that has full static awareness and supports the core set of refactorings. According to Belshee, there are 11 refactorings that come in two sets: the core six and the key five. The core six are refactorings that allow developers to name concepts to make code legible and understandable. They are rename, introduce method, introduce variable, introduce parameter, introduce field, and inline. The key five are refactorings that developers need to make code testable: introduce parameter object, move method, make method static, convert to instance method, and split class.
“You can accomplish nearly any design change using combinations of only these key 11,” said Belshee. “There are over 100 named refactorings, but the rest handle edge cases.”
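To make three of the core six concrete, here is a small sketch that applies rename, introduce variable, and introduce method to one function. It is written by hand for illustration, whereas Belshee’s recommendation is to let an IDE perform these steps; the order data, threshold, and discount rate are hypothetical.

```python
# Before: one expression hides three separate ideas.
def p(o):
    return o["qty"] * o["unit"] - (o["qty"] * o["unit"] * 0.1 if o["qty"] > 10 else 0)


# After: each idea has a name. Values below are hypothetical.
BULK_THRESHOLD = 10
BULK_DISCOUNT_RATE = 0.1


def bulk_discount(subtotal, quantity):       # introduce method
    if quantity > BULK_THRESHOLD:
        return subtotal * BULK_DISCOUNT_RATE
    return 0


def order_price(order):                      # rename (was p)
    subtotal = order["qty"] * order["unit"]  # introduce variable
    return subtotal - bulk_discount(subtotal, order["qty"])
```

The two versions are intended to return the same value for any order; the second simply lets a reader scan for "subtotal" and "bulk discount" rather than reconstruct them from the arithmetic.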
In addition, JetBrains’ Hariri says tools should provide more than just search and replace; they should be able to detect potential issues and be able to inspire confidence in developers.
“Nothing is worse than having a tool you don’t trust when refactoring, because then you end up doing a half job,” he said. “Tooling plays an important role in removing the grunt work and avoiding mistakes. As humans we tend to make mistakes, especially when we repeat routine patterns. We become careless. Having tools to do this kind of work for us is much more efficient.”
Whether developers decide to use a tool or not, Grosenbach believes refactoring is an important skill that developers should learn and carry with them throughout their careers.
“A lot of people have this concept that engineers will just come up with an idea and it is perfect and works well,” he said. “That is not the way it works. There are a lot of tweaks and improvements throughout the process, and refactoring is at the heart of it.
“To me, it is an extremely important skill, extremely important for the business because if you have a codebase that isn’t easy to understand or not something a developer can easily work on and improve, then you are just throwing [away] tens of thousands or even millions of dollars on your investment in technology and software.”