CSC/ECE 517 Spring 2014/ch1a 1w1e rm
This page discusses how to elegantly refactor code, including several common metrics used in determining the potential quality of refactoring code, as well as which refactoring techniques to use in coordination with such metrics.
Background
The practice of code refactoring deals with changing the content or structure of code without changing the code's function in its execution. Code refactoring has become a standard programming practice, as it potentially promotes readability, extensibility, and reusability of code.
Whether done through an IDE or by hand, large-scale code projects can prove tedious to refactor. If minimal non-functional benefits are achieved through refactoring, time is wasted. Furthermore, if not done properly, code refactoring can actually break the functionality of the code. In the extreme case, code could be structured so badly that starting over completely may be more viable than refactoring. As such, it is important to be able to know when and what to refactor.
Refactoring Techniques
A large number of refactoring techniques exist for improving code quality. Because a few specific techniques are mentioned on this page in conjunction with metrics, this section describes how each of them is applied.
Extract Method
The extract method refactoring technique is used when fragments of code within a method can be grouped together to form a separate method <ref>http://www.refactoring.com/catalog/extractMethod.html</ref>. An example of extract method is shown below.
    public void mod2(int value) {
        boolean boolValue = true;
        if (value < 0) {
            while (value < 0) {
                value += 2;
            }
            boolValue = false;
        }
        value = value % 2;
        System.out.println("Was the value greater than zero? " + boolValue);
        System.out.println("The method returned " + value);
    }
In this simple example, the print statements could be grouped together in a new method to form the following:
    public void printInfo(boolean boolValue, int value) {
        System.out.println("Was the value greater than zero? " + boolValue);
        System.out.println("The method returned " + value);
    }
The original method, then, would become
    public void mod2(int value) {
        boolean boolValue = true;
        if (value < 0) {
            while (value < 0) {
                value += 2;
            }
            boolValue = false;
        }
        value = value % 2;
        printInfo(boolValue, value);
    }
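Because refactoring must not change behavior, it can help to compare the two versions directly. The following is a minimal sketch of such a check. The class name ExtractMethodDemo and the helper formatInfo are hypothetical: formatInfo stands in for printInfo but returns the text instead of printing it, so that the output of both versions can be compared.

```java
// Hypothetical demo class. formatInfo stands in for printInfo but returns
// the text instead of printing it, so both versions can be compared.
class ExtractMethodDemo {

    // Before the refactoring: calculation and reporting share one method.
    static String mod2Before(int value) {
        boolean boolValue = true;
        if (value < 0) {
            while (value < 0) {
                value += 2;
            }
            boolValue = false;
        }
        value = value % 2;
        return "Was the value greater than zero? " + boolValue + "\n"
                + "The method returned " + value + "\n";
    }

    // After the refactoring: the reporting fragment has been extracted.
    static String mod2After(int value) {
        boolean boolValue = true;
        if (value < 0) {
            while (value < 0) {
                value += 2;
            }
            boolValue = false;
        }
        value = value % 2;
        return formatInfo(boolValue, value);
    }

    // The extracted method: it only builds the report text.
    static String formatInfo(boolean boolValue, int value) {
        return "Was the value greater than zero? " + boolValue + "\n"
                + "The method returned " + value + "\n";
    }

    public static void main(String[] args) {
        // Both versions should produce identical text for every input tried.
        for (int v = -5; v <= 5; v++) {
            if (!mod2Before(v).equals(mod2After(v))) {
                throw new AssertionError("Behavior changed for input " + v);
            }
        }
        System.out.print(mod2After(-3));
    }
}
```

A check like this covers only the inputs it tries, so it complements rather than replaces a proper test suite.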
Extract Class
The extract class refactoring technique is used when a class can be split up into two separate classes <ref>http://www.refactoring.com/catalog/extractClass.html</ref>. Such an example is shown below.
    class Person {
        // personal details
        private String name;
        // vehicle details
        private String vehicleMake;
        private String vehicleModel;
        private int vehicleYear;
        // ... more variables, methods, etc
    }
Notice that the vehicle details in the Person class could be placed in a brand new class, like so:
    class Vehicle {
        // vehicle details
        private String vehicleMake;
        private String vehicleModel;
        private int vehicleYear;
        // ... more variables, methods, etc. specific to Vehicle class
    }
The person class would become
    class Person {
        // personal details
        private String name;
        private Vehicle car;
        // ... more variables, methods, etc
    }
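After an extract class refactoring, the original class typically reaches the moved data by delegating to the new class. The sketch below illustrates one way this might look. The constructors, the shortened field names (make, model, year), and the methods describe and summary are assumptions added for illustration and do not appear in the example above.

```java
// Extracted class: owns everything that describes a vehicle.
class Vehicle {
    private final String make;
    private final String model;
    private final int year;

    Vehicle(String make, String model, int year) {
        this.make = make;
        this.model = model;
        this.year = year;
    }

    // Hypothetical helper that formats the vehicle details.
    String describe() {
        return year + " " + make + " " + model;
    }
}

// Original class: keeps personal details and delegates vehicle questions.
class Person {
    private final String name;
    private final Vehicle car;

    Person(String name, Vehicle car) {
        this.name = name;
        this.car = car;
    }

    // Hypothetical method showing delegation to the extracted class.
    String summary() {
        return name + " drives a " + car.describe();
    }
}

class ExtractClassDemo {
    public static void main(String[] args) {
        Person person = new Person("Ada", new Vehicle("Honda", "Civic", 2012));
        System.out.println(person.summary());
    }
}
```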
Pull Up Method
The pull up method refactoring technique is used when a method existing in multiple subclasses of a class can be moved into the parent class <ref>http://www.refactoring.com/catalog/pullUpMethod.html</ref>. An example of this is shown below.
    class Car extends Vehicle {
        private float miles;
        // ... other variables, methods, etc.
        public float getMiles() {
            return miles;
        }
    }
    class Truck extends Vehicle {
        private float miles;
        // ... other variables, methods, etc.
        public float getMiles() {
            return miles;
        }
    }
In this example, the Car and Truck classes are both subclasses of the Vehicle class, and both have a method with similar structure (getMiles()). As such, using the pull up method technique, the getMiles() method can be removed from the Car and Truck classes and placed in the Vehicle class, like so:
    class Vehicle {
        private float miles;
        // ... other variables, methods, etc.
        public float getMiles() {
            return miles;
        }
    }
Note that in this particular example, the float variable of miles is also pulled into the Vehicle class, so the pull up field refactoring technique is also used <ref>http://www.refactoring.com/catalog/pullUpField.html</ref>.
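The effect of pulling the method and field up can be sketched as follows: the subclasses no longer declare getMiles() or miles, yet both still respond to getMiles() through inheritance. The drive method and the PullUpDemo class are hypothetical additions, included only so that the mileage has a way to change.

```java
// Parent class after pull up method and pull up field.
class Vehicle {
    private float miles;

    public float getMiles() {
        return miles;
    }

    // Hypothetical method (not in the example above) that changes the mileage.
    public void drive(float distance) {
        miles += distance;
    }
}

// The subclasses inherit getMiles() instead of each defining a copy.
class Car extends Vehicle {
    // ... variables and methods specific to Car
}

class Truck extends Vehicle {
    // ... variables and methods specific to Truck
}

class PullUpDemo {
    public static void main(String[] args) {
        Car car = new Car();
        car.drive(12.5f);
        Truck truck = new Truck();
        System.out.println(car.getMiles());
        System.out.println(truck.getMiles());
    }
}
```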
Metrics
A variety of metrics are used to quantify the merits of refactoring. Nearly all of these metrics can be calculated by program analysis tools. The final metric, volatility, can instead be derived from the change history kept by a version control system.
Complexity
In general, complexity is a measure of the number of branches and paths in the code.
Cyclomatic complexity, in particular, is a popular metric for measuring a method's complexity. In its simplest form, cyclomatic complexity can be thought of as adding 1 to the number of decision points within the code <ref>http://www.codeproject.com/Articles/13212/Code-Metrics-Code-Smells-and-Refactoring-in-Practi</ref>. Decision points include case labels in switch statements, loops, and if conditions. In the following example, there is one decision point (the if condition; the else branch is simply the other outcome of the same decision), so the cyclomatic complexity is 1+1=2.
    public void evaluate(boolean condition) {
        if (condition) {
            // do something
        } else {
            // do something else
        }
    }
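As a further illustration of counting decision points, the following sketch annotates each one in a slightly larger method. The class ComplexityDemo and the method countHalvings are hypothetical and exist only to give the loops and the condition something to do.

```java
class ComplexityDemo {

    // Counts how many times the positive entries can be halved (integer
    // division) before reaching 1. Three decision points, so the
    // cyclomatic complexity is 3 + 1 = 4.
    static int countHalvings(int[] values) {
        int steps = 0;
        for (int v : values) {      // decision point 1: the for loop
            if (v > 0) {            // decision point 2: the if condition
                while (v > 1) {     // decision point 3: the while loop
                    v /= 2;
                    steps++;
                }
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // 8 -> 4 -> 2 -> 1 takes 3 steps, -3 is skipped, 5 -> 2 -> 1 takes 2.
        System.out.println(countHalvings(new int[] {8, -3, 5}));
    }
}
```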
Cyclomatic complexity values are divided into tiers of risk: values from 1 to 10 are of low risk, values from 11 to 20 are of moderate risk, values from 21 to 50 are of high risk, and values greater than 50 are of extreme risk <ref>http://www.klocwork.com/products/documentation/current/McCabe_Cyclomatic_Complexity</ref>. For a method in the first tier (e.g. with a cyclomatic complexity of 10), refactoring may not be necessary. Conversely, for a method in the extreme risk tier (e.g. with a cyclomatic complexity of 2,000), throwing out the code and starting over may be the appropriate solution instead of refactoring. For a method with a cyclomatic complexity in one of the middle tiers, refactoring is likely the best option. The example below shows how refactoring can reduce the cyclomatic complexity of a method.
    public void evaluateAll() {
        for (int n = 0; n < conditionList.size(); n++) {
            if (conditionList.get(n).equals(a)) {
                // do something
            } else {
                // do something else
            }
        }
    }
Using extract method on this simple example, the original method can be reduced to the following two methods.
    public void evaluateAll() {
        for (int n = 0; n < conditionList.size(); n++) {
            evaluate(conditionList.get(n));
        }
    }
    public void evaluate(condition c) {
        if (c.equals(a)) {
            // do something
        } else {
            // do something else
        }
    }
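In this manner, the cyclomatic complexity of the original method is reduced: the loop is the only decision point left in evaluateAll, while the if-else decision now lives in evaluate. Intuitively, the higher the cyclomatic complexity of the code chosen for refactoring, the greater the potential gain in quality.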
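The risk tiers described earlier in this section can be expressed as a small lookup. This is only a sketch: the class RiskTiers, the enum Risk, and the method names are hypothetical, and the tier boundaries are assumed to be 1 to 10, 11 to 20, 21 to 50, and above 50.

```java
class RiskTiers {

    enum Risk { LOW, MODERATE, HIGH, EXTREME }

    // Maps a cyclomatic complexity value to a risk tier.
    // Assumed boundaries: 1-10 low, 11-20 moderate, 21-50 high, >50 extreme.
    static Risk classify(int cyclomaticComplexity) {
        if (cyclomaticComplexity < 1) {
            throw new IllegalArgumentException("Complexity must be at least 1");
        }
        if (cyclomaticComplexity <= 10) {
            return Risk.LOW;
        }
        if (cyclomaticComplexity <= 20) {
            return Risk.MODERATE;
        }
        if (cyclomaticComplexity <= 50) {
            return Risk.HIGH;
        }
        return Risk.EXTREME;
    }

    // Suggested action per tier, following the guidance in the text.
    static String recommend(Risk risk) {
        switch (risk) {
            case LOW:
                return "refactoring may not be necessary";
            case EXTREME:
                return "consider starting over";
            default:
                return "refactor";
        }
    }

    public static void main(String[] args) {
        int[] samples = {10, 15, 40, 2000};
        for (int value : samples) {
            Risk risk = classify(value);
            System.out.println(value + ": " + risk + ", " + recommend(risk));
        }
    }
}
```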
Duplicate Code
The metric of duplicate code is the number of times similar code structures are detected in a program. For example, although they are used to print different statements, the following code fragments contain similar structures.
System.out.println("Player position: " + player.position[0] + " " + player.position[1] + " " + player.position[2]);
System.out.println("Enemy velocity: " + enemy.velocity[0] + " " + enemy.velocity[1] + " " + enemy.velocity[2]);
If they were unified into a single method, both the reusability and the extensibility of the code would be improved. Once again using the extract method operation (or extract class, depending on the context), a new method can be created that serves both of the previous code fragments. This is shown below.
    public void printAttribute(String description, float[] attribute) {
        System.out.println(description + ": " + attribute[0] + " " + attribute[1] + " " + attribute[2]);
    }
Using this unified method in the respective locations of the original code fragments allows function to remain unchanged.
printAttribute("Player position", player.position);
printAttribute("Enemy velocity", enemy.velocity);
The extract method technique is not the only way to refactor duplicate code structures. For example, within a single class, the printAttribute method might solve local duplication issues, but duplication could still exist outside of the class. Such a method could be defined in two related classes, as shown below.
    class Enemy extends MovingObject {
        // ...non-duplicate variables for Enemy class
        // ...non-duplicate methods for Enemy class
        public void printAttribute(String description, float[] attribute) {
            System.out.println(description + ": " + attribute[0] + " " + attribute[1] + " " + attribute[2]);
        }
    }
    class Player extends MovingObject {
        // ...non-duplicate variables for Player class
        // ...non-duplicate methods for Player class
        public void printAttribute(String description, float[] attribute) {
            System.out.println(description + ": " + attribute[0] + " " + attribute[1] + " " + attribute[2]);
        }
    }
In the case of these two classes, printAttribute is shared by both. Assuming all subclasses of MovingObject contain some float array of size three, applying the pull up method refactoring technique would move printAttribute into the MovingObject class. Thus, the duplication issue would be resolved.
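The result of pulling printAttribute up might look like the sketch below. The helper formatAttribute, the sample field values, and the class DuplicateCodeDemo are hypothetical additions: formatAttribute returns the text so that it can be inspected, and printAttribute simply prints it.

```java
// Parent class now holds the single copy of the shared method.
class MovingObject {

    // Hypothetical helper: builds the text so that it can be inspected.
    public String formatAttribute(String description, float[] attribute) {
        return description + ": " + attribute[0] + " " + attribute[1] + " " + attribute[2];
    }

    public void printAttribute(String description, float[] attribute) {
        System.out.println(formatAttribute(description, attribute));
    }
}

class Enemy extends MovingObject {
    // Sample values, assumed for illustration.
    float[] velocity = {0.5f, 0.0f, -1.5f};
    // ...non-duplicate variables and methods for Enemy class
}

class Player extends MovingObject {
    // Sample values, assumed for illustration.
    float[] position = {1.0f, 2.0f, 3.0f};
    // ...non-duplicate variables and methods for Player class
}

class DuplicateCodeDemo {
    public static void main(String[] args) {
        Player player = new Player();
        Enemy enemy = new Enemy();
        // Both subclasses use the inherited method.
        player.printAttribute("Player position", player.position);
        enemy.printAttribute("Enemy velocity", enemy.velocity);
    }
}
```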
In practice, no code structure should exist in more than one location <ref>http://sourcemaking.com/refactoring/duplicated-code</ref>. As such, addressing code with the most duplicate structures yields the highest quality of refactoring.
Method and Class Length
The length of methods and classes is also used as a metric to determine the potential quality of refactoring. Unlike the previous two metrics, however, length does not directly imply a problem in code design. Rather, a long method or class likely suffers from another issue that other metrics can quantify (such as high complexity or duplicate code)<ref>http://blog.codeclimate.com/blog/2013/08/07/deciphering-ruby-code-metrics/</ref>. Method and class lengths are measured not only by the number of non-commented lines of code within each, but also by the number of members, parameters, variables, and imports <ref>http://trijug.org/downloads/refactor-metrics.pdf</ref>.
In practice, classes with more than 1,000 lines of code and methods with more than 100 lines of code should be refactored.
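A rough way to apply these thresholds is to count non-comment lines and compare the count with the limit. The sketch below is deliberately simplified and makes several assumptions: the class LengthMetric and its method names are hypothetical, only blank lines and lines beginning with // are skipped, and block comments (/* ... */) are not recognized.

```java
import java.util.Arrays;
import java.util.List;

class LengthMetric {

    static final int MAX_METHOD_LINES = 100;
    static final int MAX_CLASS_LINES = 1000;

    // Counts lines that are neither blank nor single-line comments.
    // Simplification: block comments are counted as code.
    static int countCodeLines(List<String> lines) {
        int count = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("//")) {
                count++;
            }
        }
        return count;
    }

    static boolean methodTooLong(List<String> methodLines) {
        return countCodeLines(methodLines) > MAX_METHOD_LINES;
    }

    static boolean classTooLong(List<String> classLines) {
        return countCodeLines(classLines) > MAX_CLASS_LINES;
    }

    public static void main(String[] args) {
        List<String> method = Arrays.asList(
                "int x = 1;",
                "",
                "  // explain the return value",
                "return x;");
        System.out.println(countCodeLines(method));
        System.out.println(methodTooLong(method));
    }
}
```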
Volatility
The metric of volatility is a measure of the number of changes in code over time <ref>http://www.grahambrooks.com/blog/metrics-based-refactoring-for-cleaner-code/</ref>. By itself, volatility is not a very useful metric for refactoring, but it becomes useful when plotted against complexity. The relationship between volatility and complexity is shown in the table below.
| | Low Volatility | High Volatility |
|---|---|---|
| High Complexity | Low Priority | High Priority |
| Low Complexity | No Priority | No Priority |
As shown in the table, code with low volatility and low complexity does not need to be refactored. The same is true for code with high volatility and low complexity, as there is little potential for future problems in the code. Code with high complexity but low volatility should be refactored, but it is not of the highest priority. Code that has both high volatility and high complexity, however, is assigned the highest priority.
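The priority rules above amount to a two-input decision, which can be sketched as follows. The class RefactoringPriority, the enum Priority, and the method prioritize are hypothetical. What counts as "high" complexity or volatility is left to the caller, since the text does not fix those cutoffs.

```java
class RefactoringPriority {

    enum Priority { NONE, LOW, HIGH }

    // Encodes the volatility/complexity matrix:
    // low complexity  -> no priority, regardless of volatility
    // high complexity -> low priority if rarely changed, high if often changed
    static Priority prioritize(boolean highComplexity, boolean highVolatility) {
        if (!highComplexity) {
            return Priority.NONE;
        }
        return highVolatility ? Priority.HIGH : Priority.LOW;
    }

    public static void main(String[] args) {
        boolean[] options = {false, true};
        for (boolean complexity : options) {
            for (boolean volatility : options) {
                System.out.println("highComplexity=" + complexity
                        + ", highVolatility=" + volatility
                        + " -> " + prioritize(complexity, volatility));
            }
        }
    }
}
```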
Review of Metrics in Practice
In review, the question of when and what to refactor can be answered by prioritizing the refactorings that offer the highest potential gain in quality. With regard to the metrics presented above, the priority of refactoring code should increase when:
- Methods in the code have a high cyclomatic complexity.
- There exist multiple duplicate code structures.
- Methods have more than 100 lines of code.
- Classes have more than 1,000 lines of code.
- The code possesses a high cyclomatic complexity and volatility over time.
Further Reading
References
<references/>