This page discusses several common metrics used in determining when to refactor code.
Background
The practice of code refactoring deals with changing the content or structure of code without changing its behavior. Code refactoring has become a standard programming practice, as it potentially promotes the readability, extensibility, and reusability of code.
Whether done through an IDE or by hand, refactoring a large-scale code project can prove tedious. If refactoring yields only minimal non-functional benefits, the time spent on it is wasted. Furthermore, if not done properly, refactoring can actually break the functionality of the code. In the extreme case, code may be structured so badly that starting over completely is more viable than refactoring. As such, it is important to know when and what to refactor.
Metrics
There are a variety of metrics used to quantify the merits of refactoring. Nearly all of these metrics can be calculated by static analysis tools; the final metric discussed below, volatility, can instead be derived from a version control system.
Complexity
In general, complexity is a measure of the number of branches and paths in the code.
Cyclomatic complexity, in particular, is a popular metric for measuring a method's complexity. In its simplest form, cyclomatic complexity can be thought of as the number of decision points within the code plus 1 <ref>http://www.codeproject.com/Articles/13212/Code-Metrics-Code-Smells-and-Refactoring-in-Practi</ref>. Decision points include if statements, loops, and the cases of switch statements. In the following example, there is one decision point (the if statement; the else branch does not introduce another), so the cyclomatic complexity is 1 + 1 = 2.
public void evaluate(Condition condition) {
    if (condition.equals(a)) {
        // do something
    } else {
        // do something else
    }
}
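To make the counting concrete, the sketch below estimates a method's cyclomatic complexity by scanning its body for branching constructs and adding one. This is a rough approximation under assumptions of this article, not how production analyzers work: the class name, token list, and sample body are illustrative only, and real tools build a control-flow graph rather than matching text.

// Hypothetical sketch: approximate cyclomatic complexity by counting branching
// constructs in a method body and adding one. Real analyzers build a
// control-flow graph instead of matching tokens in text.
public class ComplexityEstimator {

    // Tokens that (roughly) introduce an additional execution path.
    private static final String[] DECISION_TOKENS = {
        "if (", "for (", "while (", "case ", "catch (", "&&", "||"
    };

    public static int estimate(String methodBody) {
        int decisions = 0;
        for (String token : DECISION_TOKENS) {
            int index = methodBody.indexOf(token);
            while (index >= 0) {
                decisions++;
                index = methodBody.indexOf(token, index + token.length());
            }
        }
        return decisions + 1;   // complexity = decision points + 1
    }

    public static void main(String[] args) {
        String body = "if (condition.equals(a)) { doSomething(); } else { doSomethingElse(); }";
        System.out.println("Estimated complexity: " + estimate(body));   // prints 2
    }
}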
Cyclomatic complexity values are commonly divided into tiers of risk: values from 1 to 10 are low risk, 11 to 20 are moderate risk, 21 to 50 are high risk, and values greater than 50 are extreme risk <ref>http://www.klocwork.com/products/documentation/current/McCabe_Cyclomatic_Complexity</ref>. For a method in the first tier (e.g. with a cyclomatic complexity of 10), refactoring may not be necessary. Conversely, for a method in the extreme risk tier (e.g. with a cyclomatic complexity of 2,000), throwing out the code and starting over may be more appropriate than refactoring. For a method with a cyclomatic complexity in one of the middle tiers, refactoring is likely the best option. The example below shows how refactoring can reduce the cyclomatic complexity of a method.
public void evaluateAll() {
    for (int n = 0; n < conditionList.size(); n++) {
        if (conditionList.get(n).equals(a)) {
            // do something
        } else {
            // do something else
        }
    }
}
Using extract method on this simple example, the original method can be reduced to the following two methods.
public void evaluateAll() {
    for (int n = 0; n < conditionList.size(); n++) {
        evaluate(conditionList.get(n));
    }
}
public void evaluate(Condition c) {
    if (c.equals(a)) {
        // do something
    } else {
        // do something else
    }
}
In this manner, the complexity is split between two simpler methods: the original method had a cyclomatic complexity of 3 (the loop plus the if statement), while each of the resulting methods has a complexity of only 2. Intuitively, targeting code with higher cyclomatic complexity yields a larger improvement from refactoring.
Duplicate Code
The metric of duplicate code is the number of times similar code structures are detected in a program. For example, although they are used to print different statements, the following code fragments contain similar structures.
System.out.println("Player position: " + player.position[0] + " " + player.position[1] + " " + player.position[2]);
System.out.println("Enemy velocity: " + enemy.velocity[0] + " " + enemy.velocity[1] + " " + enemy.velocity[2]);
If they were unified into a single method, both reusability and extensibility of the code's function would be improved. Once again using the extract method operation (or extract class, depending on the context), a new method can be created to facilitate the function of both previous code fragments. This is shown below.
public void printAttribute(String description, float[] attribute) {
    System.out.println(description + ": " + attribute[0] + " " + attribute[1] + " " + attribute[2]);
}
Calling this unified method in place of the original code fragments leaves the program's behavior unchanged.
printAttribute("Player position", player.position);
printAttribute("Enemy velocity", enemy.velocity);
In practice, no code structure should exist in more than one location <ref>http://sourcemaking.com/refactoring/duplicated-code</ref>. As such, addressing the code with the most duplicated structures yields the greatest benefit from refactoring.
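As a rough illustration of how such duplication might be detected automatically, the following sketch counts identical, non-trivial source lines across a set of files. The length threshold and the line-based comparison are simplifying assumptions; real clone detectors compare token sequences or syntax trees rather than raw text.

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a naive duplicate-code detector that reports identical,
// non-trivial source lines appearing more than once across the given files.
public class DuplicateLineFinder {
    public static void main(String[] args) throws Exception {
        Map<String, Integer> counts = new HashMap<>();
        for (String file : args) {
            for (String line : Files.readAllLines(Paths.get(file))) {
                String trimmed = line.trim();
                if (trimmed.length() > 20) {          // ignore short, boilerplate lines
                    counts.merge(trimmed, 1, Integer::sum);
                }
            }
        }
        counts.forEach((line, occurrences) -> {
            if (occurrences > 1) {
                System.out.println(occurrences + " occurrences: " + line);
            }
        });
    }
}

Run against the two fragments above, this would only flag lines that are textually identical, which is exactly why practical duplicate detection works at the token or syntax-tree level instead.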
Method and Class Length
The length of both methods and classes is also used as a metric to determine the potential value of refactoring. Unlike the previous two metrics, however, the length of a method or class does not directly imply a problem in code design. Rather, greater length likely indicates another issue that can be quantified by other metrics, such as high complexity or duplicate code <ref>http://blog.codeclimate.com/blog/2013/08/07/deciphering-ruby-code-metrics/</ref>. Method and class lengths are calculated not only from the number of non-commented lines of code, but also from the number of members, parameters, and variables, as well as imports <ref>http://trijug.org/downloads/refactor-metrics.pdf</ref>.
In practice, classes with more than 1,000 lines of code and methods with more than 100 lines of code should be refactored.
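For the simplest of these measurements, counting non-commented lines of code, a minimal sketch is shown below. The comment handling is deliberately crude (it only recognizes comments that start a line), and the file path comes from the command line; production tools parse the source instead.

import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch: count non-blank, non-comment lines in a Java source file
// as a rough length metric. Block comments are handled only in the simple case
// where they begin at the start of a line.
public class LineCounter {
    public static long countCodeLines(String path) throws Exception {
        boolean inBlockComment = false;
        long count = 0;
        for (String raw : Files.readAllLines(Paths.get(path))) {
            String line = raw.trim();
            if (inBlockComment) {
                if (line.contains("*/")) {
                    inBlockComment = false;
                }
                continue;
            }
            if (line.isEmpty() || line.startsWith("//")) {
                continue;
            }
            if (line.startsWith("/*")) {
                if (!line.contains("*/")) {
                    inBlockComment = true;
                }
                continue;
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countCodeLines(args[0]) + " lines of code");
    }
}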
Volatility
The metric of volatility is a measure of the number of changes made to code over time <ref>http://www.grahambrooks.com/blog/metrics-based-refactoring-for-cleaner-code/</ref>. Although volatility by itself is not a very useful metric for guiding refactoring, it becomes valuable when plotted against complexity. The relationship between volatility and complexity is shown in the table below.
|                 | Low Volatility | High Volatility |
|-----------------|----------------|-----------------|
| High Complexity | Low Priority   | High Priority   |
| Low Complexity  | No Priority    | No Priority     |
As shown in the table, code with low volatility and low complexity does not need to be refactored. The same is true for code with high volatility and low complexity, as simple code is unlikely to cause problems even when it changes frequently. Code with high complexity but low volatility should be refactored, but it is not the highest priority. Code with both high volatility and high complexity is assigned the highest priority.
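Because volatility is derived from version control history rather than from the source itself, one rough way to obtain a per-file change count is to query the repository directly. The sketch below assumes a local Git repository and uses an illustrative file path, counting the commits that have touched the file; this count could then be plotted against the file's complexity as described above.

import java.io.BufferedReader;
import java.io.InputStreamReader;

// Hypothetical sketch: use the local Git history as a rough volatility score by
// counting how many commits have modified a given file.
public class VolatilityCheck {
    public static int commitCount(String path) throws Exception {
        Process process = new ProcessBuilder("git", "rev-list", "--count", "HEAD", "--", path).start();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            return Integer.parseInt(reader.readLine().trim());
        }
    }

    public static void main(String[] args) throws Exception {
        // The file path below is illustrative only.
        System.out.println("Commits touching Evaluator.java: " + commitCount("src/Evaluator.java"));
    }
}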
Review of Metrics in Practice
In review, the question of when and what to refactor can be answered by prioritizing the code that stands to gain the most from refactoring. With respect to the metrics presented above, the priority of refactoring code should increase when:
- Methods in the code have a high cyclomatic complexity.
- There exist multiple duplicate code structures.
- Methods have more than 100 lines of code.
- Classes have more than 1000 lines of code.
- The code possesses a high cyclomatic complexity and volatility over time.
Further Reading
A list of refactoring techniques: http://www.refactoring.com/catalog/index.html
Identifying problems in code: http://en.wikipedia.org/wiki/Code_smell
References
<references/>