CSC/ECE 517 Fall 2009/wiki1a 9 pp

Research in refactoring tools

Refactoring is a very practical subject. Much of the work on refactoring environments has been done on an ad hoc basis by practicing programmers. This article briefly summarizes initiatives to improve the current generation of refactoring tools. Refactoring analysis requires serious research. Although tools for automated refactoring are now in common use, programmers have performed refactoring informally for many years, and automated refactoring is the product of years of academic research in this area. The article then summarizes the academic underpinnings of some popular refactoring methods.

Introduction

Refactoring is a process of restructuring code without changing the way it behaves. Refactoring can be semi-automated with the help of tools, such as those that are integrated into the Eclipse environment. If these tools help programmers refactor with fewer errors, in less time, and with greater overall satisfaction than refactoring by hand, then programmers are likely to adopt such tools.

While the integration of refactoring tools into many development environments has increased, the usability of these tools has remained stagnant. Specifically, when a refactoring fails, tools communicate the failure to the programmer poorly, causing the programmer to restructure slowly, conservatively, and without preserving behavior. Various techniques have been developed to improve refactoring correctness, speed, and user satisfaction, and with them the usability of refactoring tools. In the long run, better usability will aid the adoption and utilization of refactoring tools.

Improvements in the Refactoring Tools

Programmers encounter two main difficulties when trying to refactor. First, they often have trouble selecting a list of statements because of very long statements and inconsistent formatting. Second, they have trouble understanding and interpreting the error messages produced by the tools, such as “Ambiguous return value: selected block contains more than one assignment to local variable.” These error messages are generally caused by violated refactoring preconditions.

Several research efforts aim to make the tools more user friendly.

Murphy-Hill [1] proposed add-ons to Eclipse such as SelectionAssist: when the cursor is placed in the whitespace in front of a statement, a green highlight appears over the text range of that statement (Figure 1).

Eclipse Selection Assist Screen Shot
Figure 1: Eclipse SelectionAssist.

Another suggestion was Refactoring Annotations, which can tell the programmer, for example, that one parameter must be passed into the extracted method and that two values must be returned, violating an Extract Method precondition. Refactoring Annotations can also inform the programmer about precondition violations involving control flow (Figure 2).

Eclipse Refactoring Annotation Screen Shot
Figure 2: Eclipse Refactoring Annotations.

In experimental validation, Murphy-Hill [1] found that these changes significantly increase the usability of refactoring tools, improving programmers’ speed and correctness.
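The statement-level selection that SelectionAssist offers can be approximated in a few lines. The following is only a rough sketch in Python, not the Eclipse implementation: the function name `statement_at_line` is hypothetical, and it relies on the standard `ast` module to find the innermost statement that covers the cursor's line, even when that statement spans several lines.

```python
import ast


def statement_at_line(source, line):
    """Return the source text of the innermost statement covering `line` (1-based)."""
    covering = [
        node
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.stmt) and node.lineno <= line <= node.end_lineno
    ]
    if not covering:
        return None
    # The innermost statement is the one that spans the fewest lines.
    innermost = min(covering, key=lambda n: n.end_lineno - n.lineno)
    return ast.get_source_segment(source, innermost)


SOURCE = (
    "def f(a, b):\n"
    "    total = compute(a,\n"
    "                    b)\n"
    "    return total\n"
)

# The cursor sits on the continuation line of a two-line statement;
# the whole assignment is selected, not just the line under the cursor.
highlighted = statement_at_line(SOURCE, 3)
```

Selecting by syntax tree rather than by text is what sidesteps the long-statement and inconsistent-formatting problems described above.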

Usability-related research on refactoring has focused on the understandability of error messages and on helping the user select text with the mouse cursor before applying an automated refactoring transformation. However, other, more important aspects also affect the usability of existing refactoring tools. To identify areas in which usability can be improved, Mealy et al. [12] developed usability requirements for refactoring tools and used them to analyse the state of the art. The requirements are categorized into Consistency, Errors, Information Processing, User Experience, Design for the User(s), User Control, Goal Assessment, and Ease of Use. Each requirement is assessed, and the resulting usability analysis identifies possible areas for improvement.

Academic underpinnings

Lexical Analysis [4], various Parsing [5] techniques, Regular Expressions [6], and Graph Theory [7] are the four most basic components used in refactoring tools. Over the last several years these components have matured through ongoing research, and they provide an extremely strong foundation for building refactoring tools. Some refactoring methods that make use of these components are discussed below.

Extract Method

Extract Method has been recognized as one of the most important refactorings, since it decomposes large methods and can be combined with other refactorings to fix a variety of design problems. Extract Method is often viewed as a simple cut-and-paste operation, but it is not. It requires serious work: the method must be analyzed, all the temporary variables must be found, and only then can the refactoring strategy be applied. This analysis cannot be done by regular-expression manipulation alone; it requires careful work.
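To give a feel for the analysis involved, here is a minimal sketch in Python. It is not how Eclipse or any particular tool works: the function name `extract_method_interface` is hypothetical, and it assumes straight-line code made of plain assignments (no branches, loops, or augmented assignments). For a selected range of statements it reports the variables that are read before the selection assigns them (candidate parameters) and the variables the selection assigns that are still read afterwards (candidate return values).

```python
import ast
import builtins


def _names(node, ctx_type):
    """Names under `node` that appear in the given context (ast.Load or ast.Store)."""
    return {
        n.id
        for n in ast.walk(node)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ctx_type)
    }


def extract_method_interface(source, first, last):
    """Return (parameters, return_values) for extracting statements first..last.

    `first` and `last` are 0-based indexes into the top-level statements.
    Assumption: straight-line code with plain assignments only.
    """
    stmts = ast.parse(source).body
    selected, following = stmts[first:last + 1], stmts[last + 1:]

    params, assigned = [], set()
    for stmt in selected:
        # A name read before the selection assigns it must be passed in.
        for name in sorted(_names(stmt, ast.Load)):
            if (name not in assigned and name not in params
                    and not hasattr(builtins, name)):
                params.append(name)
        assigned |= _names(stmt, ast.Store)

    # A name assigned in the selection and read later (before being
    # overwritten) must be handed back to the caller.
    live_after, overwritten = set(), set()
    for stmt in following:
        live_after |= _names(stmt, ast.Load) - overwritten
        overwritten |= _names(stmt, ast.Store)

    return params, sorted(assigned & live_after)


SAMPLE = """
total = price * quantity
tax = total * 0.2
grand = total + tax
label = name
report = (grand, tax, label)
"""

# Select the two statements that compute `tax` and `grand`.
params, returns = extract_method_interface(SAMPLE, 1, 2)
print(params)   # ['total']
print(returns)  # ['grand', 'tax']
```

In a language where a method returns a single value, the two return values found here are the kind of situation behind the "Ambiguous return value" message quoted earlier.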

The main problem with existing tools and methodologies is that they extract methods only from a set of statements that the user selects in the original method. Research is under way on suggesting Extract Method opportunities automatically. Tsantalis and Chatzigeorgiou [9] proposed a methodology that automatically identifies Extract Method refactoring opportunities and presents them as suggestions to the designer of an object-oriented system. Such techniques make use of Program Slicing [2]. As defined by Weiser [3], a slice consists of all the statements in a program that may affect the value of a variable x at a specific point of interest p. The pair (p, x) is referred to as the slicing criterion. In general, slices are computed by finding sets of directly or indirectly relevant statements based on control and data dependencies. The vast majority of papers in the method-extraction literature are based on the concept of Program Slicing. Since Weiser's original definition [3], researchers have proposed several other notions of slicing, such as static and dynamic slicing, forward and backward slicing, syntax-preserving slicing, and intra-procedural slicing.
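The slicing criterion (p, x) can be illustrated with a deliberately simplified backward slice. The sketch below is in Python and rests on stated assumptions: the function name `backward_slice` is hypothetical, the program is straight-line code made of plain assignments, and only data dependences are followed. A real slicer in the sense of Weiser [3] must also follow control dependences.

```python
import ast


def _names(node, ctx_type):
    """Names under `node` that appear in the given context (ast.Load or ast.Store)."""
    return {
        n.id
        for n in ast.walk(node)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ctx_type)
    }


def backward_slice(source, point, variable):
    """Return 0-based indexes of statements that may affect `variable`
    just after statement `point` (data dependences only)."""
    stmts = ast.parse(source).body
    relevant = {variable}
    in_slice = []
    # Walk backwards from the point of interest towards the program start.
    for index in range(point, -1, -1):
        defined = _names(stmts[index], ast.Store)
        used = _names(stmts[index], ast.Load)
        if defined & relevant:
            in_slice.append(index)
            # What this statement defines is now accounted for;
            # what it reads becomes relevant in turn.
            relevant = (relevant - defined) | used
    return sorted(in_slice)


PROGRAM = """
a = 1
b = 2
c = a + 5
d = b * 2
e = c + 1
"""

print(backward_slice(PROGRAM, 4, "e"))  # [0, 2, 4]
print(backward_slice(PROGRAM, 3, "d"))  # [1, 3]
```

The two slices are disjoint, which is the kind of evidence a tool can use to suggest that the computations of `e` and `d` could be extracted into separate methods.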

Code cloning and detection

When a section of code is duplicated in more than one location among a collection of source files, that section of code and all of its duplicates are considered code clones. Clones are typically generated because a programmer copies a section of code and pastes it into other parts of the same program. Such clones can produce maintenance nightmares. In particular, if a section of code representing a clone is updated, then other sections of code representing clones from the same group may also need to be updated. This scenario creates a challenge for the maintainer to realize the existence of the clones and uniformly update all relevant sections of code. One approach to improve the maintainability of clones is the use of refactoring to eliminate the duplicate code through abstraction and improved modularity. In this case, the maintainability is improved because an update to the section of code is only required in one location. In order to determine clones that have potential for being refactored, analysis of the clones is required. Clone detection analysis makes use of:

String Matching – Represents and evaluates code using string comparisons. There are various types of string matching such as Exact String Matching, Parameterized Matching and Substring Matching.

Token Parsing – Transforms code into a single token string using language-specific constructs, finds similarities within this token string, and then transforms the token clones back into code clones for presentation.

Graph Matching – Forms a machine representation of the code as graphs and identifies clones as identical subgraphs.
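As a rough illustration of the token-based approach, the sketch below (in Python, using the standard `tokenize` module) reduces a code fragment to a normalized token sequence. The names `normalized_tokens` and `is_clone` are hypothetical, and the scheme is an assumption made for illustration: identifiers are renamed consistently in order of first appearance and literals are replaced by a placeholder, in the spirit of parameterized matching. Production clone detectors are considerably more elaborate.

```python
import io
import keyword
import tokenize

# Token types that carry layout rather than meaning.
_LAYOUT = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalized_tokens(code):
    """Return the token sequence of `code` with identifiers and literals abstracted."""
    names = {}
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type in _LAYOUT:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            # Same identifier -> same placeholder, numbered by first appearance.
            out.append(names.setdefault(tok.string, "ID%d" % len(names)))
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append("LIT")
        else:
            out.append(tok.string)
    return out


def is_clone(fragment_a, fragment_b):
    """Two fragments are treated as clones if their normalized tokens match."""
    return normalized_tokens(fragment_a) == normalized_tokens(fragment_b)


A = "total = total + price * 2  # add item\n"
B = "sum = sum + cost * 3\n"
C = "sum = cost + sum * 3\n"

print(normalized_tokens(A))  # ['ID0', '=', 'ID0', '+', 'ID1', '*', 'LIT']
print(is_clone(A, B))        # True
print(is_clone(A, C))        # False
```

Fragments `A` and `B` differ in names, literal values, and comments, so exact string matching would miss them, while fragment `C` uses the same names in a different pattern and is correctly rejected.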

Refactoring Annotations

Refactoring Annotations, as discussed earlier, could also be applied to other software engineering tools, such as compilers. For example, type error messages are notoriously hard for beginners to understand and for experts to track down.

Ongoing and Future Research

Binary Refactoring

Binary Refactoring is a software engineering technique for improving the implementation of programs without modifying their source code. Like regular refactoring, it preserves a program’s functionality, but it aims to capture modifications that are often applied to source code even though they improve only the performance of the application and not the code structure. Tilevich and Smaragdakis [10] proposed binary refactoring as a valuable software engineering technique: refactoring transformations are applied to a software application without affecting its source code. Binary refactoring transformations are thus semantics-preserving (assuming correct application) and intended for performance optimization. Nevertheless, it is more appropriate to see binary refactoring as a technique that enhances maintainability without sacrificing performance than as a technique for improving performance.

The motivation for binary refactoring is not that current programs need more optimization but that programmers already apply code structure transformations for performance reasons, and these transformations unnecessarily pollute the source code and harm its maintainability. For instance, classes are often split not for design reasons but to exploit locality: one part of the object may need to be stored or transmitted independently of the rest. This source-code refactoring is common in server-side applications (e.g., with J2EE technology) in which objects need to be stored in databases and transmitted over the network. With binary refactoring, the class structure in the program can remain intact while a split class refactoring produces the same performance benefit. As another example, source code modifications are often applied just to reduce indirection cost (e.g., by devirtualization, manual inlining, or the “remove middle man” source refactoring). In contrast, with binary refactoring all such transformations can be codified and applied through a refactoring browser without affecting the application source code. Binary refactoring is closely related both to regular software refactoring and to annotation-guided compiler optimization.

Refactoring Functional Programs

Refactorings are source-to-source program transformations which change program structure and organisation, but not program functionality. Documented in catalogues and supported by tools, refactoring provides the means to adapt and improve the design of existing code, and has thus enabled the trend towards modern agile software development processes. Refactoring has taken a prominent place in software development and maintenance, but most of this recent success has taken place in the OO and XP communities; so far no standard tools or methodologies have been developed for refactoring functional programs. Li, Reinke, and Thompson [11] explored the prospects for refactoring functional programs, taking Haskell as a concrete case study. However, the work is still at an initial stage and will require a significant amount of further research.

Language Independent Refactoring

Many other areas require a significant amount of research, such as Language Independent Refactoring. Tichelaar et al. [13] investigated the similarities between refactoring for Smalltalk and Java, derived a language-independent meta-model, and showed that it is feasible to build a language-independent refactoring engine on top of this meta-model. This could lead to a language-independent refactoring specification that language-specific libraries implement.
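The idea of a refactoring engine that operates on a meta-model rather than on source text can be sketched as follows. This is a toy illustration in Python and not the meta-model of [13]: the entity classes `Method`, `Invocation`, and `Model` and the function `rename_method` are all hypothetical, and inheritance is ignored. The point is only that the precondition check and the transformation mention no Smalltalk or Java syntax; a language-specific front end would populate the model and a back end would write the changes out.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    owner: str  # name of the declaring class
    name: str


@dataclass
class Invocation:
    caller: str        # name of the method that contains the call
    target_owner: str  # class whose method is invoked
    target_name: str   # name of the invoked method


@dataclass
class Model:
    methods: list = field(default_factory=list)
    invocations: list = field(default_factory=list)


def rename_method(model, owner, old, new):
    """Rename `owner.old` to `owner.new` in the model, updating all call sites."""
    declared = {m.name for m in model.methods if m.owner == owner}
    # Preconditions are checked on the model, not on source text.
    if old not in declared:
        raise ValueError("no method %s.%s" % (owner, old))
    if new in declared:
        raise ValueError("%s.%s already exists" % (owner, new))
    for method in model.methods:
        if method.owner == owner and method.name == old:
            method.name = new
    for call in model.invocations:
        if call.target_owner == owner and call.target_name == old:
            call.target_name = new


model = Model(
    methods=[Method("Account", "calc"), Method("Account", "close")],
    invocations=[Invocation("Bank.audit", "Account", "calc")],
)
rename_method(model, "Account", "calc", "balance")
print(model.invocations[0].target_name)  # balance
```

Attempting `rename_method(model, "Account", "balance", "close")` raises `ValueError`, because the new name would clash with an existing method.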

Use Case Refactoring

Use case models are widely used in requirements engineering to capture functional and non-functional requirements, to guide scenario-based design and validation, and to manage projects. Xu et al. [14] proposed a tool for use case development and evolution that supports the re-organization (refactoring) of use case models as well as the extension of use case models to include new functional and non-functional requirements. The tool is based on a three-level meta-model covering the context of the use case model, the structure of use cases, and the event or message-passing details of the scenario. It offers a new direction for refactoring, as it could be used during any phase of the software engineering process.

References

[1] E. Murphy-Hill, "Improving usability of refactoring tools," ACM, October 2006.

[2] Program Slicing

[3] M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, vol. 10, no. 4, pp. 352-357, 1984.

[4] Lexical Analysis

[5] Parsing

[6] Regular Expressions

[7] Graph Theory

[8] Code Cloning Detection

[9] N. Tsantalis and A. Chatzigeorgiou, "Identification of Extract Method Refactoring Opportunities," 13th European Conference on Software Maintenance and Reengineering (CSMR '09), IEEE, 24-27 March 2009, pp. 119-128.

[10] E. Tilevich and Y. Smaragdakis, "Binary refactoring: improving code behind the scenes," ACM, 2005, pp. 264-273.

[11] H. Li, C. Reinke, and S. Thompson, "Tool support for refactoring functional programs," 2003, pp. 27-38.

[12] E. Mealy, D. Carrington, P. Strooper, and P. Wyeth, "Improving Usability of Software Refactoring Tools," IEEE, April 2007, pp. 307-318.

[13] S. Tichelaar, S. Ducasse, S. Demeyer, and O. Nierstrasz, "A Meta-model for Language-Independent Refactoring," IEEE, Nov. 2000, pp. 154-164.

[14] J. Xu, W. Yu, K. Rui, and G. Butler, "Use case refactoring: a tool and a case study," IEEE, Dec. 2004, pp. 484-491.