CSC/ECE 517 Fall 2009/wiki3 2 pp: Difference between revisions

From Expertiza_Wiki
Revision as of 21:37, 15 November 2009

Clone detection and clone manipulation

The DRY (Don't Repeat Yourself) principle says that a particular code fragment should appear only once in a program. In practice, duplication still happens, and when it does it is useful to be able to find the duplicated code "clones", because software cloning complicates the maintenance process by giving the maintainers unnecessary code to examine. According to Burd, when presented with the challenge of adding new functionality, a programmer's natural instinct is to copy, paste and modify existing code to meet the new requirements, thereby creating a software clone. While the reasons behind such an approach are uncertain, one possible cause is the time pressure on maintainers to complete the maintenance change. Ducasse points out that “making a code fragment is simpler and faster than writing from scratch” and that if a programmer’s pay is related to the amount of code they produce, the proliferation of software clones will continue.

Once a clone is created it is effectively lost within the source code, so the copies must be maintained as separate units despite their similarities. Komondoor states that if errors are identified within one clone, it is likely that modifications will also be necessary to its counterpart clones. Detection is therefore required if the clones are to be re-identified to assist the maintenance process. If clones can be detected, the similarities can be exploited: during preventative maintenance the clones can be replaced with a single new code unit, which eliminates the problems identified above.
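As a minimal illustration of the idea above, the sketch below shows two copy-paste clones and the single unit that could replace them during preventative maintenance. All function names, the data layout and the tax multipliers are hypothetical and chosen only for the example; they do not come from any of the surveyed tools or papers.

```python
# Hypothetical clone pair: the second function was created by copying the
# first and changing a single constant.
def total_price_books(items):
    total = 0.0
    for item in items:
        total += item["price"] * item["qty"]
    return round(total * 1.05, 2)  # assumed multiplier, for illustration only


def total_price_games(items):
    total = 0.0
    for item in items:
        total += item["price"] * item["qty"]
    return round(total * 1.08, 2)  # the only line that differs from the clone


# After refactoring: one code unit, with the difference turned into a parameter.
def total_price(items, tax_multiplier):
    total = 0.0
    for item in items:
        total += item["price"] * item["qty"]
    return round(total * tax_multiplier, 2)
```

With the clones merged, an error in the summation loop only has to be fixed in one place instead of in every copy.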

A good number of clone detection tools are available, both commercially and within academia. These tools implement several different approaches to software clone detection, including string analysis, program slicing, metric analysis and abstract syntax tree comparison. This page surveys a set of clone detection tools and compares them.

Clone Detection Technique

This article focuses primarily on five established detection tools: JPlag, MOSS, Covet, CCFinder and CloneDR. JPlag and MOSS are web-based academic tools for detecting plagiarism in students' source code. CloneDR and CCFinder are standalone tools that look at code duplication in general.

Figure 1: Comparison Table.

Figure 1 summarizes the clone detection tools. For each tool it highlights the languages supported by the analysis process and the analysis approach. The column labeled "domain" indicates whether the tool's main purpose is clone detection or plagiarism detection.

CCFinder

CCFinder focuses on analyzing large-scale systems with a limited amount of language dependence. It transforms the source code into a sequence of tokens, with the aim of identifying "portions of interest (but syntactically not exactly identical structures)". Once the source has been tokenised, a token-by-token matching algorithm is run over the token sequence. CCFinder also provides a dotplotting visualisation tool that allows visual recognition of matches within large amounts of code.
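The token-based approach described above can be sketched roughly as follows. This is not CCFinder's implementation: it is a naive stand-in that assumes a tiny C-like keyword set, replaces every identifier with `ID` and every number with `NUM`, and then compares the two token sequences pairwise. The function names (`normalize`, `find_clones`, `dotplot`) and the `min_len` threshold are hypothetical.

```python
import re

# Assumed keyword set for a small C-like language (illustrative only).
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float", "void"}
# An identifier, a number, or any single non-whitespace character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def normalize(source):
    """Turn source text into tokens, hiding identifier names and literal values."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")
        elif lexeme[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(lexeme)
    return tokens


def find_clones(tokens_a, tokens_b, min_len=6):
    """Return (start_a, start_b, length) for each maximal run of equal tokens."""
    matches = []
    for i in range(len(tokens_a)):
        for j in range(len(tokens_b)):
            # Skip positions that merely continue a run started earlier.
            if i > 0 and j > 0 and tokens_a[i - 1] == tokens_b[j - 1]:
                continue
            k = 0
            while (i + k < len(tokens_a) and j + k < len(tokens_b)
                   and tokens_a[i + k] == tokens_b[j + k]):
                k += 1
            if k >= min_len:
                matches.append((i, j, k))
    return matches


def dotplot(tokens_a, tokens_b):
    """Text dotplot: one row per token of A, '*' wherever it equals a token of B."""
    return ["".join("*" if a == b else "." for b in tokens_b) for a in tokens_a]


fragment_a = "int total = 0; for (i = 0; i < n; i++) total += price[i];"
fragment_b = "int sum = 0; for (k = 0; k < m; k++) sum += cost[k];"
print(find_clones(normalize(fragment_a), normalize(fragment_b)))
```

Because identifiers and literals are normalised away, the two fragments produce the same token sequence even though their text differs, which is how a token-based detector can report clones that are not character-for-character identical. In the dotplot, such a clone shows up as a diagonal line of `*` marks. The pairwise comparison used here is quadratic and only suitable for small inputs.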

References

[1] Evaluating clone detection tools for use during preventative maintenance

[2] Investigating the maintenance implications of the replication of code

[3] A Language Independent Approach for Detecting Duplicated Code

[4] Using Slicing to Identify Duplication in Source Code

[5] Experiment on the automatic detection of function clones in a software system using metrics

[6] Clone detection using abstract syntax trees

[7] Maintenance Support Tools for JAVA Programs: CCFinder and JAAT