CSC/ECE 517 Spring 2014/ch1 1w1b np

Background

What is static analysis?

An important goal for software is to be free of bugs, yet most code that passes its tests still contains some. Once software passes the required tests, it is still necessary to check the code for remaining bugs and for quality. This checking of code to improve quality is called code analysis. There are two main categories of code analysis: static code analysis and dynamic code analysis. Static code analysis checks code without actually executing it. It is performed on the source code or, in some cases, on the compiled object code or another representation of the source code. It can be done entirely by human code review, or it can be automated with the help of tools.

Why is static analysis important?

Software is prone to contain implicit or explicit bugs. Bugs can be faults in the code that could lead to failure of the software. Bugs can also be problems that make the code less readable, less maintainable, inconsistent with conventions, etc. These problems may be more or less apparent and are not always discovered by testing. Without any form of static analysis, there is a risk that such code reaches production and causes problems in the future. There is also no guarantee that the code performs well in all cases, is maintainable, and is of high quality. Hence, static analysis is an important part of software development.

Why static analysis tools?

Static analysis tools automate the task of checking for problems in code. Code, especially industry software, tends to run to a large number of lines. Manually reviewing all of it is tedious and subject to human shortcomings: overlooking problems, fatigue, and gaps in knowledge of the code. If the process is automated using tools, on the other hand, we can expect that all or most of the problems are caught and brought to the notice of the developer. We can also expect tools to be repeatable and to work on large volumes of code.

Metrics for evaluating effectiveness of static analysis tools

Why evaluate static analysis tools?

There are quite a few static analysis tools available for each language. Each of these tools finds a subset of all the bugs, and there is some overlap in the bugs found. No one tool is clearly superior or does everything that another tool does. Hence, it is important to evaluate their effectiveness, know their features, and use the one most suited to our needs. A number of metrics have been discussed for evaluating the effectiveness of bug finding tools:

False Positive (FP): a bug warning that is not relevant in the context of the current code or project.
True Positive (TP): a bug warning that correctly finds a bug.
True Negative (TN): a line of code for which no bug warning was reported and which was found to be bug free.
False Negative (FN): a line of code for which no bug warning was reported but in which bugs were later found.

There are other related metrics too, such as:

False Positive Rate (FPR) = FP / (FP + TP), or alternatively FP / (FP + TN)
Precision = TP / (FP + TP), which equals 1 - FPR when FPR is defined as FP / (FP + TP)
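As a rough illustration of how these ratios relate, here is a minimal Ruby sketch. The method names and the sample counts are made up for this example and do not come from any particular tool.

```ruby
# Share of reported warnings that are real bugs: TP / (FP + TP).
def precision(tp:, fp:)
  tp.to_f / (fp + tp)
end

# First FPR variant from the text: FP / (FP + TP).
def fpr_over_warnings(tp:, fp:)
  fp.to_f / (fp + tp)
end

# Second FPR variant from the text: FP / (FP + TN).
def fpr_over_clean_lines(fp:, tn:)
  fp.to_f / (fp + tn)
end

# Hypothetical counts for one tool run on one project.
tp = 30
fp = 10
tn = 60

puts precision(tp: tp, fp: fp)
puts fpr_over_warnings(tp: tp, fp: fp)
puts fpr_over_clean_lines(fp: fp, tn: tn)
```

Only the first FPR variant adds up to 1 with precision; the second uses a different denominator and has to be reported separately.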

Problems with static analysis tools

Static analysis tools analyze the code without any knowledge of the constraints and assumptions of the programmer. Hence, static analysis tools may generally have the following problems:

The tool may report true positives that are nonetheless unimportant defects.
The bugs shown may be known issues.
Bug warnings may be related to error conditions that will never occur.
The bugs reported may have little impact.

Static analysis tools in dynamically typed languages

Narration
Static code analysis in Ruby
How static analysis tools work
How they identify problems quickly
Examples

Static analysis tools for Ruby

Flay

Flay checks for parts of code that do the same thing but on different variables, literals, or constants. Such parts could be replaced with one generic piece of code used in both places, so Flay is good for checking whether your code adheres to the DRY principle. Flay uses sexp_processor and ruby_parser to examine the structure of Ruby code. The sexp_processor works on the parse tree (an abstract syntax tree representation) and lets you focus on the specific parts you are interested in while ignoring the rest. Thus, Flay is able to ignore differences in literal values, names, whitespace, and programming style when comparing subtrees to identify duplicate sections of code. It is capable of detecting both exact and close matches. Flay's output is very primitive: a list of repeated code nodes, together with a weight to rank them by and the file names and line numbers where they show up. To get started, run gem install flay and then flay *.rb. [1]
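The idea of comparing code structure while ignoring names and literal values can be sketched in a few lines of Ruby. This is not Flay's implementation: it assumes Ruby's standard-library Ripper parser instead of ruby_parser, and the helper names normalize and same_structure? are invented for this illustration.

```ruby
require 'ripper'

# Replace every token node (identifier, literal, ...) with its bare type,
# dropping the token text and its [line, column] position.
def normalize(node)
  return node unless node.is_a?(Array)

  if node.first.is_a?(Symbol) && node.first.to_s.start_with?('@')
    node.first
  else
    node.map { |child| normalize(child) }
  end
end

# Two snippets count as structural duplicates when their normalized trees match.
def same_structure?(source_a, source_b)
  normalize(Ripper.sexp(source_a)) == normalize(Ripper.sexp(source_b))
end

area   = "def area(w, h)\n  w * h + 1\nend\n"
volume = "def volume(a, b)\n  a * b + 2\nend\n"
sum    = "def sum(a, b)\n  a + b + 2\nend\n"

puts same_structure?(area, volume)
puts same_structure?(area, sum)
```

The first pair differs only in names and literal values, so the normalized trees are equal; the second pair differs in an operator, which is part of the structure and so is not ignored.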

Features

Code at all levels is checked.
Identical nodes are counted.
Ignores differences in literal values, variable, class, method names, whitespace, programming style, braces vs do/end, etc.
Works across files.
You can view a diff of the duplicated code (--diff).
Option to do conservative and liberal pruning.
Fuzzy duplication detection.
Language independent.
Ships with .rb and .erb.
Includes a FlayTask for Rakefiles.

Suggestions for using Flay

When working with large pieces of code.
When trying to merge a branch into central trunk of code/product.
When we suspect that we are writing code that looks to be doing the same thing logically.

Flog

Flog is a tool by Ryan Davis and Eric Hodel that scores your Ruby code based on common patterns. It works by computing your code's ABC (Assignments, Branches and Calls)[2] metric. All code has assignments, branches, and calls. Flog's job is to check that they aren't used excessively or abused. Excessive use signifies that the code is too complex and could be refactored. Each of the ABCs has a different score associated with it.

Flog Scoring

Flog parses Ruby code and builds up a structure of the code using RubyParser. A complete listing of the scores assigned can be found in the source code [3]. A sub-list of typical code segments and their corresponding Flog scores is:

method call - 0.2
assignments - 1
branching (and, case, else, if, or, rescue, until, when, while) - 1
block - 1
class_eval - 5
define_method - 5
eval - 5
extend - 2
include - 2
inject - 2

Flog goes through every class and method and scores everything. In the end we get a complete breakdown of the scores of each class and method. These scores can give us an idea of the complexity and quality of code. We can also use this to compare different pieces of code with each other.
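To give a feel for how per-construct scores add up, here is a deliberately simplified sketch that uses only the weights from the sub-list above. The names WEIGHTS, naive_flog_score and needs_refactoring? are hypothetical, and the plain weighted sum is an assumption made for illustration; the real tool works on the parsed code and may combine and adjust scores differently.

```ruby
# Weights copied from the sub-list above.
WEIGHTS = {
  call: 0.2, assignment: 1.0, branch: 1.0, block: 1.0,
  class_eval: 5.0, define_method: 5.0, eval: 5.0,
  extend: 2.0, include: 2.0, inject: 2.0
}.freeze

# Naive score: a plain weighted sum of how often each construct occurs.
def naive_flog_score(counts)
  counts.sum { |kind, occurrences| WEIGHTS.fetch(kind) * occurrences }
end

# Rule-of-thumb threshold of 40 (an assumption of this sketch).
def needs_refactoring?(score, threshold = 40)
  score >= threshold
end

# A hypothetical method with 5 calls, 2 assignments and 1 branch.
score = naive_flog_score(call: 5, assignment: 2, branch: 1)
puts score
puts needs_refactoring?(score)
```

Even in this naive form the weights show the intent: a single eval or define_method costs as much as five assignments, so metaprogramming-heavy methods rise to the top of the report quickly.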

Suggestions for using Flog

Use Flog when refactoring legacy code, or on code under development that is being refactored continuously.
You can focus your refactoring efforts on the parts of the code that have the highest scores.
A rule of thumb for Flog scores is that any code with a score of 40 or above should be refactored.
Flog scores are related to the sum of ABCs. One way to reduce the Flog score is to split the methods and make the code more cohesive.
To learn more about how to refactor code that received a high Flog score, refer to this page [4].

Flog can be installed using sudo gem install flog.

Metric_fu

metric_fu is a gem that internally uses Saikuro, Flog, Flay, Rcov, Reek, Roodi, Churn and RailsBestPractices to generate useful quality metrics about the code. With their help it analyzes code for complexity, convention compliance, and duplicate and unused code. Running rake metrics:all will include:

churn: shows which files change the most
coverage: shows which parts of your code are tested
flay: shows which parts of your code are duplicated
flog: shows whether your code is unnecessarily complex
reek: shows whether your code suffers from well-known bad practices
saikuro: shows how complex your code is

Reek

Reek is a code smell detector for Ruby developed by Kevin Rutherford. Reek scans Ruby code, either source files or in-memory Class objects, looking for code smells.[5] The code smells that Reek can detect are:

Attribute - Warns if a class publishes a getter or setter for an instance variable, causing the client to become too intimate with the inner workings of the class
Class Variables - Warns that class variables are a part of the runtime state. Different parts of the system can inadvertently depend on other parts of the system, causing unintended consequences
Control Coupling - Warns when a parameter is used to determine the execution path. This is duplication, since the caller already knows which path should be taken
Data Clump - Warns when several items appear frequently together in classes or parameter lists
Duplication - Warns when two fragments of code look nearly identical
Irresponsible Module - Warns if classes or modules are not properly annotated
Long Method - Warns if a method has more than 5 statements. Every statement within a control structure (if, case, for, etc.) counts as 1
Large Class - Warns if a class has more than a configurable number of methods or instance variables. These maximum values default to 25 for methods and 9 for instance variables
Feature Envy - Warns if any method refers to self less often than it refers to another object
Uncommunicative Name - Warns if a name does not represent its intent well enough
Long Parameter List - Warns if a method has more than two parameters or if a method yields more than two objects to a block
Utility Function - Warns if a function has no dependency on the state of the instance
Nested Iterators - Warns if a block contains another block
Simulated Polymorphism - Warns if multiple conditionals test the same value throughout a class
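To make two of these descriptions concrete, the sketch below shows a method of the kind the Utility Function and Feature Envy entries describe, followed by one possible refactoring. The classes Item, Cart and PricedItem are invented for this illustration, and whether a given Reek version actually reports them depends on its configuration.

```ruby
Item = Struct.new(:price, :quantity)

class Cart
  def initialize(items)
    @items = items
  end

  # Uses no state of Cart and talks only to `item`: it matches the
  # Utility Function and Feature Envy descriptions.
  def line_total(item)
    item.price * item.quantity
  end

  def total
    @items.sum { |item| line_total(item) }
  end
end

# One possible refactoring: move the behaviour to the object that owns the data.
PricedItem = Struct.new(:price, :quantity) do
  def total
    price * quantity
  end
end

puts Cart.new([Item.new(2, 3), Item.new(5, 1)]).total
puts PricedItem.new(2, 3).total
```

After the move, the cart only needs to sum item.total, and the calculation lives next to the price and quantity it depends on.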

To install Reek: gem install reek

To run it: reek file1.rb file2.rb ...

Roodi

Roodi is similar to Reek in that it lets you run a list of checks over a codebase. Roodi comes with checks that ensure methods or modules comply with a naming convention, a maximum parameter count, etc. Other checks give advice such as avoiding for loops. The shipped checks can easily be configured with a YAML file, and new checks are easy to write as well. A checker class registers the types of AST nodes it is interested in and can then handle the matched subtrees.

The checks that Roodi performs are:

AssignmentInConditionalCheck - Warns if there is an assignment inside a conditional
CaseMissingElseCheck - Warns if a case statement does not have an else clause, thus not covering all possibilities
ClassLineCountCheck - Warns if the number of lines in a class is above threshold
ClassNameCheck - Warns if class names do not match convention
CyclomaticComplexityBlockCheck - Warns if the cyclomatic complexity of a block is above threshold
CyclomaticComplexityMethodCheck - Warns if the cyclomatic complexity of a method is above threshold
EmptyRescueBodyCheck - Warns if there are empty rescue blocks
ForLoopCheck - Warns if for loops are used instead of Enumerable.each
MethodLineCountCheck - Warns if the number of lines in a method is above threshold
MethodNameCheck - Warns if method names do not match convention
ModuleLineCountCheck - Warns if the number of lines in a module is above threshold
ModuleNameCheck - Warns if module names do not match convention
ParameterNumberCheck - Warns if the number of parameters for a method is above threshold
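The mechanism of a checker registering the AST node types it cares about can be sketched as follows. This is not Roodi's code or API: the class SimpleForLoopCheck, its methods, and the walk helper are hypothetical, and the sketch assumes Ruby's standard-library Ripper parser, in which a for loop appears as a :for node.

```ruby
require 'ripper'

# A hypothetical checker: it names the node types it wants to see and
# records a warning each time one is handed to it.
class SimpleForLoopCheck
  attr_reader :warnings

  def initialize
    @warnings = []
  end

  def interesting_nodes
    [:for]
  end

  def evaluate(_node)
    @warnings << "Don't use 'for' loops. Use Enumerable.each instead."
  end
end

# Walk the whole tree and pass each node to the checks registered for its type.
def walk(node, checks)
  return unless node.is_a?(Array)

  checks.each do |check|
    check.evaluate(node) if check.interesting_nodes.include?(node.first)
  end
  node.each { |child| walk(child, checks) }
end

check = SimpleForLoopCheck.new
walk(Ripper.sexp("for i in [1, 2]\n  puts i\nend\n"), [check])
puts check.warnings
```

Because the walker knows nothing about individual rules, adding a new check in this design only means writing another small class with the same two methods.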

Rufus

Rufus is written by John Mettraux. Rufus uses the standard Ruby parse tree and converts code into SexpProcessor classes for evaluation. It checks for unsafe or potentially buggy code. The Rufus treechecker enables the developer to select which patterns to check for errors. The Rufus library allows you to check Ruby source code before loading it, and it can be configured to exclude code that matches some custom pattern. Rufus is actually made of several gems that make up the ruote open source workflow. These gems include:

rufus-decision – CSV decision tables, in Ruby
rufus-dollar – substituting ${stuff} in text strings
rufus-lru – small LRU implementation (max size based)
rufus-lua – Lua embedded in Ruby, via ruby FFI
rufus-mnemo – turning integers into easier to remember ‘words’ and vice versa
rufus-rtm – A Remember The Milk gem
rufus-scheduler – the gem formerly known as openwferu-scheduler, cron, at and every job scheduler
rufus-sixjo – a Rack application, RESTfully serving
rufus-sqs – a gem for interacting with Amazon SQS
rufus-tokyo – a ruby-ffi based lib for handling Tokyo Cabinet hashes (or trees)
rufus-treechecker – for checking untrusted code before an eval
rufus-verbs – the verbs of HTTP, get, post, put, delete wrapped in a Ruby gem

Saikuro

Saikuro[6] is a cyclomatic complexity analyzer. It analyzes code and reports the cyclomatic complexity of each method in the analyzed code. Cyclomatic complexity is a graph-based measurement of the number of possible paths through the normal flow of a program. For example, a program with no branching statements has a score of one. Control flow graphs are used to measure the cyclomatic complexity of a method. These graphs consist of nodes and edges. Each node represents a basic statement of the program and each edge represents a change in the control flow of the program. It is better to keep the cyclomatic complexity low so that the code is simple and easy to follow and debug. In Saikuro each method has a complexity of 1 by default. In addition, Saikuro adds 1 to the cyclomatic complexity for each of the following:

conditional and looping operator
each when in a case
rescue statements
blocks like each

Saikuro thus counts the number of independent paths through the code. The higher the number returned, the more complex the code, and complex code "is more prone to error, harder to understand, harder to test, and harder to modify." In addition, Saikuro counts the number of lines per method and can generate a listing of the number of tokens on each line of code.
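The control flow graph view of cyclomatic complexity can be illustrated with McCabe's formula for a single connected graph, M = E - N + 2, where E is the number of edges and N the number of nodes. The sketch below is not how Saikuro is implemented (it counts language constructs, as listed above); the method name and the example graphs are made up for illustration.

```ruby
# Cyclomatic complexity of one connected control flow graph: M = E - N + 2.
# `graph` maps each node to the list of nodes control can flow to next.
def cyclomatic_complexity(graph)
  nodes = (graph.keys + graph.values.flatten).uniq
  edges = graph.values.sum(&:size)
  edges - nodes.size + 2
end

# No branching: entry flows straight to exit.
straight_line = { entry: [:exit] }

# One if/else: the condition has two outgoing edges that rejoin at exit.
one_branch = {
  cond: [:then_branch, :else_branch],
  then_branch: [:exit],
  else_branch: [:exit]
}

# One while loop: the condition either enters the body or leaves the loop.
one_loop = { cond: [:body, :exit], body: [:cond] }

puts cyclomatic_complexity(straight_line)
puts cyclomatic_complexity(one_branch)
puts cyclomatic_complexity(one_loop)
```

The straight-line graph scores 1 and each single decision point raises the score to 2, which agrees with the counting rule of starting at 1 and adding 1 per conditional or loop.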

To install Saikuro: gem install saikuro

Example: ~$ saikuro -c -o saikuro_output -p test.rb

Rubocop

RuboCop is a Ruby static code analyzer based on the Ruby Style Guide[7]. This style guide recommends best practices so that real-world Ruby programmers can write code that can be maintained by other real-world Ruby programmers. RuboCop's purpose is to push you to write code that follows the style guide agreed on by the Ruby community. For example, it warns when a class doesn't have a top-level documentation comment, when the parentheses in a method declaration are unnecessary, and when double-quoted strings are used although there is no string interpolation.
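The three warnings mentioned can be illustrated with a small before-and-after pair. The class names are invented for this sketch, and the exact messages, as well as any further warnings, depend on the RuboCop version and configuration in use.

```ruby
# "Before" style: no documentation comment directly above the class,
# empty parentheses on the method definition, double quotes without interpolation.

class LooseGreeter
  def greeting()
    "hello"
  end
end

# Returns a fixed greeting. This comment serves as the top-level
# documentation comment for the class.
class Greeter
  def greeting
    'hello'
  end
end

puts LooseGreeter.new.greeting
puts Greeter.new.greeting
```

Both classes behave identically at runtime; the differences are purely stylistic, which is the kind of issue a style checker is meant to surface.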

To install Rubocop: gem install rubocop

You can run Rubocop directly in your app directory.

cd your_app

rubocop

Rubocop will review all the Ruby files in your application, including your gem files. If you want Rubocop to scan only a few directories, you can pass them as arguments.
