CSC/ECE 517 Spring 2014/ch1 1w1b np

From Expertiza_Wiki
Revision as of 03:42, 19 February 2014 by Nvnaik (talk | contribs)

Static Analysis Tools for Ruby

An important goal for software is to be bug free, yet most code that passes its tests still contains bugs. Once software passes the required tests, it is still necessary to check the code for bugs and for quality. This checking of code to improve quality is called code analysis. There are two main categories: static code analysis and dynamic code analysis. Static code analysis checks code without actually executing it. It is performed on the source code or, in some cases, on the compiled object code or some other representation of the source code. It can be done entirely by human code review, or it can be automated with the help of tools. Here are links to other pages that address the same topic: one, two, and a link to the writeup page.

Background

Why is static analysis important?

Software is prone to contain implicit or explicit bugs. Bugs can be defects in the code that could lead to failure of the software. Bugs can also be problems that make the code less readable, less maintainable, inconsistent with conventions, and so on. These problems may be more or less apparent and are not always discovered by testing. Without any form of static analysis, there is a risk that such code reaches production and causes problems later. There is also no guarantee that the code performs well in all cases, is maintainable, and is of high quality. Hence, static analysis is an important part of software development.

Why static analysis tools?

Static analysis tools automate the task of checking for problems in code. Code, especially industry software, tends to have a large number of lines. Manually reviewing all of it is tedious and subject to human shortcomings: overlooking problems, fatigue, and gaps in knowledge of the code. If this process is automated using tools, we can expect that all or most of the problems are caught and brought to the notice of the developer. We can also expect tools to be repeatable and to work on large volumes of code.

Metrics for evaluating effectiveness of static analysis tools

Why evaluate static analysis tools?

There are quite a few static analysis tools available for each language. Each of these tools finds a subset of all the bugs, and there is some overlap in the bugs found. No one tool is clearly superior or does everything that another tool does. Hence, it is important to evaluate their effectiveness, know their features, and use the one best suited to our needs. A number of metrics have been discussed for evaluating the effectiveness of bug finding tools:

  • False Positives (FP): A false positive is a bug warning that is not relevant in the context of the current code or project.
  • True Positives (TP): A true positive is a bug warning that correctly finds a bug.
  • True Negative (TN): A true negative is any line of code where no bug warning was reported and that line was found to be bug free.
  • False Negative (FN): A false negative is a line of code for which no bug warning was reported but bugs were later found.

There are other related metrics too such as:

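  • False Positive Rate (FPR) = FP/(FP+TP), the fraction of reported warnings that are false, OR FP/(FP+TN), the fraction of bug-free lines that are flagged
  • Precision = TP/(FP+TP); Precision = 1 - FPR holds only when FPR is defined as FP/(FP+TP)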
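As a minimal sketch of how these metrics relate, the following computes both variants of the false positive rate and the precision from a set of counts. The counts and the helper names are made up for illustration; they do not come from any real tool report.

```ruby
# Hypothetical counts obtained by manually triaging one tool's warnings.
Counts = Struct.new(:tp, :fp, :tn, :fn)

# FPR as a share of all reported warnings: FP / (FP + TP).
def fpr_over_warnings(c)
  c.fp.to_f / (c.fp + c.tp)
end

# FPR as a share of all bug-free lines: FP / (FP + TN).
def fpr_over_clean_lines(c)
  c.fp.to_f / (c.fp + c.tn)
end

# Precision: TP / (FP + TP).
def precision(c)
  c.tp.to_f / (c.fp + c.tp)
end

counts = Counts.new(30, 10, 950, 10)

puts fpr_over_warnings(counts)     # 10 false warnings out of 40 reported
puts fpr_over_clean_lines(counts)  # 10 flagged lines out of 960 bug-free lines
puts precision(counts)             # 30 correct warnings out of 40 reported
```

Note that precision and the first FPR variant always add up to 1, while the second variant depends on the size of the code base through TN.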

Problems with static analysis tools

Static analysis tools analyze the code without any knowledge of the constraints and assumptions of the programmer. Hence, static analysis tools generally may have the following problems:

  • The tool may show true positives, but less important defects.
  • The bugs shown may be known issues.
  • Bug warnings may be related to error conditions that will never occur.
  • The bugs reported may have little impact.

Static analysis tools for Ruby

Flay

Introduction

Flay checks for parts of code that do the same thing but on different variables, literals or constants. Such code could be replaced with one generic piece of code used for both instances. Thus, Flay is good for checking whether your code adheres to the DRY principle. Flay uses sexp_processor and ruby_parser to examine the structure of Ruby code. The sexp_processor works on the parse tree (an abstract syntax tree representation) and helps you focus on the specific parts you are interested in while ignoring the rest. Thus, Flay is able to ignore differences in literal values, names, whitespace, and programming style when comparing subtrees to identify duplicate sections of code. It is capable of detecting both exact and close matches. Flay's output is very primitive: a list of repeated code nodes, together with a weight to rank them by and the line numbers and file names where they show up. Information regarding some other tools for removing repeated code can be found on this page. Just gem install flay, and then flay *.rb to get playing with Flay.
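The idea of comparing code by structure rather than by text can be sketched as follows. This is not Flay's implementation: it assumes Ripper from the Ruby standard library in place of ruby_parser, and the helper names shape and same_shape? are hypothetical. Names, literal values and source positions are stripped from the tree so that only the structure is compared.

```ruby
require "ripper"

# Reduce a Ripper S-expression to its shape. Scanner-level tokens such as
# [:@ident, "w", [1, 9]] keep only their type, so identifiers, literal
# values and source positions no longer affect the comparison.
def shape(node)
  return node unless node.is_a?(Array)
  return [node.first] if node.first.is_a?(Symbol) && node.first.to_s.start_with?("@")

  node.map { |child| shape(child) }
end

# True when two snippets have the same tree structure.
def same_shape?(code_a, code_b)
  shape(Ripper.sexp(code_a)) == shape(Ripper.sexp(code_b))
end

area  = "def area(w, h)\n  w * h * 2\nend"
cost  = "def cost(qty, price)\n  qty * price * 100\nend"
total = "def total(a, b)\n  a + b + 1\nend"

puts same_shape?(area, cost)   # same structure, different names and literals
puts same_shape?(area, total)  # different operators, so a different structure
```

Two methods that match in this sense are candidates for one shared, parameterized method.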

Features

  • Code at all levels is checked.
  • Identical nodes are counted.
  • Ignores differences in literal values, variable, class, method names, whitespace, programming style, braces vs do/end, etc.
  • Works across files.
  • You can view a diff of code across versions.
  • Option to do conservative and liberal pruning.
  • Fuzzy duplication detection.
  • Language independent.
  • Ships with support for .rb and .erb files.
  • Checks for Rakefiles also.

To Install Flay: gem install flay

Suggestions for using Flay

  • When working with large pieces of code.
  • When trying to merge a branch into central trunk of code/product.
  • When we suspect that we are writing code that looks to be doing the same thing logically.

Flog

Introduction

Flog is a tool by Ryan Davis and Eric Hodel that scores your Ruby code based on common patterns. It works by computing your code's ABC (Assignments, Branches and Calls) metric. All code has assignments, branches, and calls. Flog's job is to check that they aren't used excessively or abused. Excessive use signifies that code is too complex and could be refactored. Each of the ABCs has a different score associated with it.

Flog Scoring

Flog parses Ruby code and builds up a structure of the code using RubyParser. A complete listing of the scores assigned can be found in the source code. A sub-list of typical code segments and their corresponding Flog scores is:

  • method call - 0.2
  • assignments - 1
  • branching (and, case, else, if, or, rescue, until, when, while) - 1
  • block - 1
  • class_eval - 5
  • define_method - 5
  • eval - 5
  • extend - 2
  • include - 2
  • inject - 2

Flog goes through every class and method and scores everything. In the end we get a complete breakdown of the scores of each class and method. These scores can give us an idea of the complexity and quality of code. We can also use this to compare different pieces of code with each other.
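To make the weights above concrete, here is a deliberately simplified sketch that multiplies each weight by the number of times a construct occurs and adds the results. The weights are copied from the list above; the tally and the method name weighted_score are made up, and Flog's real scoring is more involved than a plain weighted sum.

```ruby
# Weights copied from the list above (only a few of them are used here).
WEIGHTS = {
  method_call: 0.2,
  assignment: 1,
  branch: 1,
  block: 1,
  eval: 5
}.freeze

# Simplified score: the sum of weight * occurrences for each construct.
# This is an illustration only, not Flog's actual algorithm.
def weighted_score(tally)
  tally.sum { |kind, count| WEIGHTS.fetch(kind) * count }
end

# Hypothetical tally for one method: 5 calls, 2 assignments, 1 branch.
tally = { method_call: 5, assignment: 2, branch: 1 }
puts weighted_score(tally)

# Adding a single eval outweighs all the method calls put together.
puts weighted_score(tally.merge(eval: 1))
```

Even in this simplified form, the weights show why metaprogramming calls such as eval push a method's score up much faster than ordinary method calls.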

Example

Report from Flog looks like:

Total score = 211.720690020501
   
  WatchR#analyze_entry: (34.2)
     9.8: assignment
     7.0: branch
     4.5: mark_host_last_seen
     3.2: pattern
     2.8: []
     2.8: is_event?
     2.0: alert_type
     2.0: alert_target
     1.8: alert_msg
     1.8: notify
     1.6: event_notify?
     1.3: notify_log
     1.3: join
     1.3: split
     1.3: each
     1.3: now
     1.3: each_value
     1.3: record_host_if_unknown
     0.4: lit_fixnum
  WatchR#event_threshold_reached?: (31.6)
    21.3: []
     2.6: branch
     1.8: tv_sec
     1.6: -
     1.5: length
     1.4: >
     1.4: assignment
     1.3: >=
     1.3: mark_alert_last_seen
     1.3: delete_if

To install Flog: gem install flog.

Suggestions for using Flog

  • Use Flog when refactoring legacy code or code under development when continuous refactoring is to be done.
  • You can focus your refactoring efforts on the parts of code which have the highest scores.
  • A rule of thumb for Flog scores is that any code with a score of 40 or above should be refactored.
  • Flog scores are related to the sum of ABCs. One way to reduce the Flog score is to split the methods and make the code more cohesive.
  • To learn more about how to refactor code that received a high Flog score, refer to this page.

Reek

Introduction

Reek is a code smell detector for Ruby developed by Kevin Rutherford. Reek scans Ruby code, either source files or in-memory Class objects, looking for code smells. The different code smells which Reek can detect are:

  • Attribute - Warns if a class publishes a getter or setter for an instance variable, causing the client to become too intimate with the inner workings of the class
  • Class Variables - Warns that class variables are a part of the runtime state. Different parts of the system can inadvertently depend on other parts of the system, causing unintended consequences
  • Control Coupling - Warns when a parameter is used to determine the execution path. This is duplication since the caller knows what path should be taken
  • Data Clump - Warns when several items appear frequently together in classes or parameter lists
  • Duplication - Warns when two fragments of code look nearly identical
  • Irresponsible Module - Warns if classes or modules are not properly annotated
  • Long Method - Warns if a method has more than 5 statements. Every statement within a control structure (if, case, for, etc.) is counted as 1
  • Large Class - Warns if a class has more than a configurable number of methods or instance variables. These max values default to 25 for methods, and 9 for instance variables
  • Feature Envy - Warns if any method refers to self less often than it refers to another object
  • Uncommunicative Name - Warns if a name does not represent its intent well enough
  • Long Parameter List - Warns if a method has more than two parameters or if a method yields more than two objects to a block
  • Utility Function - Warns if a function has no dependency on the state of the instance
  • Nested Iterators - Warns if a block contains another block
  • Simulated Polymorphism - Warns if multiple conditionals test the same value throughout a class

To install Reek: gem install reek

To run it: reek file1.rb file2.rb ...

Example

Given a source file demo.rb containing:

class Dirty
  # This method smells of :reek:NestedIterators but ignores them
  def awful(x, y, offset = 0, log = false)
    puts @screen.title
    @screen = widgets.map {|w| w.each {|key| key += 3}}
    puts @screen.contents
  end
end

Reek will report the following code smells in this file:

$ reek demo.rb
  spec/samples/demo/demo.rb -- 6 warnings:
  Dirty has no descriptive comment (IrresponsibleModule)
  Dirty#awful has 4 parameters (LongParameterList)
  Dirty#awful has boolean parameter 'log' (ControlCouple)
  Dirty#awful has the parameter name 'x' (UncommunicativeName)
  Dirty#awful has the parameter name 'y' (UncommunicativeName)
  Dirty#awful has the variable name 'w' (UncommunicativeName)

Suggestions for using Reek

Reek warns about different design issues. Once you write the code for a feature, you can use Reek to refactor your code to improve your design without affecting the functionality of your program.
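As an illustration of refactoring in response to such a warning, the sketch below removes a boolean parameter of the kind described under Control Coupling. The class and method names are hypothetical, and whether Reek reports anything for a given piece of code depends on its version and configuration.

```ruby
# Before: the boolean 'verbose' parameter selects the execution path,
# which is the situation described under Control Coupling.
class ReportBefore
  def initialize(lines)
    @lines = lines
  end

  def render(verbose)
    if verbose
      @lines.map { |line| "#{line.length}: #{line}" }
    else
      @lines.dup
    end
  end
end

# After: the caller picks the method it wants, so no flag is passed around.
class ReportAfter
  def initialize(lines)
    @lines = lines
  end

  def render
    @lines.dup
  end

  def render_verbose
    @lines.map { |line| "#{line.length}: #{line}" }
  end
end

lines = ["alpha", "be"]
puts ReportAfter.new(lines).render_verbose
```

The behaviour is unchanged; only the interface is different, which is the kind of design improvement the warnings are meant to prompt.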

Roodi

Introduction

Roodi stands for Ruby Object Oriented Design Inferometer. It is similar to Reek in that it runs a list of checks over a codebase. Roodi comes with checks that ensure methods or modules comply with a naming convention, a maximum parameter count, and so on. Other checks offer advice such as avoiding for loops. The shipped checks can be configured with a YAML file, and new checks can be written easily. A checker class registers the types of AST nodes it is interested in and can then handle the matched subtrees.

The checks that Roodi performs are:

  • AssignmentInConditionalCheck - Warns if there is an assignment inside a conditional
  • CaseMissingElseCheck - Warns if a case statement does not have an else statement, thus not covering all possibilities
  • ClassLineCountCheck - Warns if the number of lines in a class is above threshold
  • ClassNameCheck - Warns if class names do not match convention
  • CyclomaticComplexityBlockCheck - Warns if the cyclomatic complexity of any block is above threshold
  • CyclomaticComplexityMethodCheck - Warns if the cyclomatic complexity of any method is above threshold
  • EmptyRescueBodyCheck - Warns if there are empty rescue blocks.
  • ForLoopCheck - Warns if for loops are used instead of Enumerable.each
  • MethodLineCountCheck - Warns if the number of lines in a method is above threshold
  • MethodNameCheck - Warns if method names do not match convention.
  • ModuleLineCountCheck - Warns if the number of lines in a module is above threshold
  • ModuleNameCheck - Warns if module names do not match convention
  • ParameterNumberCheck - Warns if the number of parameters for a method is above threshold
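A small before-and-after sketch shows the kind of code three of these checks are aimed at: ForLoopCheck, AssignmentInConditionalCheck and EmptyRescueBodyCheck. The method names are hypothetical and no Roodi output is shown, since what Roodi reports depends on its version and configuration.

```ruby
# Before: a for loop, an assignment inside a conditional,
# and a rescue block with an empty body.
def total_before(prices)
  sum = 0
  for price in prices
    begin
      if (value = Float(price))
        sum += value
      end
    rescue ArgumentError
    end
  end
  sum
end

# After: the parsing is isolated in a helper whose rescue body states
# explicitly what happens to bad input.
def parse_price(text)
  Float(text)
rescue ArgumentError
  nil # unparseable input becomes nil and is dropped by the caller
end

# After: Enumerable methods replace the for loop and the conditional.
def total_after(prices)
  prices.map { |price| parse_price(price) }.compact.sum
end

prices = ["1.5", "oops", "2"]
puts total_before(prices) == total_after(prices)
```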

To install Roodi: gem install roodi

To run Roodi: $ roodi

This will check all ruby files recursively under the current directory.

Example

Roodi's output looks like this:

rspec/lib/spec/rake/spectask.rb:152 - Block cyclomatic complexity is 11.  It should be 4 or less.
rspec/lib/spec/rake/verify_rcov.rb:37 - Block cyclomatic complexity is 6.  It should be 4 or less.
rspec/lib/spec/matchers/be.rb:57 - Method name "match_or_compare" has a cyclomatic complexity is 12.  It should be 8 or less.
rspec/lib/spec/matchers/change.rb:12 - Method name "matches?" has a cyclomatic complexity is 9.  It should be 8 or less.
rspec/lib/spec/matchers/have.rb:28 - Method name "matches?" has a cyclomatic complexity is 11.  It should be 8 or less.
rspec/lib/spec/expectations/errors.rb:6 - Rescue block should not be empty.
rspec/lib/spec/rake/spectask.rb:186 - Rescue block should not be empty.
rspec/lib/spec/expectations/differs/default.rb:20 - Method name "diff_as_string" has 21 lines.  It should have 20 or less.
rspec/lib/spec/matchers/change.rb:35 - Method name "failure_message" has 24 lines.  It should have 20 or less.
rspec/lib/spec/matchers/include.rb:31 - Method name "_message" should match pattern (?-mix:^[a-z]+[a-z0-9_]*[!\?]?$).
rspec/lib/spec/matchers/include.rb:35 - Method name "_pretty_print" should match pattern (?-mix:^[a-z]+[a-z0-9_]*[!\?]?$).

Suggestions for using Roodi

Similar to Reek, Roodi can be used for refactoring the code and improving the design. The user can configure Roodi based on project requirements, for example by setting a maximum number of parameters a method can have or a threshold for cyclomatic complexity.

Rufus

Introduction

Rufus is written by John Mettraux. Rufus uses the standard Ruby parse tree to convert code into SexpProcessor classes for evaluation. It checks for unsafe or potentially buggy code. The Rufus treechecker enables the developer to select which patterns to check for. The library allows Ruby source code to be checked before it is loaded, and it can be configured to exclude code that matches a custom pattern. Rufus is actually made up of several gems associated with the ruote open source workflow engine. These gems include:

  • rufus-decision – CSV decision tables, in Ruby
  • rufus-dollar – substituting ${stuff} in text strings
  • rufus-lru – small LRU implementation (max size based)
  • rufus-lua – Lua embedded in Ruby, via ruby FFI
  • rufus-mnemo – turning integers into easier to remember ‘words’ and vice versa
  • rufus-rtm – A Remember The Milk gem
  • rufus-scheduler – the gem formerly known as openwferu-scheduler, cron, at and every job scheduler
  • rufus-sixjo – a Rack application, RESTfully serving
  • rufus-sqs – a gem for interacting with Amazon SQS
  • rufus-tokyo – a ruby-ffi based lib for handling Tokyo Cabinet hashes (or trees)
  • rufus-treechecker – for checking untrusted code before an eval
  • rufus-verbs – the verbs of HTTP, get, post, put, delete wrapped in a Ruby gem
require 'rubygems'
require 'rufus-treechecker'

tc = Rufus::TreeChecker.new do
  exclude_fvcall :abort
  exclude_fvcall :exit, :exit!
end

tc.check("1 + 1; abort")               # will raise a SecurityError
tc.check("puts (1..10).to_a.inspect")  # OK

Suggestions for using Rufus

Use Rufus when you are concerned about the safety of code, for example to check whether it is safe before calling eval().

Saikuro

Introduction

Saikuro is a cyclomatic complexity analyzer. It analyzes code and reports the cyclomatic complexity of each method. Cyclomatic complexity [1] is a graph-based measure of the number of possible paths through the normal flow of a program. Control flow graphs are used to measure the cyclomatic complexity of a method. These graphs consist of nodes and edges: each node represents a basic statement and each edge represents a change in the control flow of the program. It is better to keep the cyclomatic complexity low so that the code is simple and easy to follow and debug. In Saikuro each method has a complexity of 1 by default. For each of the following, Saikuro adds 1 to the cyclomatic complexity:

  • conditional and looping operator
  • each when in a case
  • rescue statements
  • blocks like each

Saikuro counts the number of independent paths through the code. Saikuro can also count the number of lines per method and the number of tokens per line. Higher numbers returned indicate more complex code. Complex code is less readable, more prone to bugs, harder to maintain and hard to test for all conditions.
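The counting rules above can be approximated with a rough estimator. The sketch below is not Saikuro's implementation: it assumes Ripper.lex from the Ruby standard library, counts only keywords, and uses hypothetical names (DECISION_KEYWORDS, estimate_complexity). It ignores brace blocks and would count the optional do of a while loop twice.

```ruby
require "ripper"

# Keywords that add one path each in this rough approximation.
DECISION_KEYWORDS = %w[if elsif unless while until for when rescue do].freeze

# Start at 1 for the method itself, then add 1 per decision keyword.
# Only keyword tokens are counted, so words inside strings are ignored.
def estimate_complexity(source)
  decisions = Ripper.lex(source).count do |_pos, type, token, _state|
    type == :on_kw && DECISION_KEYWORDS.include?(token)
  end
  1 + decisions
end

SOURCE = <<~RUBY
  def classify(values)
    values.each do |v|
      if v < 0
        puts "negative"
      elsif v == 0
        puts "zero"
      else
        puts "positive"
      end
    end
  rescue TypeError
    puts "not a number"
  end
RUBY

# 1 (method) + do + if + elsif + rescue
puts estimate_complexity(SOURCE)  # prints 5
```

The else branch adds nothing because it is the path taken when the counted conditions fail.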

To install Saikuro: gem install saikuro

To run Saikuro: ~$ saikuro -c -o saikuro_output -p test.rb

Suggestions for using Saikuro

  • Code complexity doesn't always relate to bad code but code definitely can be refactored to be less complex.
  • Allow multiple returns from a method, if it improves the readability and cyclomatic complexity.
  • Try to reduce branches.
  • If necessary, perform input validation using separate helper functions.

Rubocop

Introduction

RuboCop is based on the Ruby Style Guide, which recommends best practices and standard conventions to follow while coding in Ruby. Adhering to coding standards has several benefits: the code is readable, and if everybody follows the conventions, other Ruby programmers can easily understand the code and extend it. Code that follows conventions is also easier to support. The RuboCop checks that look for stylistic problems in your code are called style cops. Most of them are based on the Ruby Style Guide. Style cops also provide configuration options that allow them to support different popular coding conventions.
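A short before-and-after sketch shows the kind of stylistic difference involved. Both versions behave the same; the second follows common Ruby Style Guide recommendations (snake_case method names, two-space indentation, no for loops, no explicit return on the last line). The method names are hypothetical and no RuboCop output is shown, since the cops that fire depend on the version and configuration.

```ruby
# Before: valid Ruby that departs from several style recommendations.
def FormatNames( names )
    result = []
    for n in names
        if !n.empty? then result << n.capitalize end
    end
    return result
end

# After: the same behaviour written in the recommended style.
def format_names(names)
  names.reject(&:empty?).map(&:capitalize)
end

names = ["ada", "", "bob"]
puts FormatNames(names) == format_names(names)
```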

To Install Rubocop: gem install rubocop

You can run Rubocop directly in your app directory.

cd your_app

rubocop

Rubocop will review all the Ruby files in your application, including your gem files. Specific files can also be checked by passing them as arguments when calling Rubocop.

rubocop app spec config/application.rb

Suggestions on using RuboCop

  • Since RuboCop only checks for stylistic problems, we should use and configure RuboCop only if it works for the coding conventions decided upon for that project or company.
  • To avoid seeing too many warnings, it is advisable to use RuboCop on individual files only.
  • RuboCop focuses on the readability aspect of code and should be done after other major parts of refactoring are done.

Metric_fu

Introduction

Metric_fu is a gem that internally uses Saikuro, Flog, Flay, Rcov, Reek, Roodi, Churn and RailsBestPractices to generate useful quality metrics about the code. With the help of these tools, it analyzes code for complexity, convention compliance, and duplicate and unused code.

Running rake metrics:all will include:

  • churn: shows which files change the most
  • coverage: shows which parts of your code are tested
  • flay: shows which parts of your code are duplicated
  • flog: shows whether your code is unnecessarily complex
  • reek: shows whether your code suffers from well-known bad practices
  • saikuro: also shows how complex your code is

To Install Metric_fu: gem install metric_fu

Suggestions for using Metric_fu

Metric_fu combines several different tools that provide reports that show which parts of your code might need extra work. So, if you are looking to find out which part of your code requires refactoring running metric_fu would be of great use. Metric_fu also provides individual results generated by each of these tools and looking at these results the user could decide which tool to run in detail to identify the problem areas in the code.

User scenarios and tools

From our study of the tools, we think that each tool has its own use. Under different circumstances, a different tool will be the most helpful. Thus, it is useful to have a mapping from scenarios to the appropriate tool for each. We would like to suggest the following mapping:

Scenario | Tool | Explanation
Beginning of coding activity | Roodi | Roodi warns against bad design and we would prefer being warned about design problems early into the development rather than later.
When first few classes or modules are being fleshed out | Rubocop | To get familiarized with the stylistic conventions. Typically best for new developers.
Quick analysis of code after basic functionality implemented and tested | Metric fu | So that we can code, test and analyze. Then depending on the tool that returned most warnings, we could focus our attention on that particular aspect of the code.
Continuous code improvement / refactoring | Reek, Rufus | To check for code smells and potential bugs.
When code is large and/or you find yourself logically repeating yourself | Flay | When code base becomes large enough and you want to keep your code DRY, Flay can help detect repetition.
Improve readability | Saikuro, Flog | To reduce code complexity and to make code readable and ease support, especially legacy code.

Benefits of static analysis tools

In conclusion, the static analysis tools mentioned above can provide the following benefits:

  • Code that adheres to standards and conventions.
  • Learning of the language and object oriented concepts from the warnings generated by tools.
  • Better readability of code.
  • Maintainable code.
  • Fewer defects.
  • Reduced code complexity.
  • Aid to refactoring.

References