CSC/ECE 517 Spring 2014/ch1 1w1b np

Background

What is static analysis?

An important goal for software is to be free of bugs, yet most code that passes its tests still contains some. Once software passes the required tests, it is still necessary to check the code for remaining bugs and for quality. This checking of code to improve quality is called code analysis. There are two main categories of code analysis: static code analysis and dynamic code analysis. Static code analysis is the checking of code without actually executing it. It is performed on the source code or, in some cases, on the compiled object code or some other representation of the source code. Static code analysis can be done entirely through code review by a human, or the process can be automated with the help of tools.

Why is static analysis important?

Software is prone to contain implicit or explicit bugs. Bugs can be faults in the code that could lead to failure of the software. Bugs can also be problems that make the code less readable, less maintainable, or inconsistent with conventions. Such problems may be more or less apparent and are not always discovered by testing. Without any form of static analysis, there is a risk that the code reaches production and causes problems later. There is also no guarantee that the code performs well in all cases, is maintainable, and is of high quality. Hence, static analysis is an important part of software development.

Why static analysis tools?

Static analysis tools automate the task of checking for problems in code. Code, especially industry software, tends to run to a large number of lines. Manually reviewing all of it is tedious and subject to human shortcomings: overlooked problems, fatigue, and gaps in knowledge of the code. If the process is automated using tools, on the other hand, we can expect that all or most of the problems are caught and brought to the notice of the developer. We can also expect tools to give repeatable results and to work on large volumes of code.

Metrics for evaluating effectiveness of static analysis tools

Why to evaluate static analysis tools?

Quite a few static analysis tools are available for each language. Each of these tools finds a subset of all the bugs, and there is some overlap in the bugs found. No one tool completely subsumes all the others, so there is no clear winner. In such a situation, it becomes necessary to evaluate the effectiveness of these tools. A number of metrics have been discussed for evaluating the effectiveness of bug-finding tools:

False Positive (FP): A false positive is a bug warning that isn't really a bug.
True Positive (TP): A true positive is a bug warning that correctly finds a bug and leads to a fix.
True Negative (TN): A true negative is any line of code where no bug warning was reported and that line was found to be bug free.
False Negative (FN): A false negative is a line of code for which no bug warning was reported but bugs were later found.

These lead to the definition of other related metrics:

False Positive Rate (FPR) = FP / (FP + TP), or alternatively FPR = FP / (FP + TN)
Precision = TP / (FP + TP)

Note that Precision = 1 - FPR holds only when FPR is defined as FP / (FP + TP), that is, as the share of warnings that are false.
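The metrics above can be worked through on a small example. The following is only a sketch: the Counts struct, the method names, and the numbers are hypothetical and chosen purely to exercise the formulas.

```ruby
# Hypothetical warning counts for one tool on one code base.
Counts = Struct.new(:tp, :fp, :tn, :fn, keyword_init: true)

# Precision = TP / (FP + TP): the share of warnings that are real bugs.
def precision(counts)
  counts.tp.fdiv(counts.fp + counts.tp)
end

# First FPR definition from the text: the share of warnings that are false.
def fpr_over_warnings(counts)
  counts.fp.fdiv(counts.fp + counts.tp)
end

# Second FPR definition from the text: the share of bug-free lines that were flagged.
def fpr_over_clean_lines(counts)
  counts.fp.fdiv(counts.fp + counts.tn)
end

counts = Counts.new(tp: 30, fp: 10, tn: 950, fn: 10)
puts "Precision:         #{precision(counts)}"
puts "FPR = FP/(FP+TP):  #{fpr_over_warnings(counts)}"
puts "FPR = FP/(FP+TN):  #{fpr_over_clean_lines(counts).round(4)}"
```

With these numbers the two FPR definitions differ by more than an order of magnitude, so a reported rate is only meaningful together with the definition used.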

Problems with static analysis tools

Static analysis tools in dynamically typed languages


Static analysis tools for Ruby

Flay

Flay checks for parts of code that do the same thing but on different variables, literals, or constants. Such code could have been written once, generically, and used in both places, so Flay is good for checking whether your code adheres to the DRY principle. Flay uses sexp_processor and ruby_parser to examine the structure of Ruby code. The sexp_processor works on the parse tree (an abstract syntax tree representation) and lets you focus on the specific parts you are interested in while ignoring the rest. Thus, Flay is able to ignore differences in literal values, names, whitespace, and programming style when comparing subtrees to identify duplicate sections of code [Flay]. It is capable of detecting both exact and close matches. Flay's output is very primitive: a list of repeated code nodes, together with a weight to rank them by and the line numbers and file names where they show up. To start using Flay, run gem install flay and then flay *.rb. [1]

FEATURES/PROBLEMS:
Reports differences at any level of code.
Adds a score multiplier to identical nodes.
Differences in literal values, variable, class, and method names are ignored.
Differences in whitespace, programming style, braces vs do/end, etc. are ignored.
Works across files.
Add the flay-persistent plugin to work across large/many projects.
Run --diff to see an N-way diff of the code.
Provides conservative (default) and --liberal pruning options.
Provides --fuzzy duplication detection.
Language independent: the plugin system allows other languages to be flayed.
Ships with .rb and .erb. JavaScript and others will be available separately.
Includes FlayTask for Rakefiles.
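The idea of comparing code by structure while ignoring names and literal values can be sketched with Ruby's standard Ripper parser. This is not Flay's implementation (Flay uses ruby_parser and sexp_processor and also scores and ranks matches); the helper names normalize and same_structure? are hypothetical.

```ruby
require 'ripper'

# Replace every scanner token (identifier, literal, keyword, ...) with its
# token type only, dropping the token text and its [line, column] position.
def normalize(node)
  return node unless node.is_a?(Array)
  return node.first if node.first.is_a?(Symbol) && node.first.to_s.start_with?('@')

  node.map { |child| normalize(child) }
end

# Two snippets count as structurally identical when their parse trees match
# after names, literal values, and positions have been erased.
def same_structure?(source_a, source_b)
  trees = [source_a, source_b].map do |source|
    tree = Ripper.sexp(source)
    raise ArgumentError, 'source does not parse' if tree.nil?

    normalize(tree)
  end
  trees[0] == trees[1]
end

area  = "def area(w)\n  w * 2 + 1\nend"
size  = "def size(h)\n    h * 3 + 7\nend"   # other names, literals, indentation
other = "def size(h)\n  h - 3\nend"          # genuinely different shape

puts same_structure?(area, size)
puts same_structure?(area, other)
```

Under these assumptions the first pair is reported as a duplicate and the second is not, which mirrors the kind of match Flay is described as finding.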

Flog

Flog is a tool by Ryan Davis and Eric Hodel that scores your Ruby code based on common patterns. It works by computing your code's ABC (Assignments, Branches, and Calls) metric. All code has assignments, branches, and calls; Flog's job is to check that they aren't used excessively or abused. Each of the ABCs has different scores associated with it.

INSTALL: sudo gem install flog

Flog Scoring

When Flog is given Ruby code, it parses the code and builds up an internal structure of it using RubyParser. It then goes through every class and method until everything is scored. A complete listing of all the scores assigned can be found in the source code. What you end up with is a breakdown of how each class and method scored.

Using the scoring, you can get an idea of how complex or poorly written a piece of code is compared to other code. You can focus your refactoring efforts on the parts of the code that have the highest scores. A good rule of thumb for Flog scores is that you will want to refactor, or at least think about refactoring, any method with a score of 40 or more. Since Flog scores are totals of all of the assignments, branches, and calls, a refactoring that splits a section of code in two is usually a good first step. To learn more about how to refactor code that received a high Flog score, refer to this page.
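The rule of thumb above can be illustrated with a small sketch. The weights below are hypothetical placeholders, not Flog's real per-construct scores (those are in Flog's source code), and the method names and counts are invented for the example.

```ruby
# Hypothetical weights; Flog assigns its own scores per construct.
WEIGHTS = { assignments: 1.0, branches: 1.0, calls: 1.0 }.freeze

# Rule-of-thumb threshold taken from the text.
REFACTOR_THRESHOLD = 40

# Weighted total of the assignments, branches, and calls counted in a method.
def abc_total(counts)
  WEIGHTS.sum { |kind, weight| weight * counts.fetch(kind, 0) }
end

# Returns [name, score] pairs at or above the threshold, highest score first.
def refactoring_candidates(methods)
  methods
    .map { |name, counts| [name, abc_total(counts)] }
    .select { |_name, score| score >= REFACTOR_THRESHOLD }
    .sort_by { |_name, score| -score }
end

methods = {
  'Order#checkout' => { assignments: 12, branches: 15, calls: 20 },
  'Order#total'    => { assignments: 2,  branches: 1,  calls: 4 },
  'Cart#sync'      => { assignments: 20, branches: 20, calls: 25 }
}

refactoring_candidates(methods).each do |name, score|
  puts "#{name}: #{score}"
end
```

Because the score is a total, splitting Cart#sync into two methods would roughly divide its score between them, which is why that refactoring is suggested as a first step.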

Metric_fu

metric_fu is a gem that internally uses Saikuro, Flog, Flay, Rcov, Reek, Roodi, Churn, and RailsBestPractices to generate useful quality metrics about the code. With the help of these tools it analyzes code for complexity, convention compliance, and duplicate and unused code. Running the metrics:all task will include:

churn: shows which files change the most
coverage: shows which parts of your code are tested
flay: shows which parts of your code are duplicated
flog: shows whether your code is unnecessarily complex
reek: shows whether your code suffers from well-known bad practices
saikuro: shows how complex your code is

Reek

Reek is a code smell detector for Ruby developed by Kevin Rutherford. Reek scans Ruby code (either source files or in-memory Class objects) looking for code smells.[2] The code smells that Reek can detect include:

Attribute - Warns if a class publishes a getter or setter for an instance variable, which causes clients to become too intimate with the inner workings of the class
Class Variables - Warns that class variables are part of the runtime state. Different parts of the system can inadvertently depend on other parts of the system, causing unintended consequences
Control Coupling - Warns when a parameter is used to determine the execution path. This is duplication, since the caller already knows which path should be taken
Data Clump - Warns when several items frequently appear together in classes or parameter lists
Duplication - Warns when two fragments of code look nearly identical
Irresponsible Module - Warns if classes or modules are not properly annotated
Long Method - Warns if a method has more than 5 statements. Every statement within a control structure (if, case, for, etc.) counts as 1
Large Class - Warns if a class has more than a configurable number of methods or instance variables. These maximums default to 25 for methods and 9 for instance variables
Feature Envy - Warns if a method refers to self less often than it refers to another object
Uncommunicative Name - Warns if a name does not represent its intent well enough
Long Parameter List - Warns if a method has more than two parameters or if a method yields more than two objects to a block
Utility Function - Warns if a function has no dependency on the state of the instance
Nested Iterators - Warns if a block contains another block
Simulated Polymorphism - Warns if multiple conditionals test the same value throughout a class

To install Reek: gem install reek
To run it: reek file1.rb file2.rb ...
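Two of the smells listed above are easier to recognize in code. The classes below are hypothetical and written only for illustration; whether Reek actually reports each one depends on the Reek version and its configuration.

```ruby
class ReportPrinter
  def initialize(title)
    @title = title
  end

  # Control Coupling: the boolean parameter `verbose` selects the execution
  # path, although every caller already knows which path it wants.
  def render(rows, verbose)
    if verbose
      "#{@title}: #{rows.join(', ')}"
    else
      "#{@title}: #{rows.size} rows"
    end
  end

  # Utility Function: uses no instance state, so it does not need to be an
  # instance method of this class.
  def shout(text)
    text.upcase
  end
end

# One possible fix for the control coupling: let the caller choose the method.
class SplitReportPrinter
  def initialize(title)
    @title = title
  end

  def render_full(rows)
    "#{@title}: #{rows.join(', ')}"
  end

  def render_summary(rows)
    "#{@title}: #{rows.size} rows"
  end
end

puts ReportPrinter.new('Q1').render([1, 2, 3], true)
puts SplitReportPrinter.new('Q1').render_summary([1, 2, 3])
```

Saving the first class to a file and running reek on it is a quick way to see how the installed version words its warnings.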

Roodi

Roodi is similar to Reek in that it allows you to run a list of checks over a codebase. Roodi comes with checks that ensure methods or modules comply with a naming convention, a maximum parameter count, and so on. Other checks give advice such as avoiding for loops. The shipped checks can be easily configured with a YAML file, and new checks can be easily written as well. A checker class registers the types of AST nodes it is interested in and can then handle the matched subtrees.

The checks that Roodi performs are:

AssignmentInConditionalCheck - Warns if there is an assignment inside a conditional
CaseMissingElseCheck - Warns if a case statement does not have an else clause, thus not covering all possibilities
ClassLineCountCheck - Warns if the number of lines in a class is above the threshold
ClassNameCheck - Warns if class names do not match convention
CyclomaticComplexityBlockCheck - Warns if the cyclomatic complexity of a block is above the threshold
CyclomaticComplexityMethodCheck - Warns if the cyclomatic complexity of a method is above the threshold
EmptyRescueBodyCheck - Warns if there are empty rescue blocks
ForLoopCheck - Warns if for loops are used instead of Enumerable.each
MethodLineCountCheck - Warns if the number of lines in a method is above the threshold
MethodNameCheck - Warns if method names do not match convention
ModuleLineCountCheck - Warns if the number of lines in a module is above the threshold
ModuleNameCheck - Warns if module names do not match convention
ParameterNumberCheck - Warns if the number of parameters for a method is above the threshold
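The design in which a checker registers the AST node types it cares about and then handles the matching subtrees can be sketched as follows. This is not Roodi's code or API: it uses Ruby's standard Ripper parser, and the interesting_nodes, problem?, and run_checks names are assumptions made for the example. The two classes only borrow the names of the corresponding Roodi checks.

```ruby
require 'ripper'

# Reports every `for` loop.
class ForLoopCheck
  def interesting_nodes
    [:for]
  end

  def problem?(_node)
    true
  end
end

# Reports conditionals whose condition is directly an assignment.
# (Parenthesised conditions and modifier forms are not handled in this sketch.)
class AssignmentInConditionalCheck
  def interesting_nodes
    %i[if unless while until]
  end

  def problem?(node)
    condition = node[1]
    condition.is_a?(Array) && condition.first == :assign
  end
end

# Walks the tree depth-first and offers each node to the checks registered for its type.
def visit(node, checks, warnings)
  return unless node.is_a?(Array)

  checks.each do |check|
    next unless check.interesting_nodes.include?(node.first)

    warnings << check.class.name if check.problem?(node)
  end
  node.each { |child| visit(child, checks, warnings) }
end

def run_checks(source, checks)
  tree = Ripper.sexp(source)
  raise ArgumentError, 'source does not parse' if tree.nil?

  warnings = []
  visit(tree, checks, warnings)
  warnings
end

source = <<~RUBY
  for i in [1, 2]
    puts i
  end
  if x = compute
    puts x
  end
  [1, 2].each { |n| puts n }
RUBY

puts run_checks(source, [ForLoopCheck.new, AssignmentInConditionalCheck.new])
```

Adding a new check in this sketch means writing one more small class, which is the property the paragraph on configurability describes.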

Rufus

Rufus is written by John Mettraux and uses the standard Ruby parse tree. It allows you to check Ruby code for unwanted or unsafe constructs. It uses this parse tree to convert the code into SexpProcessor classes that can then be evaluated. By using the Rufus treechecker, the developer can dictate which patterns generate errors. The library allows you to check Ruby source code before loading it; for example, loading a Ruby file that consists of a single line like exit is probably a bad idea. The library can be configured with custom patterns of code to be excluded. Rufus is actually made up of several gems that support the ruote open source workflow. These gems include:

rufus-decision – CSV decision tables, in Ruby
rufus-dollar – substituting ${stuff} in text strings
rufus-lru – small LRU implementation (max size based)
rufus-lua – Lua embedded in Ruby, via ruby FFI
rufus-mnemo – turning integers into easier to remember 'words' and vice versa
rufus-rtm – a Remember The Milk gem
rufus-scheduler – the gem formerly known as openwferu-scheduler; cron, at and every job scheduler
rufus-sixjo – a Rack application, RESTfully serving
rufus-sqs – a gem for interacting with Amazon SQS
rufus-tokyo – a ruby-ffi based lib for handling Tokyo Cabinet hashes (or trees)
rufus-treechecker – for checking untrusted code before an eval
rufus-verbs – the verbs of HTTP (get, post, put, delete) wrapped in a Ruby gem
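The idea of rejecting source code before it is loaded can be sketched with Ruby's standard Ripper parser. This is not the rufus-treechecker API; the method names and the default list of forbidden calls are hypothetical, and the sketch only catches bare calls such as exit or system "ls", not forms like Kernel.exit or send(:exit).

```ruby
require 'ripper'

# Node types under which Ripper reports a receiver-less method call.
CALL_NODES = %i[vcall fcall command].freeze

def collect_calls(node, forbidden, found)
  return unless node.is_a?(Array)

  if CALL_NODES.include?(node.first)
    name_token = node[1] # e.g. [:@ident, "exit", [1, 0]]
    if name_token.is_a?(Array) && name_token.first == :@ident && forbidden.include?(name_token[1])
      found << name_token[1]
    end
  end
  node.each { |child| collect_calls(child, forbidden, found) }
end

# Returns the forbidden method names that the source calls without a receiver.
def forbidden_calls(source, forbidden)
  tree = Ripper.sexp(source)
  raise ArgumentError, 'source does not parse' if tree.nil?

  found = []
  collect_calls(tree, forbidden, found)
  found.uniq
end

# Hypothetical default blacklist, for illustration only.
def safe_to_load?(source, forbidden = %w[exit abort system])
  forbidden_calls(source, forbidden).empty?
end

puts safe_to_load?('exit')
puts safe_to_load?('puts 1 + 2')
```

A caller would run such a check first and only pass the source on to eval or load when it comes back clean.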

Saikuro

Saikuro analyzes code and reports the cyclomatic complexity of each method in the analyzed code. Cyclomatic complexity is a graph-based measurement of the number of possible paths through the normal flow of a program. For example, a program with no branching statements has a score of one. The cyclomatic complexity of a method is calculated from its control flow graph. Such a graph consists of nodes and edges: each node represents a basic statement, and each edge represents a change in the control flow of the program. It is better to keep the cyclomatic complexity low so that the code is simple and easy to follow and debug.

Each method has a complexity of 1 by default. In addition, Saikuro adds 1 to the cyclomatic complexity for each of the following:

conditional and looping operators
each when in a case
rescue statements
blocks, like each

In this way it counts the number of independent paths through the code.[3] The higher the number returned, the more complex the code, and complex code "is more prone to error, harder to understand, harder to test, and harder to modify."[3] In addition, Saikuro counts the number of lines per method and can generate a listing of the number of tokens on each line of code.
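The counting rules above can be approximated in a few lines using Ruby's standard Ripper parser. This is a sketch, not Saikuro's implementation: the list of node types is an assumption about which constructs count as decision points, and the whole source string is treated as a single method.

```ruby
require 'ripper'

# Ripper node types treated as decision points in this sketch: conditionals,
# loops, each `when`, rescue clauses, and blocks.
DECISION_NODES = %i[
  if unless elsif while until for when rescue
  if_mod unless_mod while_mod until_mod ifop rescue_mod
  brace_block do_block
].freeze

def count_decisions(node)
  return 0 unless node.is_a?(Array)

  own = DECISION_NODES.include?(node.first) ? 1 : 0
  own + node.sum { |child| count_decisions(child) }
end

# Base complexity of 1 plus one for every decision point found.
def cyclomatic_complexity(source)
  tree = Ripper.sexp(source)
  raise ArgumentError, 'source does not parse' if tree.nil?

  1 + count_decisions(tree)
end

classify = <<~RUBY
  def classify(n)
    if n < 0
      :negative
    elsif n == 0
      :zero
    else
      :positive
    end
  end
RUBY

puts cyclomatic_complexity(classify)
```

Here the if and the elsif each add one to the base of 1, while the else adds nothing because it introduces no new decision.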

To install Saikuro: gem install saikuro

Ruby-lint

Ruby-lint is a linter and static code analysis tool for Ruby.[3] It primarily focuses on logic-related errors, such as the use of non-existing variables, instead of focusing on semantics.

Rubocop

RuboCop is a Ruby static code analyzer. Out of the box it will enforce many of the guidelines outlined in the community Ruby Style Guide.

