CSC/ECE 517 Fall 2010/ch4 4h sk


What is Static Code Analysis?

If one thing has proven true over time, it is that humans are fallible. We, by nature, make mistakes. As a result, our fallibility must be factored into the software we engineer and develop. Development teams discovered early on that code review was one of the most effective ways to find mistakes and bugs in their software. It proved to be a daunting task, however: assembling teams to identify problems in the code base, and training them, took large amounts of time and valuable resources. Code review and bug discovery were clearly critical areas of software development, but development teams needed a more efficient way to carry them out.
In the 1970s, Stephen Johnson at Bell Laboratories wrote a program called Lint. Lint examined the source code of C programs that had compiled without errors in order to locate bugs the compiler had not detected. With this, static code analysis was born.
Static code analysis refers to the use of tools that verify the quality and reliability of software systems quickly and efficiently by examining the code itself. Many static code analysis tools exist for many different languages; some are open source and some are commercial. Their scope and accuracy can vary considerably from one tool to the next. Some traits of these tools, however, are close to universal.

How do Static Code Analysis Tools Identify Problems?

While humans are fallible, at least we tend to be consistent about it. No matter the company or region of the world, software developers tend to fall into the same traps over and over. This creates patterns and known high-risk situations that can be identified programmatically. The key aspects of a program that nearly all static code analysis tools focus on are:
  • Input validation and representation - Given different key combinations and encodings of alphanumeric characters, which user flows leave the code open to attack?
  • API abuse - Does the caller violate the terms of the API?
  • Security features - Are the security measures used correctly? Is the encryption used valid?
  • Time and state - Are race conditions introduced as components seek to share state?
  • Errors - Are there holes left in the code base that can be exploited?
  • Code quality - Poor code quality introduces high-risk situations and unpredictable behavior. The system may be stressed in unexpected ways.
  • Encapsulation - Are the boundaries between encapsulated code structures sound?
  • Environment - Anything outside the code base that might still be a security risk.

How do Static Code Analysis Tools Identify Problems Quickly?

First and foremost, static analysis tools are efficient to use because they don't actually execute the code. Instead, they use what are known as static checkers, which construct an abstract model of the code base and then traverse it, looking for patterns and common traps that developers typically fall into. Some of the most common and most useful checks these static checkers run against the abstract model of the code base are:
  • Null Returns - Check whether functions can return an unexpected NULL and cause a segmentation fault
  • Forward Null - Check which paths lead to a NULL pointer dereference and create denial-of-service risks
  • Reverse Null - Check whether pointers are checked against NULL before being dereferenced
  • Reverse Negative - Check whether negative values are used in inappropriate places that open the program to security risks
  • Sizecheck - Check whether memory is allocated with the correct size, to prevent out-of-bounds memory errors
  • Resource Leak - Check whether memory leaks exist that can introduce performance and crash problems
  • Use After Free - Check that resources are not used again once they have been deallocated from the heap, as this may cause nondeterministic results
  • Uninit - Check whether variables are initialized before use
  • Overrun Static - Check whether there are invalid accesses to a static array, which may lead to buffer overrun security risks
  • Overrun Dynamic - Check whether there are invalid accesses to a dynamic array, which may lead to buffer overrun security risks
  • Negative Returns - Check whether a negative value returned by a function is used inappropriately, which may lead to memory corruption, crashes, infinite loops, etc.
This process is much more efficient than having to build and run the entire code base every time the development team wants to analyze its software. These tools are also quite flexible: many allow their rules and constraints to be customized for the specific needs of a particular development studio.
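To make the idea of a static checker concrete, here is a minimal sketch in Python. It is not how any particular tool is implemented. It uses the standard `ast` module to build a syntax tree of the analyzed program (a very simple stand-in for the "abstract model"), walks that tree without ever running the program, and applies a Null Returns style rule. The sample program in `SOURCE` and the names `own_returns`, `is_none`, `rule_null_returns`, and `analyze` are all hypothetical and exist only for illustration. Because `analyze` takes a list of rule functions, it also hints at how a team might plug in its own rules.

```python
import ast

# Hypothetical program under analysis. It is only parsed, never executed.
SOURCE = '''
def find_user(users, name):
    for user in users:
        if user == name:
            return user
    return None

def greet(users, name):
    return "Hello, " + find_user(users, name)
'''


def own_returns(func):
    """Yield the return statements of func, skipping nested functions."""
    stack = list(func.body)
    while stack:
        node = stack.pop()
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            continue
        if isinstance(node, ast.Return):
            yield node
        stack.extend(ast.iter_child_nodes(node))


def is_none(ret):
    """True for a bare `return` or an explicit `return None`."""
    return ret.value is None or (
        isinstance(ret.value, ast.Constant) and ret.value.value is None
    )


def rule_null_returns(tree):
    """Flag functions that return a value on some paths and None on others."""
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            returns = list(own_returns(node))
            some_none = any(is_none(r) for r in returns)
            all_none = all(is_none(r) for r in returns)
            if some_none and not all_none:
                findings.append((node.lineno, f"{node.name} may return None"))
    return findings


def analyze(source, rules):
    """Parse the source once, then run every rule over the same tree."""
    tree = ast.parse(source)  # builds the model; nothing is executed
    findings = []
    for rule in rules:
        findings.extend(rule(tree))
    return sorted(findings)


if __name__ == "__main__":
    for line, message in analyze(SOURCE, [rule_null_returns]):
        print(f"line {line}: {message}")
```

This sketch only looks at explicit `return` statements and does not follow the returned value into callers such as `greet`. Real checkers track values along execution paths, which is what lets them report the dereference itself rather than just the risky function.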

Is Static Code Analysis Important?

In a word, yes! As the software industry has progressed, system designs have grown more complex and product functionality richer. It has been shown that correcting a bug at the construction stage is much less expensive than correcting it at the testing phase. It has also been shown that static analysis tools reduce software defects by a factor of six and detect 60% of possible post-release failures. Some tests have even shown that up to 91% of errors can be removed from source code using static analysis.
One very important area where static analysis excels is data flow analysis, in which the static checkers trace how data, including input supplied by users, flows through the software. One purpose of this is vulnerability checking. Hackers and abusive software are a fact of life in today's world, and common exploits such as buffer overflow attacks and SQL injection should be a concern for every developer. Static code analysis is a way to discover whether these vulnerabilities exist in the code base. These tools are important not only for discovering bugs and pointing out vulnerabilities; they can also analyze paths in which memory is allocated but never freed, which lets the development team know whether there are memory leaks in their software. Other key uses for static analysis tools include detecting potential program crashes and concurrency problems.
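As a rough illustration of data flow analysis used for vulnerability checking, the following Python sketch marks values that come from user input as "tainted", propagates that mark through assignments, and reports any call where tainted data reaches the first argument of a SQL `execute` call. Everything here is an assumption made for the example: the sample program in `SOURCE`, the choice of `input` as the only taint source, `execute` as the only sink, and the function names. It handles only straight-line, top-level code, so it is far simpler than what a production tool does.

```python
import ast

# Hypothetical program under analysis. It is parsed, never run, so the
# undefined name `cursor` is not a problem.
SOURCE = '''
name = input("user name: ")
query = "SELECT * FROM users WHERE name = '" + name + "'"
cursor.execute(query)
cursor.execute("SELECT * FROM users WHERE name = ?", (name,))
'''

TAINT_SOURCES = {"input"}  # assumed: calls whose result is user-controlled
SQL_SINKS = {"execute"}    # assumed: methods that run their first argument as SQL


def is_tainted(expr, tainted):
    """True if expr reads a tainted variable or calls a taint source."""
    for node in ast.walk(expr):
        if isinstance(node, ast.Name) and node.id in tainted:
            return True
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in TAINT_SOURCES):
            return True
    return False


def find_sql_injection(source):
    """Return line numbers where tainted data reaches a SQL sink."""
    tree = ast.parse(source)
    tainted = set()
    findings = []
    for stmt in tree.body:  # top-level statements, in program order
        # Report sinks whose first argument is built from tainted data.
        for node in ast.walk(stmt):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Attribute)
                    and node.func.attr in SQL_SINKS
                    and node.args
                    and is_tainted(node.args[0], tainted)):
                findings.append(node.lineno)
        # Propagate (or clear) taint through simple assignments.
        if isinstance(stmt, ast.Assign):
            value_tainted = is_tainted(stmt.value, tainted)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if value_tainted:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
    return findings


if __name__ == "__main__":
    for line in find_sql_injection(SOURCE):
        print(f"line {line}: user input may reach a SQL statement")
```

In the sample program, the first `execute` call is reported because the query string is concatenated from user input, while the parameterized call is not, since its SQL text is a constant and the user's value is passed separately.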
To give a concrete example of the importance of these tools, consider medical software. Medical software is rapidly growing in sophistication, to the point where the FDA has identified static code analysis as a means of improving the quality and reliability of software across the medical profession.

Are There Drawbacks to Using Static Code Analysis?

Static Code Analysis in Dynamically Typed Languages

Static Code Analysis in Ruby

Conclusions

References