CSC/ECE 517 Spring 2013/ch1b 1m yk

From Expertiza_Wiki
Revision as of 01:40, 21 February 2013 by Ykolesn (talk | contribs)

Continuous Integration

Definition

"We keep our code ready to ship"

Continuous Integration is a software development practice in which members of a team integrate their work frequently. This approach significantly reduces integration problems and allows a team to develop cohesive software more rapidly.

Introduction

Continuous integration (CI) was first proposed as part of extreme programming. It aims to prevent costly integration problems through the daily integration of developer code. CI was originally seen as an extension of test-driven development, where all unit tests would be run before code was integrated into the mainline. It has since evolved into a system where daily integration of code is supported by heavy use of automated testing and by build servers that report results back to developers.

How does CI work?

The diagram above shows an overview of a simple CI environment, which gives a good idea of how CI works.<ref name="CIRef2">Thomas Jaspers. "Continuous Integration – Overview" http://blog.codecentric.de/en/2009/11/</ref> To apply quality control throughout the project, CI requires that each member of the team integrate their work into the main code repository at least daily, which across the team leads to multiple integrations per day. Each developer first builds and runs all the tests locally to ensure the changes are good, then commits the changes to the repository using a version control tool such as Subversion. Once the CI server detects the new commits, it kicks off an automated build that runs the automated tests, so that compile and integration errors are detected as quickly as possible. If the automated build fails, the team is notified and required to resolve the error immediately. In addition, at least once every 24 hours, a nightly build is executed to ensure the quality of the code in the main repository.<ref name="CIRef5">Laurie Williams. "Scrum + Engineering Practices: Experiences of Three Microsoft Teams" http://collaboration.csc.ncsu.edu/laurie/Papers/ESEM11_SCRUM_Experience_CameraReady.pdf</ref>
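The cycle described above (commit, detect, build, notify) can be pictured with a minimal Python sketch. `Repository`, `CIServer`, `build_and_test` and `notify` are hypothetical names introduced only for illustration; a real CI server watches an actual version control system rather than an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Repository:
    """Hypothetical stand-in for a version control repository."""
    commits: List[str] = field(default_factory=list)

    def commit(self, revision: str) -> None:
        self.commits.append(revision)


@dataclass
class CIServer:
    """Hypothetical CI server that builds every revision it has not seen yet."""
    repo: Repository
    build_and_test: Callable[[str], bool]  # returns True when the build passes
    notify: Callable[[str], None]          # how the team hears about failures
    last_seen: int = 0                     # number of commits already built

    def poll(self) -> List[str]:
        """Build each new commit in order and return the revisions that failed."""
        failed = []
        for revision in self.repo.commits[self.last_seen:]:
            if not self.build_and_test(revision):
                failed.append(revision)
                self.notify(f"Build failed for {revision}; fix it immediately")
        self.last_seen = len(self.repo.commits)
        return failed
```

Calling `poll()` after two commits, one of which breaks the build, returns only the failing revision and sends one notification; a second call with no new commits returns an empty list.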

Basic Practices to follow

Maintain a single source repository

Software projects involve lots of files that need to be orchestrated together to build a product. Keeping track of all of these is a major effort, particularly when there are multiple people involved. The tools that manage this - called Source Code Management tools, configuration management, version control systems, repositories, or various other names - are an integral part of most development projects. The foundation of a Continuous Integration system is a good source control management system that tracks and controls all of the files needed to build a product.

You must put everything required for a build in the source control system. One of the features of version control systems is that they allow you to create multiple branches to handle different streams of development. This is a useful feature, but it is frequently overused and gets people into trouble. Keep your use of branches to a minimum. In particular, have a mainline: a single branch of the project currently under development. Pretty much everyone should work off this mainline most of the time.

A comparison of the common version control systems

Automate the build

Getting the sources turned into a running system can often be a complicated process involving compilation, moving files around, loading schemas into the databases, and so on. However, like most tasks in this part of software development, it can be automated, and as a result should be automated. Asking people to type in strange commands or click through dialog boxes is a waste of time and a breeding ground for mistakes.

To get an efficient Continuous Integration system you need to implement an automatic build process. The details of this step depend upon the implementation language and development environment and may involve compiling and linking and/or other processes that produce the build artifacts, such as the program executable.
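As a rough illustration, an automated build can be modelled as an ordered list of steps that stops at the first failure. The sketch below is an assumption-laden simplification: `run_build` and the step names are hypothetical, and each step is a plain Python callable where a real build would invoke a compiler, linker, or packaging tool.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_build(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run the steps in order and stop at the first failure.

    Returns (success, names of the steps that were executed).
    """
    executed = []
    for name, action in steps:
        executed.append(name)
        if not action():
            # Later steps depend on earlier ones, so do not continue.
            return False, executed
    return True, executed
```

Because every run executes the same steps in the same order, nobody has to remember or retype the individual commands.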

A comparison of the common build automation systems

Make the build self-testing

Traditionally a build means compiling, linking, and all the additional stuff required to get a program to execute. A program may run, but that doesn't mean it does the right thing. Modern statically typed languages can catch many bugs, but far more slip through that net. A good practice that helps you detect errors quickly is to include automated tests in your build process. If possible, execute a subset of the system tests as part of the build, so that you know the build is suitable before committing resources to the full scope of system testing. While the level of testing may vary, focus on gaining confidence that the increment is of sufficient quality to establish a baseline for system testing. If any test fails, the build should also fail.

List of some common test automation tools

Everyone commits to the baseline every day

Integration is primarily about communication. Integration allows developers to tell other developers about the changes they have made. Frequent communication allows people to learn about changes quickly as they develop. A good practice is to require developers to commit their changes to the main development stream at least once every day. Before delivering the files, each developer must verify that his or her working copy is consistent with the main development stream, resolve any conflicts, run a local build and its tests, and, only if these pass, commit the changes to the main development stream.

Every Commit Should Build the Mainline on an Integration Machine

Although developers must run builds and tests on their local machines before delivering to the main development stream, each developer's machine may differ, and code that works in isolation may still fail when integrated. This is why it is very important to run an integration build on an integration machine each time a developer commits changes. The developer who delivers the changes is responsible for monitoring this integration build and fixing it if there is any problem. Working this way ensures that errors can be found and solved easily and quickly. Developers are expected to commit to fixing problems with the build quickly, resulting in a successful build every day. There are three approaches to triggering the integration build:

  • To let the developer manually request a build execution after she has committed some new code to the stream.
  • To ensure that an automatic build will be executed after a code commit to the stream (uses a continuous integration server).
  • To use a scheduler so that automatic builds are executed on a daily basis, a weekly basis, etc. (uses a continuous integration server).
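The three approaches above can be sketched as a single decision function. This is only an illustration: the `Trigger` enumeration, the `should_build` function and its parameters are hypothetical names, and a real continuous integration server exposes these choices through its own configuration.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Trigger(Enum):
    MANUAL = "manual"        # a developer requests the build
    ON_COMMIT = "on_commit"  # the server builds after every commit
    SCHEDULED = "scheduled"  # the server builds at a fixed interval


def should_build(trigger: Trigger, *, requested: bool = False, new_commits: int = 0,
                 last_build: Optional[datetime] = None, now: Optional[datetime] = None,
                 interval: timedelta = timedelta(days=1)) -> bool:
    """Decide whether a build should start under the given trigger policy."""
    if trigger is Trigger.MANUAL:
        return requested
    if trigger is Trigger.ON_COMMIT:
        return new_commits > 0
    # Scheduled: build if no build has run yet or the interval has elapsed.
    if last_build is None:
        return True
    current = now if now is not None else datetime.now()
    return current - last_build >= interval
```

The manual policy depends on someone remembering to ask, while the other two rely on the server, which is why they are the ones that require a continuous integration server.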

List of some popular continuous integration servers

Keep the build fast

It is important to minimize the time spent running the integration build each time a developer commits changes to the main development stream. This is not always possible, because complete integration builds can take a lot of time and resources. If this is your problem, a good practice is to split your integration build into stages: create a commit build that compiles and verifies the absence of critical errors each time a developer commits changes to the main development stream, and create one or more secondary builds to run the slower and less important tests. You need to ensure that commit builds are always successful, but a failure in a secondary build need not be critical enough to stop everything. This allows your developers to continue working while these secondary errors are fixed.
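A small sketch of this staging idea follows, under the assumption that each check is a callable returning `True` on success. `run_staged_build` and the check names are hypothetical; the point is only that a commit-stage failure breaks the build while secondary failures are recorded without stopping anything.

```python
from typing import Callable, List, Tuple

# A check is a (name, action) pair; the action returns True on success.
Check = Tuple[str, Callable[[], bool]]


def run_staged_build(commit_stage: List[Check],
                     secondary_stage: List[Check]) -> Tuple[bool, List[str]]:
    """Return (commit_build_ok, names of secondary checks that failed)."""
    for _name, check in commit_stage:
        if not check():
            # A commit-stage failure is critical: stop and report a broken build.
            return False, []
    # Secondary failures are recorded but do not break the commit build.
    secondary_failures = [name for name, check in secondary_stage if not check()]
    return True, secondary_failures
```

In practice the secondary stage would usually run later and on a different machine; running it inline here just keeps the sketch short.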

Test in a clone of the production environment

The point of testing is to flush out, under controlled conditions, any problem that the system will have in production. A significant part of this is the environment within which the production system will run. If you test in a different environment, every difference results in a risk that what happens under test won't happen in production.

As a result, you want your test environment to mimic your production environment as closely as possible. Use the same database software, with the same versions, and the same version of the operating system. Put all the appropriate libraries that are in the production environment into the test environment, even if the system doesn't actually use them.
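One way to make such differences visible is to record the component versions of each environment and compare them. The sketch below assumes the versions have already been collected into dictionaries; `environment_differences` and the component names in the usage note are hypothetical.

```python
from typing import Dict, Optional, Tuple


def environment_differences(
    production: Dict[str, str], test: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each differing component to (production version, test version).

    A component that is missing from one environment is reported with None.
    """
    differences = {}
    for component in sorted(set(production) | set(test)):
        prod_version = production.get(component)
        test_version = test.get(component)
        if prod_version != test_version:
            differences[component] = (prod_version, test_version)
    return differences
```

For example, if production lists a library that the test environment lacks, the result contains that library with `None` as the test version; an empty result means the recorded versions match.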

Make it easy for anyone to get the latest executable

Most builds produce useful output, such as an executable program, a packaged zip file, or other artifacts. Many people may need access to the latest executable, either to run it or just to see what changed last week. Developers often cannot find it because there is no well-known place where these files are stored, and they spend many hours just looking for it. It is essential to make sure there is a well-known place where people can find the latest executable. It may be useful to put several executables in such a store. For the very latest, publish the most recent executable that passed the commit tests; such an executable should be pretty stable provided the commit suite is reasonably strong.
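The idea of one well-known store that can always answer "which is the latest executable that passed the commit tests?" can be sketched as follows. `Artifact` and `ArtifactStore` are hypothetical names, and the store is kept in memory here, whereas a real one would be a shared directory or a repository server.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Artifact:
    build_number: int
    path: str
    passed_commit_tests: bool


@dataclass
class ArtifactStore:
    """Hypothetical single place where every build publishes its output."""
    artifacts: List[Artifact] = field(default_factory=list)

    def publish(self, artifact: Artifact) -> None:
        self.artifacts.append(artifact)

    def latest_stable(self) -> Optional[Artifact]:
        """Return the newest artifact that passed the commit tests, if any."""
        stable = [a for a in self.artifacts if a.passed_commit_tests]
        return max(stable, key=lambda a: a.build_number, default=None)
```

Keeping failed builds in the store as well means several executables are available, while `latest_stable()` still points everyone at the one that is safe to run.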

Everyone can see the results of the latest build

Communication is critical to implementing a good Continuous Integration system. Each team member needs easy and transparent access to the state of the system, the latest changes made, and the state of the mainline integration build. Visibility and transparency in the flow of information between team members are essential.

Automate deployment

To do Continuous Integration you need multiple environments: one to run commit tests, and one or more to run secondary tests. Since you are moving executables between these environments multiple times a day, you'll want to do this automatically. So it's important to have scripts that allow you to deploy the application into any environment easily.

A natural consequence of this is that you should also have scripts that allow you to deploy into production with similar ease. You may not be deploying into production every day (although some projects do), but automatic deployment helps both to speed up the process and to reduce errors. It's also a cheap option, since it uses the same capabilities that you use to deploy into test environments.
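The point that production deployment reuses the same script as the test environments can be sketched with a single function parameterized by environment name. Everything here is hypothetical: the `ENVIRONMENTS` table, its placeholder hosts and directories, and the step descriptions, which stand in for the real copy and restart commands of a given project.

```python
from typing import Dict, List

# Hypothetical environment table; hosts and directories are placeholders.
ENVIRONMENTS: Dict[str, Dict[str, str]] = {
    "commit-test": {"host": "commit-test.example", "directory": "/srv/app"},
    "secondary-test": {"host": "secondary-test.example", "directory": "/srv/app"},
    "production": {"host": "prod.example", "directory": "/srv/app"},
}


def deployment_plan(artifact: str, environment: str) -> List[str]:
    """Return the ordered steps for deploying artifact to environment."""
    try:
        target = ENVIRONMENTS[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment}") from None
    # The same steps apply to every environment; only the target differs.
    return [
        f"copy {artifact} to {target['host']}:{target['directory']}",
        f"stop application on {target['host']}",
        f"unpack {artifact} in {target['directory']}",
        f"start application on {target['host']}",
    ]
```

Because the plan for `"production"` differs from the plan for `"commit-test"` only in the target, the production path is exercised many times a day before it is ever used for a release.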

Prefer synchronous integration, in which you wait for the integration to succeed, to asynchronous integration, in which a tool tests the integration for you. Synchronous integration requires fast builds, but ensures that they never break.

Advantages

Reduced risks. By integrating many times a day, you can reduce risks on your project. Doing so facilitates the detection of defects, the measurement of software health and a reduction of assumptions.

  • Defects are detected and fixed sooner.
  • Health of software is measurable.
  • Assumptions are reduced.

Continuous Integration provides a safety net to reduce the risk that defects will be introduced into the code base. The following are some of the risks that Continuous Integration helps to mitigate:

  • Lack of cohesive, deployable software
  • Late defect discovery
  • Low-quality software
  • Lack of project visibility

Reduced repetitive manual processes. Reducing repetitive processes saves time, costs and effort. These repetitive processes can occur across all project activities, including code compilation, database integration, testing, inspection, deployment and feedback. By automating Continuous Integration, you have a greater ability to ensure all of the following.

  • The process runs the same way every time.
  • An ordered process is followed. For example, you may run inspections (static analysis) before you run tests in your build scripts.
  • The processes will run every time a commit occurs in the version control repository.

Generate deployable software at any time and at any place. Continuous Integration can enable you to release deployable software at any point in time. From an outside perspective, this is the most obvious benefit of Continuous Integration. With Continuous Integration, you make small changes to the source code and integrate these changes with the rest of the code base on a regular basis. If there are problems, the project members are informed and the fixes are applied to the software immediately.

Enable better project visibility. Continuous Integration provides the ability to notice trends and make effective decisions, and it helps provide the courage to innovate. Projects suffer when there is no real or recent data to support decisions, so everyone offers their best guesses. Typically, project members collect this information manually, making the effort burdensome and untimely. The result is that often the information is never gathered. A Continuous Integration system can, however, provide just-in-time information on the recent build status and quality metrics. Some Continuous Integration systems can also show defect rates and feature completion statuses. Because integrations occur frequently with a Continuous Integration system, it becomes possible to notice trends in build success or failure, overall quality and other pertinent project information.

Establish greater confidence in the software product from the development team. Overall, effective application of Continuous Integration practices can provide greater confidence in producing a software product. With every build, your team knows that tests are run against the software to verify behavior, that project coding and design standards are met, and that the result is a functionally testable product. Since a Continuous Integration system can inform you when something goes wrong, developers and other team members have more confidence in making changes. Because Continuous Integration encourages a single-source point from which all software assets are built, there is greater confidence in its accuracy.

Disadvantages

Increased overhead in maintaining the Continuous Integration system. This is usually a misguided perception, because the need to integrate, test, inspect and deploy exists regardless of whether you are using Continuous Integration.

Too much change. Some may feel there are too many processes that need to change to achieve Continuous Integration for their legacy project. An incremental approach to Continuous Integration is most effective: first add builds and tests that run at a lower frequency (for example, a daily build), then increase the frequency as everyone gets comfortable with the results.

Too many failed builds. Typically, this occurs when developers are not performing a private build before committing their code to the version-control repository. It could be that a developer forgot to check in a file or had some failed tests. Rapid response is imperative when using Continuous Integration because of the frequency of changes.

Additional hardware/software costs. To effectively use Continuous Integration, a separate integration machine should be acquired.

Developers should be performing these activities. Sometimes management feels that Continuous Integration is just duplicating the activities that developers should be performing anyway. Yes, developers should be performing some of these activities, but they need to perform them more effectively and reliably in a separate environment. Leveraging automated tools can improve the efficiency and frequency of these activities.

References

<references/>