CSC/ECE 517 Fall 2011/ch6 6f va

From Expertiza_Wiki
Jump to navigation Jump to search

Continuous Integration

Definition

"We keep our code ready to ship"

Continuous Integration is a software development practice where members of a team integrate their work frequently. This approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly

Introduction

Software development landscape has changed lately. In the previous years, one third of the development time was spent on writing exhaustive specifications, which then went through intensive reviews, and then spend months before something was released; in the last years the focus was changed. Nowadays the specs are sketches at most, and releases are at least twice a month in order to get rapid feedback from the customer, allowing for a more focused product.

The big question is though: How can we developers adapt to the new world?

Continuous Integration aims to improve the quality of software, and to reduce the time taken to deliver it, by replacing the traditional practice of applying quality control after completing all development. It is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day.

Continuous Integration is about integrating changes to the development project continuously and automatically. These changes usually come in form of modifications and additions to the source code. When we keep them as small as possible and automatically run all checks available (starting from the compilation process and up to the full test routines), we can detect most breaking issues whenever they are introduced. Since the changes are small, it is much easier to isolate and fix the root cause

Process Involved

Continuous integration is achieved by reducing development friction involved in project integration. This is done with the help of:

  • Version Control Systems that allow large and distributed teams to work together on the same project, while sending their code changes (commits) to the central server. Each commit is stored on the server and could be reviewed or reverted.
  • Build Automation that is basically about build tools and scripts that allow to start the complete integration build on a local machine with a single mouse-click.
  • Unit Tests ensuring that code stays stable through all the changes (this normally comes with the Test-Driven Development practices).
  • Code Quality Tests that consistently enforce development guidelines accepted by the team.
  • Regular Commits by developers that reduce amount of change involved in every integration.
  • Integrating Every Commit (as soon as it comes) on a central server allows to detect any compilation failures or broken tests immediately. Also this is the best way to deal with "but it works on my machine" problem.
  • Publishing Integration Results that get created along with the build project might simplify getting feedback from people outside of the development team. Examples of these results are: unit test results, performance metrics, install packages, code quality reports etc.

Basic Practices to follow

Maintain a single source repository

Software projects involve lots of files that need to be orchestrated together to build a product. Keeping track of all of these is a major effort, particularly when there's multiple people involved. These tools - called Source Code Management tools, configuration management, version control systems, repositories, or various other names - are an integral part of most development projects. The base of a Continuous Integration system is to implement a good source control management system to keep track and control all of the files needed to build a product.

You must put everything required for a build in the source control system. One of the features of version control systems is that they allow you to create multiple branches, to handle different streams of development. This is a useful feature - but it's frequently overused and gets people into trouble. Keep your use of branches to a minimum. In particular have a mainline: a single branch of the project currently under development. Pretty much everyone should work off this mainline most of the time.

A comparison of the common version control systems

Automate the build

Getting the sources turned into a running system can often be a complicated process involving compilation, moving files around, loading schemas into the databases, and so on. However like most tasks in this part of software development it can be automated - and as a result should be automated. Asking people to type in strange commands or clicking through dialog boxes is a waste of time and a breeding ground for mistakes.

To get an efficient Continuous Integration system you need to implement an automatic build process. The details of this step depend upon the implementation language and development environment and may involve compiling and linking and/or other processes that produce the build artifacts, such as the program executable.

A comparison of the common build automation systems

Make the build self-testing

Traditionally a build means compiling, linking, and all the additional stuff required to get a program to execute. A program may run, but that doesn't mean it does the right thing. Modern statically typed languages can catch many bugs, but far more slip through that net. A good practice that will help you to detect errors quickly is to include some automated tests in your build process. If possible, execute a sub-set of the system tests as part of the build to ensure that the build is suitable prior to committing resources to the full scope of system testing. While the level of testing may vary, focus on gaining confidence that the increment is of sufficient quality to establish a baseline for system testing. If some testing fails, the build should also fail.

List of some common test automation tools

Everyone commits to the baseline every day

Integration is primarily about communication. Integration allows developers to tell other developers about the changes they have made. Frequent communication allows people to know quickly as changes develop. A good practice is to force developers to commit their changes to the main development stream at least once every day; before delivering the files, each developer must verify that his/her working copy is consistent with the main development stream, resolve conflicts if they exist, build a local test, and if this passes, commit changes to the main development stream.

Every Commit Should Build the Mainline on an Integration Machine

Although developers must run local builds and tests in their local machines before delivering to the main development stream, there may be differences between each developer's machine, or code integration errors. This is why it is very important to ensure that an integration build is run on an integration machine each time each a developer commits some changes. The developer that delivers the changes should be responsible for monitoring this integration build and fix it if there is any problem. Working this way ensures that errors can be found and solved easily and quickly. Developers are expected to commit to fixing problems with the build quickly, resulting in a successful build every day. There are three approaches to it:

  • To let the developer manually request a build execution after she has committed some new code to the stream.
  • To ensure that an automatic build will be executed after a code commit to the stream (uses a continuous integration server).
  • To use a scheduler so that automatic builds can be executed daily, weekly basis, etc. (uses a continuous integration server).

List of some popular continuous integration servers

Keep the build fast

It is important to reduce to a minimum the time spent running the integration build each time a developer commits changes to the main development stream. The problem is that this is not always possible, and complete integration builds can take a lot of time and resources. If this is your problem, a good best practice is to split your integration build into different stages: create a commit build that compiles and verifies the absence of critical errors when each developer commits changes to the main development stream, and create secondary build(s) to run slower and less important tests. You need to ensure that commit builds are always successful, but if any secondary build fails it does not have to be so critical as to stop everything. This allows your developers to continue working while these secondary errors are fixed.

Test in a clone of the production environment

The point of testing is to flush out, under controlled conditions, any problem that the system will have in production. A significant part of this is the environment within which the production system will run. If you test in a different environment, every difference results in a risk that what happens under test won't happen in production.

As a result you want to set up your test environment to be as exact a mimic of your production environment as possible. Use the same database software, with the same versions, use the same version of operating system. Put all the appropriate libraries that are in the production environment into the test environment, even if the system doesn't actually use them.

Make it easy for anyone to get the latest executable

Most builds produce useful output, such as an executable program, a packaged zip file, or other artifacts. Many people may need to get access to the latest executable to be able to run it or just to see what changed last week. Many times developers are not able to find it because there is not a well known place where these files are stored and they spend many hours just looking for this information. It is essential to make sure there's a well known place where people can find the latest executable. It may be useful to put several executables in such a store. For the very latest you should put the latest executable to pass the commit tests - such an executable should be pretty stable providing the commit suite is reasonably strong.

Everyone can see the results of the latest build

Communication is critical to implement a good Continuous Integration system. Each team member needs to have easy and transparent access to the state of the system, last changes made and state of the mainline integration build. Visibility and transparency of the flow of information between team members are essential.

Automate deployment

To do Continuous Integration you need multiple environments, one to run commit tests, one or more to run secondary tests. Since you are moving executables between these environments multiple times a day, you'll want to do this automatically. So it's important to have scripts that will allow you to deploy the application into any environment easily.

A natural consequence of this is that you should also have scripts that allow you to deploy into production with similar ease. You may not be deploying into production every day (although I've run into projects that do), but automatic deployment helps both speed up the process and reduce errors. It's also a cheap option since it just uses the same capabilities that you use to deploy into test environments.

Prefer synchronous integration, in which you wait for the integration to succeed, to asynchronous integration, in which a tool tests the integration for you. Synchronous integration requires fast builds, but ensures that they never break.

Agile Project Development: An Example Workflow

Let's assume a Developer A has to do something to a piece of software, it doesn't really matter what the task is, for the moment let us assume it's small and can be done in a few hours. Developer A begins by taking a copy of the current integrated source onto her local development machine, by using a source code management system by checking out a working copy from the mainline.

Now Developer A takes her working copy and does whatever she needs to do to complete her task. This will consist of both altering the production code, and also adding or changing automated tests. Once Developer A is done, she carry outs an automated build on her development machine. This takes the source code in her working copy, compiles and links it into an executable, and runs the automated tests. Only if it all builds and tests without errors is the overall build considered to be good.

With a good build, Developer A can then think about committing her changes into the repository. The twist is that other people may, and usually have, made changes to the mainline before she gets a chance to commit. So first Developer A updates her working copy with their changes and rebuilds. If their changes clash with her changes, it will manifest as a failure either in the compilation or in the tests. In this case it's Developer A’s responsibility to fix this and repeat until she can build a working copy that is properly synchronized with the mainline. Once she has made her own build of a properly synchronized working copy, she can then finally commit her changes into the mainline, which then updates the repository.

However Developer A’s commits doesn't finish her work. At this point she builds again, but this time on an integration machine based on the mainline code. Only when this build succeeds can she say that my changes are done. There is always a chance that Developer A missed something on her machine and the repository wasn't properly updated. Only when her committed changes build successfully on the integration is the job done.

If a clash occurs between two developers, it is usually caught when the second developer to commit builds their updated working copy. If not the integration build should fail. Either way the error is detected rapidly. At this point the most important task is to fix it, and get the build working properly again. In a Continuous Integration environment you should never have a failed integration build stay failed for long. A good team should have many correct builds a day. Bad builds do occur from time to time, but should be quickly fixed.

The result of doing this is that there is a stable piece of software that works properly and contains few bugs. Everybody develops off that shared stable base and never gets so far away from that base that it takes very long to integrate back with it. Less time is spent trying to find bugs because they show up quickly


The above figure depicts a Workflow Diagram

Advantages

Reduced risks. By integrating many times a day, you can reduce risks on your project. Doing so facilitates the detection of defects, the measurement of software health and a reduction of assumptions.

  • Defects are detected and fixed sooner.
  • Health of software is measurable.
  • Reduce assumptions.

Continuous Integration provides a safety net to reduce the risk that defects will be introduced into the code base. The following are some of the risks that Continuous Integration helps to mitigate:

  • Lack of cohesive, deployable software
  • Late defect discovery
  • Low-quality software
  • Lack of project visibility

Reduced repetitive manual processes. Reducing repetitive processes saves time, costs and effort. These repetitive processes can occur across all project activities, including code compilation, database integration, testing, inspection, deployment and feedback. By automating Continuous Integration, you have a greater ability to ensure all of the following.

  • The process runs the same way every time.
  • An ordered process is followed. For example, you may run inspections (static analysis) before you run tests-in your build scripts.
  • The processes will run every time a commit occurs in the version control repository.

Generate deployable software at any time and at any place. Continuous Integration can enable you to release deployable software at any point in time. From an outside perspective, this is the most obvious benefit of Continuous Integration. With Continuous Integration, you make small changes to the source code and integrate these changes with the rest of the code base on a regular basis. If there are problems, the project members are informed and the fixes are applied to the software immediately.

Enable better project visibility. Continuous Integration provides the ability to notice trends and make effective decisions, and it helps provide the courage to innovate new improvements. Projects suffer when there is no real or recent data to support decisions, so everyone offers their best guesses. Typically, project members collect this information manually, making the effort burdensome and untimely. The result is that often the information is never gathered. A Continuous Integration system can however provide just-in-time information on the recent build status and quality metrics. Some Continuous Integration systems can also show defect rates and feature completion statuses. Because integrations occur frequently with a Continuous Integration system, the ability to notice trends in build success or failure, overall quality and other pertinent project information becomes possible.

Establish greater confidence in the software product from the development team. Overall, effective application of Continuous Integration practices can provide greater confidence in producing a software product. With every build, your team knows that tests are run against the software to verify behavior, that project coding and design standards are met, and that the result is a functionally testable product. Since a Continuous Integration system can inform you when something goes wrong, developers and other team members have more confidence in making changes. Because Continuous Integration encourages a single-source point from which all software assets are built, there is greater confidence in its accuracy.

Disadvantages

Increased overhead in maintaining the Continuous Integration system. This is usually a misguided perception, because the need to integrate, test, inspect and deploy exists regardless of whether you are using Continuous Integration.

Too much change. Some may feel there are too many processes that need to change to achieve Continuous Integration for their legacy project. An incremental approach to Continuous Integration is most effective; first add builds and tests with a lower occurrence (for example, a daily build), then increase the frequency as everyone gets comfortable with the results.

Too many failed builds. Typically, this occurs when developers are not performing a private build before committing their code to the version-control repository. It could be that a developer forgot to check in a file or had some failed tests. Rapid response is imperative when using Continuous Integration because of the frequency of changes.

Additional hardware/software costs. To effectively use Continuous Integration, a separate integration machine should be acquired.

Developers should be performing these activities. Sometimes management feels that Continuous Integration is just duplicating the activities that developers should be performing anyway. Yes, developers should be performing some of these activities, but they need to perform them more effectively and reliably in a separate environment. Leveraging automated tools can improve the efficiency and frequency of these activities.

References