CSC/ECE 517 Fall 2009/wiki1a 4 srhi4: Difference between revisions

From Expertiza_Wiki
Revision as of 04:02, 12 September 2009

Best Practices for Source Code Management with Version Control


Introduction

         Source Code Management (also called Revision Control) is a technique for managing and monitoring the codebase of a software product so that every change made to the code can be tracked. It plays an important role wherever many developers share a codebase and have to modify or work on the same code. The problem becomes more complex when multiple teams maintain different versions of the same basic codebase in order to add new features or fix bugs. In such a scenario, merging different versions of the same code and keeping the latest version available become critical.

The diagram below gives a quick overview of a source code management system managing multiple codebases.

An example of version control

Terminology

• Repository - Storehouse for files, usually integrated with a database.
• Trunk - The location where the main copy of the source code is found. It is at the root of the version tree.
• Working set - A downloaded copy of the codebase.
• Client - The application that connects to the repository.
• Server - The system that hosts the repository.
• Code Check-out - Downloading a file or a set of files from the codebase to a workspace in order to run or modify the code.
• Code Check-in - Uploading new or modified files from a workspace to the codebase.
• Branching - A technique that supports concurrent development of software by creating multiple paths along which the code can evolve at the same time. A branch is for a single logical change in the code.
• Codeline - Similar to a branch, but able to support multiple logical changes to the code. Branch and codeline are often used interchangeably.
• Merging - The process of integrating changes in the code with the codeline. For example, if developer A branches out from a codeline, development continues on the codeline. A then has to merge his or her changes with the codeline to keep it up to date.
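
The terms above can be tied together in a small sketch. This is a toy in-memory model rather than the behaviour of any real version control tool: the `Repository` class, its method names, and the overwrite-on-merge rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Toy in-memory repository: each codeline maps to a list of snapshots."""
    codelines: dict = field(default_factory=lambda: {"trunk": [{}]})

    def check_out(self, codeline):
        # Return a private working copy of the latest snapshot.
        return dict(self.codelines[codeline][-1])

    def check_in(self, codeline, working_copy):
        # Record the working copy as a new version; return its version number.
        self.codelines[codeline].append(dict(working_copy))
        return len(self.codelines[codeline]) - 1

    def branch(self, source, name):
        # Start a new codeline from the latest snapshot of the source.
        self.codelines[name] = [dict(self.codelines[source][-1])]

    def merge(self, source, target):
        # Naive merge (assumption): files from source overwrite
        # same-named files in target. Real tools merge file contents.
        merged = self.check_out(target)
        merged.update(self.check_out(source))
        return self.check_in(target, merged)


repo = Repository()
work = repo.check_out("trunk")
work["main.c"] = "v1"
repo.check_in("trunk", work)

repo.branch("trunk", "bugfix")
fix = repo.check_out("bugfix")
fix["main.c"] = "v1 + fix"
repo.check_in("bugfix", fix)

repo.merge("bugfix", "trunk")
print(repo.check_out("trunk"))  # {'main.c': 'v1 + fix'}
```

Each check-in appends a snapshot instead of replacing the previous one, which is what makes it possible to track and revert to older versions.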

Motivation for Source code management (SCM)

         In a development environment, a single project can be in several phases at once. One team may be building new features into the product, a second team may be fixing bugs (tech support), and a third may be working on prototypes for future development. A separate line is created for each of these activities: functional line, development line, maintenance line, release line, integration line and so on. In such an environment, management of the codebase becomes critical for the following reasons:

 — Allow simultaneous / parallel development of software. 
 — Integrate code changes from different teams / developers
 — Propagate bug fixes to future versions of the software
 — Isolate, coordinate and tidily separate work item units
 — Track and revert to older versions
 — Keep related projects in sync with one another


Efficient Source Code Management (SCM)

         Source Code Management (SCM) poses many challenges, and there is no single standard way of doing it. When do you create new branches? When do you create new codelines? Who should be responsible for a branch? When do you merge code? How do you integrate code fixes with future releases? How do you keep track of multiple releases? These questions have no definite answer. Each project uses a different SCM technique depending on its development cycles, releases, bug fixes and size.

         Fortunately, there are a few common guidelines and patterns that allow for efficient management of source code. The following list describes them for a typical scenario.

  1. Have a main line from which all the code is derived
     The main line sets up the base of the project. Typically, when code is being developed for two different platforms, say,
     Windows and Linux, common modules are placed in the main line. The two codelines for Linux and Windows are then taken
     from the main line to create parallel development lines.
  2. Have parallel maintenance and development lines
     Product development and maintenance (e.g. bug fixes) happen simultaneously. For this reason, we need two parallel lines,
     and these lines should interact closely and integrate / merge frequently to stay up to date with the fixes.
  3. Have one codeline per release
     Separating / isolating the codeline for a specific release is important because parallel maintenance work continues on the
     previous version of the software. A new codeline should therefore be created whenever a release of the product is planned.
  4. Create policies for each codeline
     Policies for a codeline define the stability and maintainability of a source code management system. The policy for the
     development line could encourage late merges, while the policy for bug fixes encourages merging often.
  5. Merge early and often
     As far as possible, developers must be working with the latest copy of the code, so merging often is important.
     Early merges help in having common bases for further development, since the early parts of the code are most critical.
     Frequent merging also helps in avoiding large merges later in the cycle, which could lead to inconsistent code. It is the
     best way to keep all developers in sync with the product code.
  6. Do not isolate too much
     Isolation can be beneficial in large projects. However, creating many codelines / branches is inadvisable, because the
     cost of merging can become high.
  7. Analyse and realign
     The versioning tree can grow out of bounds and become wider and wider. The wider the tree, the more codelines / branches
     it has and the more difficult it is to manage. SCM systems must be checked often for such widening trees, and codelines
     must be merged where appropriate to sync up versions.
  8. Have an owner for every codeline
     Ownership is a key term in source code management. Every codeline is assigned an owner who is responsible for it.
     The owner's typical tasks are to assist in code integration and changes in the codeline, clarify ambiguous code policies,
     decide when to freeze and unfreeze code, and coordinate across teams to make merges successful.
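
Guidelines 4, 5 and 8 can be combined into a simple automated check. The following is a minimal sketch, assuming each codeline records an owner, a merge policy expressed as a maximum number of days between merges, and the date of its last merge; the `Codeline` fields, the `overdue_codelines` helper, the owner names and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Codeline:
    name: str
    owner: str              # guideline 8: every codeline has an owner
    max_days_unmerged: int  # guideline 4: per-codeline merge policy
    last_merged: date


def overdue_codelines(codelines, today):
    """Return (name, owner) for each codeline that has exceeded its merge policy."""
    return [
        (c.name, c.owner)
        for c in codelines
        if (today - c.last_merged).days > c.max_days_unmerged
    ]


lines = [
    # Bug-fix line: policy encourages merging often.
    Codeline("maintenance", "alice", 2, date(2009, 9, 1)),
    # Development line: policy tolerates later merges.
    Codeline("development", "bob", 14, date(2009, 9, 8 - 7)),
]
print(overdue_codelines(lines, date(2009, 9, 8)))  # [('maintenance', 'alice')]
```

Reporting the owner alongside the codeline name means the person responsible for coordinating the merge is identified as soon as a line drifts past its policy.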


         The above guidelines help in improving the manageability of source code versions. They lead to increased coordination among developers, improve traceability, isolate changes, define clear roles and responsibilities, and reduce complexity.

Example

         IBM Rational ClearCase is a widely used version control tool with secure version management and an easy-to-follow interface. It follows the best practices for source code management with version control discussed above. It centralizes the code, making it accessible to everyone on the development team for a particular project. ClearCase supports parallel development and makes merging straightforward. One of its most useful features is its views, of which there are two kinds: snapshot and dynamic. A snapshot view creates a local copy of the code from the codebase, while a dynamic view works directly against the codebase and reflects its current state rather than a local copy. Views let the developer choose the kind of access to the code that a task requires.

         As discussed in the previous section, ClearCase maintains the main branch as the baseline. When a new project has to be derived from the main branch, a sub-branch that uses the baseline is created, and the project is built on this sub-branch. Every developer who is part of the project team, and to whom the project's main branch is accessible, is given a unique id. Whenever a developer checks out a file, the developer's name appears in the ClearCase view, so the rest of the team can always see which files are being modified or examined by which team members. This helps a developer decide whether or not to modify a file right now. ClearCase also maintains every version of a file: each time a file is checked out a copy is kept, so that when the modified version is checked in, the view still has the old version as well. One of the most powerful features of the tool is that when a project is successfully completed and needs to be released to other projects (branches of the main branch), it can be merged with the main branch, which makes it accessible to all the other branches.
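
The check-out visibility and version retention described above can be sketched as follows. This is an illustrative toy model and not the ClearCase API: the `VersionedFile` class, its attributes, and the file and developer names are assumptions.

```python
class VersionedFile:
    """Toy model of a file that keeps every checked-in version."""

    def __init__(self, name, content):
        self.name = name
        self.versions = [content]     # version 0 is the initial content
        self.checked_out_by = set()   # developers currently holding a check-out

    def check_out(self, developer):
        # Record who has the file so teammates can see it is in use.
        self.checked_out_by.add(developer)
        return self.versions[-1]

    def check_in(self, developer, new_content):
        if developer not in self.checked_out_by:
            raise ValueError(f"{developer} has not checked out {self.name}")
        # Append rather than overwrite, so older versions stay retrievable.
        self.versions.append(new_content)
        self.checked_out_by.discard(developer)
        return len(self.versions) - 1


f = VersionedFile("parser.c", "original")
f.check_out("dev1")
print(f.checked_out_by)   # {'dev1'}
f.check_in("dev1", "modified")
print(f.versions)         # ['original', 'modified']
```

A teammate can inspect `checked_out_by` before deciding whether to modify the file, and the old content remains available in `versions` after the check-in.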

The details of the tool can be found in the IBM publication cited in the References section of this page.

Note - This section describes IBM Rational ClearCase, for which IBM has published a freely available Redbook. It is used here only as an example to illustrate the points above. Other version control tools, such as Subversion (SVN), are also widely used and well known for their powerful features. The freely available Subversion documentation can be found in the References section of this page.

Conclusion

         Source Code Management, or Revision Control, helps software teams deliver faster and more efficiently and reduces the amount of rework and manual code maintenance. In a typical development team, it is a critical tool used by every team member. It is therefore very important for version control systems to be accurate and user friendly. A version control system needs to be simple to understand, and it also has to be reliable and transparent, without introducing conflicts.

References

•  For Beginners
      The high level overview of SCM best practices.

•  For Advanced Readers
      An in-depth view of source code management using different patterns. It helps in organizing related lines of development into appropriately diverging and converging streams of source code changes.
      This is the product RedBook released by IBM on IBM Rational ClearCase. It explains the different features of their Version Control tool which follow the best practices discussed in this page.

•  For Developers
      The IBM Rational ClearCase Version Control tool document contains tutorials and examples that are easy to follow.
      Information about SVN and its free download versions by O'Reilly media can be found here.