CSC/ECE 517 Fall 2010/chd 6d isb
Introduction
Version control is the administration and maintenance of the files that make up a system as it is developed and as it evolves. A main purpose of a version control system (VCS) is to keep track of changes to the source code of programs. Dozens of version control systems are available, and many of them are free and open source. The first version control system was released in the early 1970s, and new tools are still being released today. Briefly, there have been four generations of VCS [1]:
- File versioning tools, examples: SCCS, RCS
- Tree versioning tools-central style, example: CVS
- Tree versioning tools-central style: take two, example: Subversion
- Tree versioning tools-distributed style, examples: Bazaar, Git, Mercurial
Source Code Control System
The very first version control system [5] was the Source Code Control System (SCCS), originally written by Marc J. Rochkind in 1972 at Bell Labs, NJ. It was designed to help programming projects control changes to source code. SCCS provides facilities for storing, updating, and retrieving all versions of a module by version number, and it records who made each change, when and where it was made, and the reason for it. The first two implementations of SCCS were one for the IBM 370 under OS and one for the PDP-11 under UNIX. [2][3]
SCCS can be categorized as a file versioning tool: it versions only individual files, and each source file results in one SCCS history file. This makes SCCS an effective method for small projects. The format of the history files is easy to understand, which allows manual intervention. In addition, checksums and forward deltas protect file integrity and allow corruption to be detected immediately. SCCS was the major form of source code control, especially on UNIX platforms, until the release of the Revision Control System.
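The idea of detecting corruption through a stored checksum can be sketched in a few lines of Python. This is only a toy model: the `checksum` header line, the 16-bit additive sum, and the function names are assumptions made for illustration and are not claimed to match the real SCCS history file format.

```python
def checksum16(body: bytes) -> int:
    """Toy 16-bit additive checksum over the body of a history file."""
    return sum(body) & 0xFFFF


def write_history(body: bytes) -> bytes:
    """Prefix the body with a header line that records its checksum."""
    header = b"checksum %05d\n" % checksum16(body)
    return header + body


def is_intact(history: bytes) -> bool:
    """Recompute the checksum and compare it with the stored value."""
    header, _, body = history.partition(b"\n")
    stored = int(header.split()[1])
    return stored == checksum16(body)


history = write_history(b"line one\nline two\n")
damaged = history.replace(b"two", b"tw0")  # simulate a corrupted byte
print(is_intact(history), is_intact(damaged))
```

A simple additive sum like this one catches a changed byte but not, for example, two swapped bytes, so it illustrates the principle rather than a strong integrity check.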
Revision Control System
The Revision Control System (RCS) [7] is the successor to SCCS. It was written by Walter F. Tichy (now a faculty member at the University of Karlsruhe, Germany) in 1982 at Purdue University. Just like SCCS, RCS is a file versioning tool. Its fundamental storage unit is a revision group. It also supports branches within a file and, unlike SCCS, merges. On the mainline, RCS stores reverse deltas: the latest revision is stored intact, and earlier revisions are stored as deltas from the latest. On branches, revisions are stored as forward deltas, so checking out a branch is slow. For example, suppose the mainline has 100 revisions and a branch that starts at revision 10 has 80 revisions of its own. To check out the branch tip, RCS must do the following:
- Retrieve mainline revision 100
- Retrieve and apply reverse deltas from 99 down to 10
- Retrieve and apply 80 forward deltas on the branch
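The three steps above can be reproduced with a small Python model. It is a minimal sketch under stated assumptions: `ToyRcsFile`, `make_delta`, and `apply_delta` are hypothetical names, and the delta format (edit instructions from Python's `difflib`) is not the real RCS file format. It stores the head intact, keeps reverse deltas on the mainline and forward deltas on the branch, and counts how many deltas a branch checkout applies.

```python
from difflib import SequenceMatcher


def make_delta(src, dst):
    """Return edit instructions that turn the line list src into dst."""
    ops = SequenceMatcher(a=src, b=dst, autojunk=False).get_opcodes()
    return [(i1, i2, dst[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def apply_delta(src, delta):
    """Apply edit instructions produced by make_delta to src."""
    out, pos = [], 0
    for i1, i2, new_lines in delta:
        out.extend(src[pos:i1])   # copy the unchanged lines before the edit
        out.extend(new_lines)     # empty for a pure deletion
        pos = i2                  # skip the replaced or deleted lines
    out.extend(src[pos:])
    return out


class ToyRcsFile:
    def __init__(self, mainline, branch_point, branch):
        # mainline and branch are lists of revisions, oldest first;
        # revision numbers are 1-based, as in the example above.
        self.head = mainline[-1]
        # reverse[k] turns revision k + 2 into revision k + 1.
        self.reverse = [make_delta(mainline[k + 1], mainline[k])
                        for k in range(len(mainline) - 1)]
        self.branch_point = branch_point
        self.forward = []
        prev = mainline[branch_point - 1]
        for rev in branch:
            self.forward.append(make_delta(prev, rev))
            prev = rev

    def checkout_branch_tip(self):
        text, applied = self.head, 0
        head_number = len(self.reverse) + 1
        # Walk the mainline backwards from the head to the branch point.
        for number in range(head_number, self.branch_point, -1):
            text = apply_delta(text, self.reverse[number - 2])
            applied += 1
        # Then walk forwards along the branch.
        for delta in self.forward:
            text = apply_delta(text, delta)
            applied += 1
        return text, applied


mainline = [[f"line {i}" for i in range(n)] for n in range(1, 101)]
base = mainline[9]  # revision 10
branch = [base + [f"branch line {i}" for i in range(k)] for k in range(1, 81)]
rcs_file = ToyRcsFile(mainline, branch_point=10, branch=branch)
text, applied = rcs_file.checkout_branch_tip()
print(applied)  # 170: 90 reverse deltas plus 80 forward deltas
```

In this model, checking out the mainline head applies no deltas at all, which is the trade-off the reverse-delta layout makes: the most commonly needed revision is the cheapest to retrieve.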
Additionally, in the early 1980s, when RCS was released, it had competitors such as IBM CLEAR/CASTER, AT&T SCCS, and the CMU Software Development Control system.
Concurrent Versions System
The Concurrent Versions System (CVS) was originally created by Dick Grune of VU University Amsterdam between 1984 and 1985 and was open sourced in 1986. However, the code that eventually evolved into the widely used CVS was started by Brian Berliner in early 1989. By late 1990, with help from several others, the first edition was ready for distribution. [9]
CVS, while using RCS underneath, is a much more powerful tool and can control a complete source code tree. It can be customized extensively with scripting languages such as Perl and the Korn and Bash shells. [10]
CVS offers the following significant advantages over RCS:
- It can run scripts that log CVS operations or enforce site-specific policies.
- It enables developers in different geographical locations to function as a single team. Information is stored on a single central server, and the client machines have a copy of all the files. The client-server connection must be up to perform CVS operations but need not be up to edit or manipulate the current versions of the files.
- It can merge changes from non-CVS vendor branches.
- It allows more than one developer to work on the same file at the same time.
- CVS servers run on most operating systems, including UNIX variants, Windows, and OS/2.
Subversion
Work on Subversion began in early 2000, led by Karl Fogel, the author of Open Source Development with CVS (Coriolis, 1999), and his friend Jim Blandy, with the backing of CollabNet, Inc. By late 2001, it had started to be deployed. [11]
Subversion is the next generation of version control system after CVS. CVS has a number of problems, caused primarily by its dependency on the RCS file format for versioning files. These and other issues addressed by Subversion include the following [12][13][14][15]:
- In CVS, commits are not guaranteed to be atomic: a commit that fails partway through can leave the repository with only some of its files updated. Subversion ensures atomic commits.
- CVS has no way to rename a file and keep its versioning history. In Subversion, when file1 is renamed or copied to file2, the common history of file1 and file2 is preserved. Subversion also versions more than file contents: directories and file metadata, as well as renamed or copied files, all have their own versioning.
- In CVS, branching and tagging are expensive operations for big repositories and directory trees, with a cost proportional to the number of files being branched or tagged. Subversion makes both branching and tagging constant-time operations: they are implemented simply as a cheap copy of the directory being branched or tagged.
- CVS is not binary-file friendly: any change to a binary file results in the whole file being stored again. Subversion uses efficient binary diffing, which means it can store PDFs and other binary files efficiently.
- With CVS, finding out how a locally changed file differs from the repository requires sending the entire file to the server. A Subversion working copy keeps a local copy of the latest repository revision of each file, so only the differences are sent, in both directions, which uses far less bandwidth.
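The constant-time branching and tagging mentioned in the list above can be illustrated with a toy Python model of a repository tree. This is a sketch of the general idea of a cheap copy, not Subversion's actual storage format: the nested-dictionary tree and the `cheap_copy` function are assumptions made for illustration. The key point is that a tag or branch shares the existing subtree instead of duplicating every file in it.

```python
def cheap_copy(root, src, dst):
    """Return a new root in which dst names the very same subtree as src.

    Only the top-level directory is copied, so the cost does not depend
    on how many files live under src.
    """
    new_root = dict(root)      # shallow copy of the top-level entries only
    new_root[dst] = root[src]  # share the subtree; no file is duplicated
    return new_root


# A trunk with many files (contents are placeholders).
revision_1 = {"trunk": {f"file{i}.c": f"contents {i}" for i in range(10_000)}}

# Tagging creates a new revision of the root that points at the same subtree.
revision_2 = cheap_copy(revision_1, "trunk", "tag-1.0")

print(revision_2["tag-1.0"] is revision_1["trunk"])  # True: shared, not copied
print("tag-1.0" in revision_1)                       # False: old revision unchanged
```

Sharing is safe in this model only as long as existing subtrees are treated as immutable, so a later change to the trunk would build a new subtree rather than modify the shared one.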
Distributed Revision Control
One of the first truly distributed version control systems was BitKeeper, which was developed by BitMover, Inc. under Larry McVoy in the late 1990s and was later adopted by Linus Torvalds for Linux kernel development. Distributed revision control (DVCS) is a fairly new concept in revision control. With its peer-to-peer approach, each peer's copy of the code base is a bona fide repository, and peers synchronize by exchanging patches. Its differences from, and advantages over, centralized revision control include [16][17]:
- By default, there is no canonical reference copy of the code base; only working copies exist.
- Since communication with a central server is not required, common operations are much faster. Peers need to communicate only when pushing changes to or pulling changes from other peers.
- Each working copy effectively functions as a remote backup of the code base.
Open systems are the most recent phenomenon in distributed revision control. They are characterized by their support for independent branches and their heavy reliance on merge operations, and they generally have the following features:
- Peers are free to join as and when they wish without going through any elaborate approval process
- Each working copy works like a branch
- Selective changes can be "cherry-picked", pulling them from specific peers.
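The difference between pulling everything from a peer and cherry-picking a single change can be sketched with a toy Python model. The `Peer` class, its methods, and the change identifiers are hypothetical and greatly simplified: real tools track change dependencies and merge file contents, which this sketch ignores.

```python
class Peer:
    """Toy peer: every peer holds its own full set of changes."""

    def __init__(self, name):
        self.name = name
        self.changes = {}  # change id -> description

    def commit(self, change_id, description):
        """Record a change locally; no server is contacted."""
        self.changes[change_id] = description

    def pull(self, other):
        """Take every change the other peer has that this peer lacks."""
        missing = {cid: text for cid, text in other.changes.items()
                   if cid not in self.changes}
        self.changes.update(missing)
        return sorted(missing)

    def cherry_pick(self, other, change_id):
        """Take exactly one selected change from the other peer."""
        self.changes[change_id] = other.changes[change_id]


alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.commit("a1", "fix parser crash")
alice.commit("a2", "experimental rewrite")

bob.pull(alice)               # bob now has both a1 and a2
carol.cherry_pick(alice, "a1")  # carol takes only the fix

print(sorted(bob.changes), sorted(carol.changes))
```

Because each peer's copy behaves like a branch, the three peers end up with different sets of changes, and none of them needed approval from a central server to do so.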
One of the first closed source DVCSs was Sun WorkShop TeamWare, which was widely used in enterprise settings. [18]
Bazaar is one of the best-known open, distributed-style tree versioning tools. Besides Bazaar (Mar 2005), many other distributed version control systems are available, including Monotone (Apr 2003), Darcs (Nov 2004), Mercurial (Apr 2005), and Git (Apr 2005).
Comparison of version control systems can be found here.
Summary
Using a VCS has many benefits. For example, when a team works on a project, members can work in harmony because no one overwrites another person's code. Moreover, every change (version) is stored in the VCS repository. This lets team members see the differences between versions of the same file and know when each change was made and which team member was responsible. Furthermore, development can be split into different branches, each branch keeping track, for example, of the fixes associated with one software release. File versions can then be retrieved from a branch, and the same fix can be applied to multiple branches. Another benefit is that the VCS repository can help in assessing how the project went: How many lines changed from the previous version? Which are the most and least productive days of the week? Which team member contributed the most? [8]
References
[1] Introducing Bazaar. Retrieved September, 2010.
[2] Aran, A Brief History of Version Control Systems. Retrieved September, 2010.
[3] Schilling, Jörg, SCCS project. Retrieved September 16, 2010.
[4] Rochkind, Marc J., The Source Code Control System. IEEE Transactions on Software Engineering, SE-1 (4), 364-370.
[5] Source Code Control Systems, 2010. Retrieved September, 2010.
[6] Tichy, Walter F., RCS: A System for Version Control. Software: Practice and Experience, 15 (7), 637-654.
[7] Revision control System, 2010. Retrieved September, 2010.
[8] Louridas, Panagiotis, Version Control. IEEE Software, 23 (1), 108-109.
[9] Concurrent Versions System, 2010. Retrieved September, 2010.
[10] Vasudevan, Alavoor, CVS-RCS-HOWTO Document for Linux (Source Code Control System), 2003. Retrieved September, 2010.
[11] Collins-Sussman, Ben, Fitzpatrick, Brian W. and Pilato, C. Michael, Version Control with Subversion (for Subversion 1.5). O'Reilly Media Inc., California, 2004.
[12] SVN vs CVS, 2005. Retrieved September, 2010.
[13] Neary, David, Subversion: Building a Better CVS. Linux Magazine, (30), 59-63.
[14] Amitash, Difference between CVS and Subversion. Retrieved September, 2010.
[15] Apache Subversion, 2010. Retrieved September, 2010.
[16] Revision control, 2010. Retrieved September, 2010.
[17] Distributed revision control, 2010. Retrieved September, 2010.
[18] Auvray, S., Distributed Version Control Systems: A Not-So-Quick Guide Through, May 07, 2010.