CSC/ECE 517 Fall 2009/wiki1a 5 rp: Difference between revisions
Revision as of 21:16, 6 September 2009
Introduction
The defining characteristic of a version control system is its ability to track changes to a document, or set of documents, across many revisions. For the vast majority of applications, version control systems have focused on tracking plain text files, such as those used for programming source code, HTML documents, and various markup languages.
The history of the development of version control tools can be roughly categorized into three main phases:
- Local Version Control
- Client-Server Version Control
- Distributed Version Control
This breakdown is focused on the mechanisms that underlie how data is shared and stored in a version control system. It should not be inferred from this structure that other attributes are unimportant to the history of version control systems. There have been many advances in how:
- conflicts are recognized and merges are performed
- groups of logically coherent changes are tracked
- the data is stored
We will discuss each of these in turn in Other Advances.
Local Version Control
The first version control systems focused on local version control; that is, they ran on a single centralized computer that many users logged into, often at the same time. The repository, or location in which the data was stored, was simply a directory on that machine to which the users had access. Because of this use case, these systems focused on two main features:
- File Revision Tracking
- File Checkout and Locking
We will address each of these fundamental features in turn.
File Revision Tracking
The primary feature of these early systems was the ability to check in files at various points as they were altered, so that the history of changes made to files under version control was kept permanently. Thus, many users could alter many files over time, and the entire set of documents under version control at a given point in time could be recovered. This prevented the loss of valuable data and provided a record of which users made changes to files over time.
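The idea of revision tracking can be illustrated with a minimal sketch. It assumes the simplest possible storage, a full copy of the file per check-in, and the names `Revision`, `FileHistory`, `check_in`, and `content_at` are hypothetical; they are not taken from RCS or any other real tool, which typically store changes more compactly.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    number: int   # 1-based revision number
    author: str   # user who checked the file in
    content: str  # full file text at this revision


@dataclass
class FileHistory:
    """Keeps every checked-in revision of one file (hypothetical model)."""
    revisions: list = field(default_factory=list)

    def check_in(self, author: str, content: str) -> int:
        """Record a new revision and return its number."""
        number = len(self.revisions) + 1
        self.revisions.append(Revision(number, author, content))
        return number

    def content_at(self, number: int) -> str:
        """Recover the file exactly as it was at a given revision."""
        if not 1 <= number <= len(self.revisions):
            raise KeyError(f"no such revision: {number}")
        return self.revisions[number - 1].content

    def authors(self) -> list:
        """List who made each change, oldest first."""
        return [rev.author for rev in self.revisions]


history = FileHistory()
history.check_in("alice", "hello\n")
history.check_in("bob", "hello, world\n")
```

Because no revision is ever overwritten, `history.content_at(1)` still returns the text that the first user checked in, and `history.authors()` gives the record of who changed the file.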
File Checkout and Locking
Because many users in a shared system may want to edit a file simultaneously, one of the first features developed for version control systems was the ability to check out and lock a file. When a user checks out a file, he or she reserves the right to be the sole editor of that file until it is checked back in to revision control. The first revision control systems designed for use in a shared environment, such as RCS, allowed files to be checked out and locked in this way. Other users could check out the file, but only to view it. Thus, the file was locked against editing by everyone except the user who had checked it out most recently.
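The checkout-and-lock behavior described above might be sketched as follows. This is an in-memory illustration only; `LockingRepository`, `LockError`, and the `for_edit` flag are assumptions made for the example and do not correspond to the commands or file formats of RCS.

```python
class LockError(Exception):
    """Raised when a user attempts an edit without holding the lock."""


class LockingRepository:
    def __init__(self):
        self.files = {}  # file name -> current content
        self.locks = {}  # file name -> user holding the edit lock

    def add(self, name, content):
        """Place a new file under version control."""
        self.files[name] = content

    def check_out(self, name, user, for_edit=False):
        """Return the content; with for_edit=True, also take the lock."""
        if for_edit:
            holder = self.locks.get(name)
            if holder is not None and holder != user:
                raise LockError(f"{name} is locked by {holder}")
            self.locks[name] = user
        return self.files[name]

    def check_in(self, name, user, content):
        """Store new content and release the lock held by this user."""
        if self.locks.get(name) != user:
            raise LockError(f"{user} does not hold the lock on {name}")
        self.files[name] = content
        del self.locks[name]
```

In this sketch, a second user can still call `check_out` without `for_edit` to view a locked file, but a request to edit it raises `LockError` until the first user checks the file back in.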
Weaknesses of Local Version Control Systems
These local systems had two primary problems.
First, they required that every user log into a single computer to edit or access the information in the repository.
Second, they restricted a particular file to having only one editor at any given time. The next development in revision control, embodied by [http://www.nongnu.org/cvs/ CVS], sought to address both of these problems.
Networked Revision Control: Client-Server
As users moved away from logging into systems locally to make their changes to files, the need for a revision control system that supported remote operations emerged. The natural way to implement such remote operations was as an extension of the existing system, and by far the most prominent manifestation of this philosophy was present in CVS, the concurrent versions system, which was initially based on RCS.
The main feature driving the development of CVS was the need for many users, each on his or her own machine, to be able to perform all the operations present in the original RCS, but over a network connection, and in a way that allowed for concurrent editing to take place. This led to the development of a client-server model of revision control systems, in which one central server would contain the canonical version of the repository, and various clients could connect to the central server and perform file checkouts and commits. This model is very similar to the original RCS model, but rather than requiring users of the system to log into the revision control system locally, it allowed users to access and alter the contents of the repository over the network.
Although CVS supports locking in the same way RCS does, CVS was among the first version control systems to support a non-locking repository. This system allowed for concurrent editing of files under revision control, and generated the need to develop new features that addressed the resulting complexities. Chief among the new features introduced to handle these complexities were the notions of branching and merging. These features made the non-locking repository workable, and it is the non-locking repository that put the "concurrent" in CVS's "concurrent versions system".
Branching and Merging
Inherent in the notion of concurrent editing is the problem of how to reconcile conflicting changes to the same file. The solution to this problem lay in allowing users of the revision control system to branch a version of the repository and make (possibly many) changes to that branch independent of the changes occurring on the main branch of the repository, known as the trunk. Once a logical set of changes was completed on a branch, that branch would then need to have its changes reconciled with the current state of the repository on the trunk. This process of reconciliation is known as merging.
This feature was critical in a multi-user client environment as it allowed work to progress on multiple fronts simultaneously, only requiring that the files be merged once the users of the system were ready to reconcile changes with other users.
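One common way to reconcile a branch with the trunk is a three-way merge, which compares both versions against the revision they branched from. The sketch below is deliberately simplified: it assumes that lines are only edited in place, so all three versions have the same number of lines, whereas real tools first align the files with a diff algorithm. The function name `merge_lines` is hypothetical, and the sketch is not a description of how CVS itself performs merges.

```python
def merge_lines(base, trunk, branch):
    """Three-way merge of in-place line edits.

    base   -- lines at the point where the branch was created
    trunk  -- current lines on the trunk
    branch -- current lines on the branch
    Returns (merged_lines, conflict_indices).
    """
    if not (len(base) == len(trunk) == len(branch)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for i, (b, t, r) in enumerate(zip(base, trunk, branch)):
        if t == r:
            merged.append(t)   # both sides agree (or neither changed)
        elif t == b:
            merged.append(r)   # only the branch changed this line
        elif r == b:
            merged.append(t)   # only the trunk changed this line
        else:
            conflicts.append(i)  # both changed it differently
            merged.append(t)     # keep the trunk's line as a placeholder
    return merged, conflicts


base = ["a", "b", "c"]
clean, clean_conflicts = merge_lines(base, ["a", "B", "c"], ["a", "b", "C"])
clash, clash_conflicts = merge_lines(base, ["x", "b", "c"], ["y", "b", "c"])
```

In the first call the trunk and branch changed different lines, so the merge combines both edits with no conflicts. In the second call both sides changed the first line differently, so its index is reported as a conflict for a user to resolve by hand.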
Along with the development of mechanisms to allow this sort of concurrent access to the repository over the network, version control systems became more adept in the algorithms they used to detect conflicts and merge changes. This aspect of revision control is discussed further in Merge Algorithms.
Distributed Revision Control
Other Advances
In addition to the evolution of the way version control systems allowed users to access, modify and share data in the repository, many advances have been made in the way changes are merged, tracked and stored.
Merge Algorithms
Tracking Groups of Changes
Repository Data Storage
Further Reading
IBM has an excellent [http://www.ibm.com/developerworks/java/library/j-subversion/index.html developerWorks article] describing the ways in which Subversion improved upon CVS, along with some history of version control up until Subversion.