CSC/ECE 517 Fall 2009/wiki1a 5 rp: Difference between revisions

From Expertiza_Wiki
=History Of Version Control=


The defining characteristic of a version control system is its ability
to track changes to a document, or set of documents, across many
revisions.  For the vast majority of applications, version control
systems have focused on tracking plain text files, such as those used
for programming source code, HTML documents, and other markup
formats.  Over the past 30 years or so, notions of how to implement
such systems have evolved and become more advanced.


==Local Version Control==  


The first version control systems focused on local version control;
that is, version control on a single centralized computer system that
was shared by many users, often at the same time.  Because of this use
case, these systems focused on two main features:


* File checkout and locking
* File revision tracking


===File Checkout and Locking===  


Because many users in a shared system may desire to edit a file
simultaneously, one of the first features developed for version
control systems was the ability to check out and lock a file.  When a
user checks out a file, he or she reserves the right to be the sole
editor of that file until it is checked back in to revision control.
The first revision control systems designed for use in a shared
environment, such as RCS, allowed files to be checked out and locked
in this way.  Other users could check out the file, but only to view
it.  Thus, the files were locked from editing by all but the user
that had checked the file out most recently.
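
The checkout-and-lock behavior described above can be illustrated with a small sketch.  This is not how RCS is implemented; the class and method names (<code>LockingRepository</code>, <code>checkout</code>, <code>checkin</code>) are hypothetical and chosen only for illustration.  It assumes a single in-memory repository where at most one user may hold the lock on a file at a time, while any user may still read it.

```python
class LockError(Exception):
    """Raised when a user tries to lock or check in a file locked by someone else."""


class LockingRepository:
    """Hypothetical in-memory model of checkout-with-locking (not real RCS)."""

    def __init__(self):
        self._contents = {}  # file name -> latest checked-in text
        self._locks = {}     # file name -> user currently holding the lock

    def add(self, name, text):
        """Place a new file under version control."""
        self._contents[name] = text

    def checkout(self, name, user, lock=False):
        """Return the file's text; with lock=True, also reserve it for editing."""
        if lock:
            holder = self._locks.get(name)
            if holder is not None and holder != user:
                raise LockError(f"{name} is locked by {holder}")
            self._locks[name] = user
        return self._contents[name]

    def checkin(self, name, user, text):
        """Store new text and release the lock; only the lock holder may do this."""
        if self._locks.get(name) != user:
            raise LockError(f"{user} does not hold the lock on {name}")
        self._contents[name] = text
        del self._locks[name]
```

In this sketch a second user can still call <code>checkout</code> without <code>lock=True</code> to view the file, but an attempt to lock it fails until the first user checks the file back in.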
 
===File Revision Tracking===
 
The primary feature of these early systems was the ability to check in
files at various points as they were altered, so that the history of
changes made to files under version control was kept permanently.
Thus, many users could alter many files over time, and the entire set
of documents under version control at a given point in time could be
recovered, preventing loss of valuable data as well as providing a
record of which users made changes to files over time.
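
As a rough illustration of revision tracking, the sketch below keeps every checked-in version of one file together with its author.  The names (<code>Revision</code>, <code>FileHistory</code>) are hypothetical, and the sketch stores full copies of each revision for simplicity; real systems such as RCS store differences between revisions instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    number: int   # 1-based revision number
    author: str   # user who checked this revision in
    text: str     # full file contents at this revision


class FileHistory:
    """Hypothetical append-only history of a single file."""

    def __init__(self):
        self._revisions = []

    def checkin(self, author, text):
        """Record a new revision and return its number."""
        rev = Revision(len(self._revisions) + 1, author, text)
        self._revisions.append(rev)
        return rev.number

    def get(self, number):
        """Recover the file contents as of a given revision."""
        if not 1 <= number <= len(self._revisions):
            raise KeyError(f"no such revision: {number}")
        return self._revisions[number - 1].text

    def log(self):
        """List (revision number, author) pairs, oldest first."""
        return [(r.number, r.author) for r in self._revisions]
```

Because revisions are only ever appended, any earlier state can be recovered with <code>get</code>, and <code>log</code> shows which user made each change.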
 
===Networked Revision Control: Client-Server===
 
As users moved away from logging into systems locally to make their
changes to files, the need for a revision control system that
supported remote operations emerged.  The natural way to implement
such remote operations was as an extension of the existing system, and
by far the most prominent manifestation of this philosophy was present
in CVS, the Concurrent Versions System, which was initially based on
RCS.
 
The main feature driving the development of CVS was the need for many
users, each on his or her own machine, to be able to perform all the
operations present in the original RCS, but over a network connection.
This led to the development of a client-server model of revision
control systems, in which one central server would contain the
canonical version of the repository, and various clients could connect
to the central server and perform file checkouts and commits.  This
model is very similar to the original RCS model, but rather than
requiring users of the system to log into the revision control system
locally, it allowed users to access and alter the contents of the
repository over the network.
 
The initial need to simply perform RCS-like operations over a network
led to a host of new problems associated with the concurrent editing
of files under revision control, and the need to develop new features
that addressed the resulting complexities.  Chief among the new
features introduced to handle these complexities were the notions of
branching and merging.
 
===Branching and Merging===
 
Inherent in the notion of concurrent editing is the problem of how to
reconcile conflicting changes to the same file.  The solution to this
problem lay in allowing users of the revision control system to branch
a version of the repository and make (possibly many) changes to that
branch independent of the changes occurring on the main branch of the
repository, known as the trunk.  Once a logical set of changes was
completed on a branch, that branch would then need to have its changes
reconciled with the current state of the repository on the trunk.
This process of reconciliation is known as merging.
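
One common way to reconcile a branch with the trunk is a three-way merge, which compares both sides against their common ancestor.  The sketch below is a deliberately simplified version that works on whole files rather than on lines within a file; it is not the algorithm used by CVS.  The function name <code>three_way_merge</code> and the representation of a repository snapshot as a dictionary from file name to contents are assumptions made for illustration.

```python
def three_way_merge(base, trunk, branch):
    """Merge two snapshots (dicts of file name -> text) against their ancestor.

    Returns (merged, conflicts). A file missing from a snapshot is treated
    as absent (None), which covers added and deleted files.
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(trunk) | set(branch)):
        b, t, r = base.get(name), trunk.get(name), branch.get(name)
        if t == r:
            result = t            # both sides agree (or neither changed)
        elif t == b:
            result = r            # only the branch changed this file
        elif r == b:
            result = t            # only the trunk changed this file
        else:
            conflicts.append(name)  # both changed it differently
            continue
        if result is not None:
            merged[name] = result
    return merged, conflicts
```

Files changed on only one side merge automatically; files changed differently on both sides are reported as conflicts and left out of the merged result, so that a user can reconcile them by hand.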
 
This feature was critical in a multi-user client environment as it
allowed work to progress on multiple fronts simultaneously, only
requiring that the files be merged once the users of the system were
ready to reconcile changes with other users. 
 
This approach presented two new problems, however.
 
First, the merging process was often difficult and time-consuming,
leading many developers to avoid checking in their changes and merging
with other users because the merging process interrupted their normal
flow of work.  Users were reluctant to create branches to relieve some
of this burden because branch operations in CVS were quite slow and
cumbersome.
 
Second, because many users checked in changes less frequently, much of
the current work on the repository was stored at the edges of the
network, that is, on the users' workstations, rather than in the
central server.  Thus, while the central server may have been backed
up regularly and kept secure, a user's workstation could crash and
take hours, days or even weeks of work with it, and revision control
would have no record of any of the work that had been lost.
 
Both of these problems could be remedied by proper use of the revision
control system, but there were no technical measures that could be
taken to prevent such abuses.  This left the market open for a
different approach to revision control.
 
==Distributed Revision Control==

Revision as of 17:28, 6 September 2009
