CSC/ECE 517 Fall 2011/ch1 1f sv: Difference between revisions

From Expertiza_Wiki

Revision as of 17:02, 6 September 2011

Comparing version control systems from the programmer's standpoint

Introduction : Version Control Systems

A version control system (VCS) is software that manages changes to documents, programs, images and other information stored as computer files. Changes are usually identified by an incrementing number or letter code, known as the revision number or simply the revision. The simplest use of versioning is to let developers go back to the previous working version of their files, should the latest changes break something. A change may range from a minor edit, such as correcting a typo, to a large refactoring that spans hundreds of files. Each change usually records the name of the person who introduced it, the time of the change and an optional description message.

We shall look at the different types of version control systems and a few examples of each, discussing their features and limitations from a programmer's standpoint.

Types of Version Control Systems

We shall compare the different VCS based on the features below:

a. The repository model, which states the relationship between the various copies of the source code repository.
b. The concurrency model, which describes how simultaneous edits to the working copy can be made without users stepping on each other's toes.
c. Platform, which describes the operating systems that the software supports.
d. Scope of change, which describes whether changes are recorded on a file basis or a directory basis.
e. Version IDs, which describes whether the version number is based on a hash of the file contents, on sequential numbering of files, or on some other scheme.
f. Atomic commits, which ensure that either all the changes are committed or none are.
g. File renames, which describes whether the software allows files to be renamed while retaining their version history.
h. Merge file renames, which describes whether the software can merge changes made to a file on one branch into the same file that has been renamed on another branch (or vice versa).
i. Symbolic links, which describes whether the software allows version control of symbolic links as with regular files.
j. Merge tracking, which describes whether the software tracks the changes that have been merged between the respective branches and, when merging one branch into another, merges only the changes that are missing.

The version control systems can be classified into three categories:

1. Local Version Control

In the local-only approach, all developers must use the same computer system. Such tools often manage single files individually and have largely been replaced by, or embedded within, newer software.

Examples of this approach are:

Revision Control System (RCS) stores the latest version in full together with backward deltas. Compared to SCCS, this gives the fastest access to the trunk tip and comes with an improved user interface, at the cost of slow access to branch tips and no support for included/excluded deltas.

Source Code Control System (SCCS) is a part of UNIX. It is based on interleaved deltas and can construct versions as arbitrary sets of revisions. Extracting any version takes essentially the same time, which makes SCCS more useful in environments that rely heavily on branching and merging with multiple "current" and identical versions.
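
The backward-delta idea behind RCS can be illustrated with a minimal Python sketch. It is not the actual RCS file format: the class `ReverseDeltaStore` and the helpers `make_delta` and `apply_delta` are hypothetical names, and the deltas are built with the standard `difflib` module. The newest revision is kept in full, so reading the tip needs no delta at all, while older revisions are rebuilt by applying deltas backward.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Build instructions that turn the line list `source` into `target`."""
    delta = []
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))           # reuse lines of source
        else:
            delta.append(("insert", target[j1:j2]))  # lines found only in target
    return delta


def apply_delta(source, delta):
    """Rebuild the target line list from `source` and a delta."""
    result = []
    for op in delta:
        if op[0] == "copy":
            result.extend(source[op[1]:op[2]])
        else:
            result.extend(op[1])
    return result


class ReverseDeltaStore:
    """Hypothetical store: full text for the tip, backward deltas for the rest."""

    def __init__(self):
        self.tip = None      # newest revision, stored in full
        self.backward = []   # backward[i] rebuilds revision i+1 from revision i+2

    def commit(self, lines):
        if self.tip is not None:
            # Remember how to get from the new tip back to the old one.
            self.backward.append(make_delta(lines, self.tip))
        self.tip = list(lines)
        return len(self.backward) + 1   # revision numbers start at 1

    def checkout(self, revision):
        lines = self.tip
        # Walk backward from the tip; the tip itself needs no delta.
        for delta in reversed(self.backward[revision - 1:]):
            lines = apply_delta(lines, delta)
        return list(lines)
```

Checking out the tip costs nothing beyond a copy, whereas the cost of an old revision grows with its distance from the tip, which is the trade-off described above.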

2. Client-Server Model (Centralized Version Control)

In the client-server model, developers use a single shared repository.

Open Source

Concurrent Versions System

Concurrent Versions System (CVS) was originally built on RCS and is licensed under the GPL. It uses a client-server architecture: the server stores the current versions of a project and its history, and clients connect to the server to "check out" a complete copy of the project, work on this copy and later "check in" their changes. The client and server can connect over a LAN or over the Internet, or both may run on the same machine if CVS only has to track the version history of a project with local developers.

Features of CVS
a. Several developers can work concurrently on the same project, each one "checking out" files of the project into their "working copy" and "checking in" their changes to the server.
b. Users are prevented from overwriting each other's work because the server accepts a "check in" only against the most recent version of a file.
c. Developers are therefore expected to keep their working copy up to date by incorporating other people's changes on a regular basis.
d. A successful check-in automatically increments the version numbers of all files involved, and the CVS server writes a user-supplied description line, the date and the author's name to its log files.
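
The check-in rule described in this list can be sketched in a few lines of Python. This is only an illustration of the concurrency model, not CVS itself: `CentralServer` and `OutOfDateError` are hypothetical names, revisions are plain integers rather than CVS's dotted numbers such as 1.2, and the log entry omits the date for brevity.

```python
class OutOfDateError(Exception):
    """Raised when a check-in is based on a stale revision."""


class CentralServer:
    """Hypothetical central repository with per-file revision histories."""

    def __init__(self):
        self.files = {}   # filename -> list of (author, message, content)

    def checkout(self, filename):
        """Return (revision number, content) of the newest revision."""
        history = self.files[filename]
        return len(history), history[-1][2]

    def checkin(self, filename, base_revision, content, author, message):
        """Accept the change only if it was made against the newest revision."""
        history = self.files.setdefault(filename, [])
        if base_revision != len(history):
            raise OutOfDateError(f"{filename}: update your working copy first")
        history.append((author, message, content))
        return len(history)   # the new revision number of this file
```

If two developers check out revision 1 of the same file and the first one checks in, the second check-in raises `OutOfDateError` until that developer has updated to revision 2.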

Clients can also compare versions, request a complete history of changes, or check out a historical snapshot of the project as of a given date or as of a revision number.

CVS calls a single project (a set of related files) that it manages a module. A CVS server stores the modules it manages in its repository. Programmers acquire copies of modules by checking them out. The checked-out files serve as a working copy, sandbox or workspace. Changes to the working copy are reflected in the repository by committing them. To update is to acquire the changes in the repository and merge them into the working copy.

Limitations of CVS
a. Revisions created by a commit are per file, rather than spanning the collection of files that make up the project or spanning the entire repository. 
b. CVS does not version the moving or renaming of files and directories. 
c. Symbolic links are not versioned.
d. Support for Unicode and non-ASCII filenames is limited.
e. Commits are not atomic.
f. Branch operations are expensive.

Subversion

Subversion (SVN) is an open-source version control system, released under the Apache License and inspired by CVS. It is available on the major operating systems.

Repository types and Branching

SVN offers two types of repository storage:

a. FSFS
b. Berkeley DB

SVN employs the inter-file branching model to handle branches.

a. A trunk is a location where the main development occurs. 
b. A branch is a separate line of development, used to isolate changes.
c. Tags refer to a snapshot of the content.

A new branch is created with the 'svn copy' command. The original and the copy are linked together internally, and history is preserved for both. The copy takes up only a small amount of space, because only the differences from the original are saved in the repository. Every version in a branch carries the history of the file up to the point of the copy, plus any changes made since. Changes can be "merged" back into the trunk or between branches.
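
Why such a copy takes so little space can be illustrated with a minimal Python sketch. It is an assumption-laden model, not Subversion's actual storage layer: directory trees are plain nested dictionaries, and `lookup`, `set_path` and `cheap_copy` are hypothetical helper names. The branch initially points at the very same node as the trunk, and a later change in the branch copies only the directories along the changed path.

```python
def lookup(node, path):
    """Follow `path` (a list of names) down from `node`."""
    for name in path:
        node = node[name]
    return node


def set_path(node, path, value):
    """Return a new tree with `path` set to `value`; untouched subtrees are shared."""
    if not path:
        return value
    new_node = dict(node)  # shallow copy: sibling entries are shared, not duplicated
    new_node[path[0]] = set_path(node.get(path[0], {}), path[1:], value)
    return new_node


def cheap_copy(root, src, dst):
    """Make `dst` refer to the same node as `src` without copying any contents."""
    return set_path(root, dst, lookup(root, src))


# Revision 1: only a trunk. Revision 2: branch created. Revision 3: branch edited.
rev1 = {"trunk": {"main.c": "int main() {}", "util.c": "/* helpers */"}}
rev2 = cheap_copy(rev1, ["trunk"], ["branches", "feature"])
rev3 = set_path(rev2, ["branches", "feature", "main.c"], "int main() { return 1; }")
```

In `rev2` the branch and the trunk are one and the same object, so creating the branch costs the same regardless of how many files the trunk holds. In `rev3` the trunk still has its original `main.c`, and every earlier revision remains intact.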

Features of SVN

SVN provides developers with the following advantages when compared to legacy CVS:

a. All commits are atomic operations.
b. Full revision history is maintained for files renamed/copied/moved/removed.
c. Versioning is maintained for directories, renames, and file metadata which enables developers to move and/or copy entire directory-trees while retaining the entire revision history.
d. Versioning of symbolic links.
e. Branching is a cheap operation, independent of file size.
f. Files that cannot be merged can be locked by a developer, which is known as a "reserved checkout".
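
The meaning of an atomic commit can be sketched as follows. This is a simplified model rather than SVN's implementation: `AtomicRepository` and `CommitError` are hypothetical names, the repository is an in-memory dictionary, and a value of `None` stands for deleting a path. All changes are applied to a private copy first, and the copy replaces the repository contents only if every change succeeded.

```python
class CommitError(Exception):
    """Raised when any single change in a commit cannot be applied."""


class AtomicRepository:
    """Hypothetical repository with one revision number for the whole tree."""

    def __init__(self):
        self.revision = 0
        self.files = {}   # path -> content at the latest revision

    def commit(self, changes):
        """Apply all of `changes` (path -> content, or None to delete) or none."""
        candidate = dict(self.files)   # work on a copy, never on the live tree
        for path, content in changes.items():
            if content is None:
                if path not in candidate:
                    raise CommitError(f"cannot delete missing file: {path}")
                del candidate[path]
            else:
                candidate[path] = content
        # Reached only if every change applied cleanly: publish in one step.
        self.files = candidate
        self.revision += 1
        return self.revision
```

A commit that contains one valid edit and one invalid deletion leaves both the files and the revision number exactly as they were.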

Limitations of SVN
a. SVN does not have repository administration and management features.
b. SVN does not store the modification timestamps of files.
c. SVN stores additional copies of data on the local machine, which can cause space issues for large projects.
d. SVN does not support merging of renamed files.

3. Distributed Version Control

In this model of version control, clients do not check out individual files, but mirror the entire repository. Each developer works directly on their own local repository, and the changes are shared among the repositories in a separate step.

GIT

GIT is a distributed version control system with an emphasis on speed. Every working directory is a full-fledged repository with complete history and full revision tracking capabilities, not dependent on network access or a central server. It has a simple design and offers strong support for non-linear development. It is fully distributed and can handle large projects like the Linux kernel efficiently.

GIT differs from the other types of version control systems in the way it thinks about its data. It treats its data as a set of snapshots of a mini file system rather than as a list of file-based changes.

The Three States

GIT has three states in which a file can reside: Committed, Modified and Staged.

a. Committed: Data is stored safely in the database.
b. Modified: The file has been changed but not yet committed to the database.
c. Staged: A modified file has been marked in its current version to go into the next commit snapshot.

Workflow of GIT
a. A developer modifies files in the working directory.
b. The files are staged: snapshots of them are added to the staging area.
c. The changes are committed: the snapshots in the staging area are stored permanently in the GIT directory.
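
The three states and the workflow can be modelled with a minimal Python sketch. It is a toy model, not how GIT stores data: `TinyRepo` and its methods are hypothetical names, the three areas are plain dictionaries, and the sketch ignores untracked files, so a brand-new file is simply reported as modified.

```python
class TinyRepo:
    """Hypothetical model of working directory, staging area and history."""

    def __init__(self):
        self.working = {}    # working directory: path -> content
        self.index = {}      # staging area: snapshot for the next commit
        self.commits = []    # committed snapshots, oldest first

    def head(self):
        """Return the newest committed snapshot, or an empty one."""
        return self.commits[-1] if self.commits else {}

    def edit(self, path, content):
        """Step a: change a file in the working directory."""
        self.working[path] = content

    def add(self, path):
        """Step b: record the file's current content in the staging area."""
        self.index[path] = self.working[path]

    def commit(self):
        """Step c: store the staged snapshot permanently."""
        self.commits.append(dict(self.index))
        return len(self.commits)

    def state(self, path):
        """Classify a file as modified, staged or committed."""
        if self.working.get(path) != self.index.get(path):
            return "modified"
        if self.index.get(path) != self.head().get(path):
            return "staged"
        return "committed"
```

A file moves from modified to staged after `add` and from staged to committed after `commit`; editing it again puts it back into the modified state.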

Features of GIT
a. All commits are atomic operations.
b. GIT provides partial support for file renames. It does not explicitly track file renames as it does not track individual files.
c. GIT provides support for merge file renames.
d. Versioning of symbolic links is enabled.
e. Versions are identified by a SHA-1 hash, a 40-character string of hexadecimal characters calculated from the contents of a file or directory structure.
f. GIT knows everything by hash and not by filename.
g. Every time a commit is performed, GIT takes a snapshot of the files.
h. Files that have not been modified are not stored again; the new snapshot refers to the copy already stored.
i. Operations such as browsing history and committing are local.
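
How a file's contents turn into a 40-character identifier can be shown with a short Python sketch. The function name `blob_id` is our own; the scheme it follows, hashing the header "blob", the content length and a NUL byte followed by the content, is the one GIT uses for file contents, and the result should match the output of `git hash-object` for the same bytes.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 identifier GIT assigns to a file with this content."""
    header = b"blob %d\0" % len(content)   # object type, size, NUL separator
    return hashlib.sha1(header + content).hexdigest()
```

Because the identifier depends only on the content, two files with identical bytes get the same identifier no matter what they are called, which is why an unchanged file does not need to be stored a second time. For example, `blob_id(b"")` gives `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, the well-known identifier of the empty file.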