CSC/ECE 517 Fall 2011/ch1 1f sv

From Expertiza_Wiki
Revision as of 19:47, 5 September 2011 by Svellan (talk | contribs) (→‎Open Source)

Comparing version control systems from the programmer's standpoint

Introduction: Version Control Systems

A version control system (VCS) is software that manages changes to documents, programs, images and other information stored as computer files. Changes are usually identified by an incrementing number or letter code known as the revision number, or simply the revision. The simplest use of versioning is to go back to the previous working version of your files if the latest changes break something. A change can range from fixing a typo in a text file to a large refactoring that spans hundreds of files in a software project. Each change usually records the name of the person who introduced it, the time of the change and an optional description message.

Types of Version Control Systems

The version control systems can be classified into three categories:

1. Local Version Control

In the local-only approach, all developers must use the same computer system. These tools often manage single files individually and have largely been replaced by, or embedded within, newer software.

Examples of this approach are:

Revision Control System (RCS) stores the latest version in full together with backward deltas. Compared to SCCS, this gives the fastest access to the trunk tip and an improved user interface, at the cost of slow access to branch tips and no support for included/excluded deltas.

Source Code Control System (SCCS) is part of UNIX and is based on interleaved deltas, so it can construct versions as arbitrary sets of revisions. Extracting any version takes essentially the same time, which makes SCCS more useful in environments that rely heavily on branching and merging with multiple "current" and identical versions.
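
The two storage layouts can be contrasted with a small sketch. It is a simplified model, not the actual RCS or SCCS file format: the class names ReverseDeltaStore and InterleavedDeltaStore, the helper functions and the use of Python's difflib to compute differences are all assumptions made for illustration. Revisions are numbered 1, 2, 3, ... in commit order and branches are left out.

```python
import difflib


def make_delta(src, dst):
    """Edit operations that turn the list of lines `src` into `dst`."""
    matcher = difflib.SequenceMatcher(a=src, b=dst, autojunk=False)
    return [(i1, i2, dst[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_delta(src, delta):
    """Replay the operations produced by make_delta on `src`."""
    out, pos = [], 0
    for i1, i2, replacement in delta:
        out.extend(src[pos:i1])   # lines untouched before this edit
        out.extend(replacement)   # lines that replace src[i1:i2]
        pos = i2
    out.extend(src[pos:])
    return out


class ReverseDeltaStore:
    """RCS-like layout (hypothetical): newest revision in full, older ones as backward deltas."""

    def __init__(self):
        self.tip = None
        self.backward = []  # backward[k] turns revision k+2 into revision k+1

    def commit(self, lines):
        lines = list(lines)
        if self.tip is not None:
            self.backward.append(make_delta(lines, self.tip))
        self.tip = lines

    def checkout(self, rev):
        text = self.tip  # the tip itself needs no delta at all
        for delta in reversed(self.backward[rev - 1:]):
            text = apply_delta(text, delta)  # one step back per delta
        return list(text)


class InterleavedDeltaStore:
    """SCCS-like layout (hypothetical): one weave of lines tagged with insert/delete revisions."""

    def __init__(self):
        self.weave = []  # entries are [text, inserted_in, deleted_in or None]
        self.head = 0

    def commit(self, lines):
        lines = list(lines)
        self.head += 1
        live = [k for k, entry in enumerate(self.weave) if entry[2] is None]
        old = [self.weave[k][0] for k in live]
        ops = difflib.SequenceMatcher(a=old, b=lines, autojunk=False).get_opcodes()
        for tag, i1, i2, j1, j2 in reversed(ops):  # back to front keeps indices valid
            if tag == "equal":
                continue
            for k in live[i1:i2]:
                self.weave[k][2] = self.head  # mark as deleted, keep the text
            at = live[i1] if i1 < len(live) else len(self.weave)
            self.weave[at:at] = [[line, self.head, None] for line in lines[j1:j2]]

    def checkout(self, rev):
        # A single pass over the weave, whichever revision is requested.
        return [text for text, ins, dele in self.weave
                if ins <= rev and (dele is None or dele > rev)]
```

In this model, checking out the newest revision from the reverse-delta store applies no deltas, while older revisions cost one delta application per step back. The interleaved store always scans the whole weave once, which mirrors the "essentially the same time" property described above.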

2. Client-Server Model

In the client-server model, developers use a shared single repository.

Open Source

Concurrent Versions System (CVS) was originally built on RCS and licensed under the GPL.

CVS uses a client-server architecture: the server stores the current version(s) of a project and its history, and clients connect to it to "check out" a complete copy of the project, work on this copy and later "check in" their changes. The client and server can connect over a LAN or over the Internet, or both may run on the same machine if CVS only has to keep track of the version history of a project with local developers.

Several developers can work on the same project concurrently, each one "checking out" files of the project into their own "working copy" and "checking in" their changes to the server. To keep developers from stepping on each other's toes, the server only accepts a check-in made against the most recent version of a file. Developers are therefore expected to keep their working copy up to date by incorporating other people's changes on a regular basis. The CVS client handles this task mostly automatically and requires manual intervention only when an edit conflict arises between a checked-in modification and the not-yet-checked-in local version of a file. A successful check-in automatically increments the version numbers of all files involved, and the CVS server writes a user-supplied description line, the date and the author's name to its log files.
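
The up-to-date rule can be pictured with a minimal sketch. It is not CVS's implementation: the names VersionedFile, LogEntry and OutOfDateError are hypothetical, and revisions are plain integers rather than CVS's dotted numbers such as 1.2.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class OutOfDateError(Exception):
    """Raised when a check-in is based on an older revision than the newest one."""


@dataclass
class LogEntry:
    revision: int
    author: str
    date: datetime
    message: str


@dataclass
class VersionedFile:
    content: str = ""
    revision: int = 0
    log: list = field(default_factory=list)

    def check_out(self):
        # The client remembers which revision its working copy is based on.
        return self.revision, self.content

    def check_in(self, base_revision, new_content, author, message):
        # Only a working copy based on the newest revision may be checked in.
        if base_revision != self.revision:
            raise OutOfDateError(
                f"working copy is at revision {base_revision}, "
                f"server is at revision {self.revision}; update first"
            )
        self.revision += 1
        self.content = new_content
        self.log.append(
            LogEntry(self.revision, author, datetime.now(timezone.utc), message)
        )
        return self.revision
```

If two developers check out revision 0 and the first one checks in, the second developer's check_in call raises OutOfDateError until they check out again and merge, which is the manual step the paragraph above refers to.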

Clients can also compare versions, request a complete history of changes, or check out a historical snapshot of the project as of a given date or as of a revision number.
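
A checkout "as of a given date" can be sketched as a lookup in each file's own history. The function name snapshot_as_of and the dictionary layout are assumptions for illustration, not part of CVS; each file is assumed to keep a list of (commit time, content) pairs sorted by time.

```python
from bisect import bisect_right
from datetime import datetime


def snapshot_as_of(histories, when):
    """Return {path: content} as the project looked at time `when`.

    `histories` maps each path to a time-sorted list of
    (commit_time, content) pairs, one pair per revision of that file.
    """
    snapshot = {}
    for path, revisions in histories.items():
        times = [commit_time for commit_time, _ in revisions]
        index = bisect_right(times, when)  # number of revisions made up to `when`
        if index:  # the file already existed at that time
            snapshot[path] = revisions[index - 1][1]
    return snapshot
```

Files whose first revision is later than the requested date are simply absent from the result.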

CVS labels a single project (set of related files) which it manages as a module. A CVS server stores the modules it manages in its repository. Programmers acquire copies of modules by checking out. The checked-out files serve as a working copy, sandbox or workspace. Changes to the working copy will be reflected in the repository by committing them. To update is to acquire or merge the changes in the repository with the working copy.

Some of the drawbacks of CVS are:

Revisions created by a commit are per file, rather than spanning the collection of files that make up the project or spanning the entire repository.

CVS does not version the moving or renaming of files and directories.

No versioning of symbolic links. Symbolic links stored in a version control system can pose a security risk - someone can create a symbolic link index.htm to /etc/passwd and then store it in the repository; when the "code" is exported to a Web server the Web site now has a copy of the system security file available for public inspection.

Limited support for Unicode and non-ASCII filenames.

No atomic commit. The network and server used should have sufficient resilience that a commit can complete without ever crashing. In many code management processes, development work is performed on branches, and then merged into the trunk after code review - that final merge is 'atomic' and performed in the data center by QA.

Expensive branch operations. CVS assumes that the majority of work will take place on the trunk — branches should generally be short-lived or historical. When used as designed, branches are easily managed and branch operations are efficient and fast.

CVS treats files as textual by default. Text files should be the primary file type stored in the CVS repository. Binary files are supported and files with a particular file extension can automatically be recognized as being binary.

No support for distributed revision control or unpublished changes. Programmers should commit changes to the files often for frequent merging and rapid publication to all users.

Subversion (svn) is an open-source version control system released under the Apache License. It was inspired by CVS and is available on the major operating systems.

SVN provides developers with the following advantages when compared to legacy CVS:

a. All commits are atomic operations.
b. Full revision history is maintained for files renamed/copied/moved/removed.
c. Versioning is maintained for directories, renames, and file metadata which enables developers to move and/or copy entire directory-trees while retaining the entire revision history.
d. Versioning of symbolic links.
e. Branching as a cheap operation, independent of file size. 
f. Files that cannot be merged can be locked by developers, which is known as "reserved checkouts".
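
Advantage (a) can be illustrated by comparing a per-file commit loop with an all-or-nothing commit. This is a sketch under stated assumptions, not Subversion's transaction code: Conflict, commit_per_file and AtomicRepository are hypothetical names, and a change is modelled as a (base content, new content) pair that conflicts when the stored content differs from the base.

```python
class Conflict(Exception):
    """Raised when a change is based on content that is no longer current."""


def commit_per_file(files, changes):
    """Per-file commit: every file is checked and written on its own."""
    for path, (base, new) in changes.items():
        if files.get(path) != base:
            raise Conflict(path)  # files written earlier in this loop stay committed
        files[path] = new


class AtomicRepository:
    """All-or-nothing commit with one revision number per change set."""

    def __init__(self, files=None):
        self.files = dict(files or {})
        self.revision = 0

    def commit(self, changes):
        staged = dict(self.files)  # work on a private copy first
        for path, (base, new) in changes.items():
            if staged.get(path) != base:
                raise Conflict(path)  # self.files has not been touched
            staged[path] = new
        self.files = staged  # publish every change in one step
        self.revision += 1   # the whole change set gets a single revision
        return self.revision
```

With a change set in which the second file conflicts, commit_per_file leaves the first file modified, whereas AtomicRepository.commit leaves both the files and the revision number exactly as they were.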


Repository types

Subversion offers two types of repository storage: FSFS and Berkeley DB.

FSFS

FSFS works faster on directories with a large number of files and takes less disk space, due to less logging.[5] Beginning with Subversion 1.2, FSFS is the default data store for new repositories.

Berkeley DB

Subversion has some limitations with Berkeley DB usage when a program that accesses the database crashes or terminates forcibly. No data loss or corruption occurs, but the repository is offline while Berkeley DB replays the journal and cleans up any outstanding locks. According to Version Control with Subversion, the only safe way to use a Berkeley DB repository is on a dedicated server, accessed by a single server process running as one user.[6] Existing tools for Berkeley DB repository recovery are not completely reliable, so system administrators need to make frequent repository backups.

Repository access

Access to Subversion repositories can take place by the following means:

a. Local filesystem or network filesystem,[7] accessed by the client directly. This mode uses the file:///path access scheme.
b. WebDAV/Delta-V (over http or https) using the mod_dav_svn module for Apache 2. This mode uses the http://host/path access scheme, or https://host/path for secure connections using SSL.
c. Custom "svn" protocol (default port 3690), run either as plain text over TCP/IP or through an SSH tunnel. This mode uses either the svn://host/path access scheme for unencrypted transport or the svn+ssh://host/path scheme for tunneling over SSH.

All three means can access both FSFS and Berkeley DB repositories.

Any 1.x version of a client can work with any 1.x server. Newer clients and servers have additional features and performance capabilities, but have fallback support for older clients/servers.[8]

Layers

Internally, a Subversion system comprises several libraries arranged as layers. Each performs a specific task and allows developers to create their own tools at the desired level of complexity and specificity.

Fs: The lowest level; it implements the versioned filesystem which stores the user data.

Repos


Properties on filesystem entries are versioned just like other changes to the filesystem. Users can add any property they wish, and the Subversion client uses a set of properties, which it prefixes with 'svn:'.

svn:executable: Makes files on Unix-hosted working copies executable.

svn:mime-type: Stores the Internet media type ("MIME type") of a file. Affects the handling of diffs and merging.

svn:ignore: A list of filename patterns to ignore in a directory. Similar to CVS's .cvsignore file.

svn:keywords: A list of keywords to substitute into a file when changes are made. The file itself must also reference the keywords as $keyword$ or $keyword:...$. This is used to maintain certain information (e.g., author, date of last change, revision number) in a file without human intervention. The keyword substitution mechanism originates from RCS[10] and from CVS.

svn:eol-style: Makes the client convert end-of-line characters in text files. Used when the working copy is needed with a specific EOL style. "native" is commonly used, so that EOLs match the EOL style of the user's operating system. Repositories may require this property on all files to prevent inconsistent line endings, which can cause problems of their own.

svn:externals: Allows parts of other repositories to be automatically checked out into a sub-directory.

svn:needs-lock: Specifies that a file is to be checked out with file permissions set to read-only. This is designed for use with the locking mechanism. The read-only permission reminds one to obtain a lock before modifying the file: obtaining a lock makes the file writable, and releasing the lock makes it read-only again. Locks are only enforced during a commit operation. Locks can be used without setting this property, but that is not recommended, because it introduces the risk of someone modifying a locked file and only discovering that it is locked when their commit fails.

svn:special: This property is not meant to be set or modified directly by users. As of 2010 it is only used for storing symbolic links in the repository. When a symbolic link is added to the repository, a file containing the link target is created with this property set. When a Unix-like system checks out this file, the client converts it to a symbolic link.

svn:mergeinfo: Used to track merge data (revision numbers) in Subversion 1.5 (or later). This property is automatically maintained by the merge command, and it is not recommended to change its value manually.[11]

Subversion also uses properties on revisions themselves. Like the above properties on filesystem entries, the names are completely arbitrary, with the Subversion client using certain properties prefixed with 'svn:'. However, these properties are not versioned and can be changed later.

svn:date: The date and time stamp of a revision.

svn:author: The name of the user that submitted the change(s).

svn:log: The user-supplied description of the change(s).

Branching and tagging

SVN employs the inter-file branching model to handle branches.

a. A trunk is a location where the main development occurs. 
b. A branch is a separate line of development, used to isolate changes.
c. Tags refer to a snapshot of the content.

A new branch is set up with the 'svn copy' command, which should be used in place of the native operating system's copy mechanism. Subversion does not create an entire new file version in the repository for the copy. Instead, the old and new versions are linked together internally and the history is preserved for both. Because Subversion saves only the differences from the original versions, the copied versions take up very little extra room in the repository. All the versions in each branch maintain the history of the file up to the point of the copy, plus any changes made since. One can "merge" changes back into the trunk or between branches.
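
Why a copy can be cheap regardless of file size can be sketched with a tree whose nodes are shared rather than duplicated. This is a simplified model and not Subversion's storage format: Node, lookup, with_entry and cheap_copy are hypothetical names, and nodes are treated as immutable once created.

```python
class Node:
    """Tree node: a file carries `data`, a directory carries `children`."""

    def __init__(self, data=None, children=None):
        self.data = data
        self.children = dict(children or {})


def lookup(root, path):
    """Follow a slash-separated path from `root` and return the node found."""
    node = root
    for name in path.strip("/").split("/"):
        node = node.children[name]
    return node


def with_entry(root, path, new_node):
    """Return a new root in which `path` refers to `new_node`.

    Only the directories along `path` are recreated; everything else
    is shared with the old tree, which stays unchanged.
    """
    def rebuild(node, parts):
        name, rest = parts[0], parts[1:]
        children = dict(node.children)
        if rest:
            children[name] = rebuild(node.children.get(name, Node()), rest)
        else:
            children[name] = new_node
        return Node(children=children)

    return rebuild(root, path.strip("/").split("/"))


def cheap_copy(root, src, dst):
    """Branch or tag: make `dst` point at the very same node as `src`."""
    return with_entry(root, dst, lookup(root, src))
```

Creating branches/feature from trunk with cheap_copy adds a single directory entry. A later with_entry call on the branch recreates only the directories on the changed path, so the trunk and every unmodified file remain shared between both lines of development.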