CSC/ECE 517 Fall 2014/ch1a 15 gs

From Expertiza_Wiki
Revision as of 20:52, 14 September 2014 by Jmgoldsw (talk | contribs)

Topic: Git is a widely used open source distributed version control system for managing small as well as large projects. It outclasses SCM tools like Subversion, CVS, Perforce, and ClearCase with features like cheap local branching, a convenient staging area, and support for multiple distributed workflows, and it is fast.

https://docs.google.com/a/ncsu.edu/document/d/1eq3XHiUUBIrEBx-R2Mgolaw7HttSgzPniwDLEF5oVk0/edit




Git Workflows

Centralized Workflow - The Centralized Workflow is similar to working with a central Subversion repository (repo). Developers commit all changes to a single shared branch, ‘master’ in the case of Git. Git, however, provides features beyond the functionality found in Subversion. Each developer works in a local environment isolated from everyone else developing features in the same system. Without that isolation, picking up a new feature can unexpectedly delay a developer’s work because the feature in some way changes the public interface of code the developer is using. Often these changes have no direct link to what the developer is working on, but because they were introduced into his/her environment they must be addressed in the code immediately. If the developer is working on a hotfix for a production issue, this could cause undue delay in deploying the hotfix. With Git, however, all commits occur in the developer’s local repository, buffered from any changes being committed by others. Once the code they have been working on is stable, the developer can then consider merging in changes made by other developers. The point is that the developer has better control over when to pull in changes made by someone else.

With that said, the Centralized Workflow treats the upstream repository as sacred ground. The central repository should live on a server that all developers assigned to the project can access, and it should be created as a bare repository, that is, one with no working directory associated with it. By convention, bare repos are named with a .git extension. In short, the central repository changes only when users push their commits into it.
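The setup described above can be sketched in a throwaway directory. This is only an illustration under stated assumptions: the temporary directory stands in for the server, and the names project.git and alice, the file contents, and the commit identity are all placeholders that do not come from the original text.

```shell
set -e

# Placeholder identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Alice Example" GIT_AUTHOR_EMAIL="alice@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)          # throwaway directory standing in for the server
cd "$sandbox"

# The central repository is bare: history only, no working directory.
git init -q --bare project.git
# Pin the default branch to master regardless of local Git configuration.
git --git-dir=project.git symbolic-ref HEAD refs/heads/master

# A developer clones the (still empty) central repo; Git warns that it is empty.
git clone -q project.git alice
cd alice
git symbolic-ref HEAD refs/heads/master

# Commits happen locally; the central repo changes only on push.
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q origin master

# The bare repo now holds the commit but has no checked-out files.
git --git-dir="$sandbox/project.git" log --oneline master
```

On a real server the bare repository would be reached over SSH or HTTPS rather than through a local path, but the commands are otherwise the same.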

An additional constraint placed on the central repository is that if a developer’s local commits have diverged from the central repo, the developer cannot simply push his/her changes; by default the central repo rejects any push that is not a fast forward. One way to reconcile two histories is a merge, which is simply a 3 way comparison of the two commits and their common ancestor. In this workflow, however, developers whose local repo has diverged from the central repo must do what is called a rebase before they are allowed to push. A rebase replays the developer’s local commits, one at a time, on top of the latest commits from the central repo. If a conflict occurs, the developer must resolve it and continue the rebase. Once the rebase is complete, the local repo can be pushed back to the central repo, and this push results in a fast forward merge. The resulting code is the same as after a 3 way merge using the merge command without rebasing, but the advantage of requiring the rebase is that the fast forward merge produces a linear commit history that is much easier to read and manage over time. The commits in the central repository appear linear even though their development may have happened in parallel; they are serialized in the order in which developers rebased and pushed.
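A minimal sketch of the rebase-then-push cycle, assuming two hypothetical developers (alice and bob) and a local bare repository standing in for the central server. Here the rebase replays Bob’s local commit on top of the commit Alice already pushed; all names, files, and the commit identity are placeholders.

```shell
set -e

# Placeholder identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare project.git
git --git-dir=project.git symbolic-ref HEAD refs/heads/master

# Alice seeds the central repo with a base commit.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git remote add origin "$sandbox/project.git"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"
git push -q origin master

# Bob clones, then both developers commit independently.
cd "$sandbox"
git clone -q project.git bob

cd "$sandbox/alice"
echo "a" > alice.txt
git add alice.txt
git commit -q -m "Alice's change"
git push -q origin master

cd "$sandbox/bob"
echo "b" > bob.txt
git add bob.txt
git commit -q -m "Bob's change"

# Bob's history has diverged from the central repo, so a plain push is rejected.
if git push -q origin master 2>/dev/null; then
    echo "unexpected: push succeeded"
else
    echo "push rejected: central master has commits Bob does not have"
fi

# Fetch the central commits and replay Bob's commit on top of them.
git pull -q --rebase origin master
# The push is now a fast forward.
git push -q origin master

# Linear history with no merge commits.
git log --oneline master
```

If the rebase had stopped on a conflict, Bob would edit the conflicting files, stage them with git add, and run git rebase --continue before pushing.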

Feature Branch Workflow - The Feature Branch Workflow builds upon the Centralized Workflow. In this workflow, however, developers create a new branch each time they start a new feature or hotfix. There is also a role reversal in how commits end up back in the central master branch: when developers are ready to release their code to the master branch, they make a pull request rather than pushing it directly.

There are a couple of advantages to this approach. Not every feature involves just one developer. In the Centralized Workflow there is only one branch that everyone uses. Implementing that workflow with Git has the advantage that each developer is isolated from the others’ work, which might otherwise cause heartburn due to incompatible changes. The downside is that same isolation: two developers cannot “share” code without making that code visible to everyone. This poses a significant problem if the code the developers are working on is some crazy idea that may well end up being sent to the bit bucket.

The Feature Branch Workflow allows such code to be shared between multiple developers without touching the master branch. Developers push their feature branch commits back to the server in order to share them with other developers. This allows multiple developers to work on a feature while preserving the pristine state of the master branch.
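As a rough illustration of sharing a feature branch through the server, the sketch below has one hypothetical developer publish a branch and a second one pick it up. The branch name feature/search, the developer names, the files, and the local bare repository standing in for the server are all assumptions made for the example.

```shell
set -e

# Placeholder identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare project.git
git --git-dir=project.git symbolic-ref HEAD refs/heads/master

# Alice seeds master on the server.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git remote add origin "$sandbox/project.git"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"
git push -q origin master

# Alice starts a feature branch and publishes it without touching master.
git checkout -q -b feature/search
echo "draft" > search.txt
git add search.txt
git commit -q -m "Start search feature"
git push -q -u origin feature/search

# Bob clones, checks out the same branch, and adds to it.
cd "$sandbox"
git clone -q project.git bob
cd bob
git checkout -q -b feature/search origin/feature/search
echo "more" >> search.txt
git commit -q -am "Extend search feature"
git push -q origin feature/search

# master on the server is untouched; the feature branch carries the shared work.
git --git-dir="$sandbox/project.git" log --oneline master
git --git-dir="$sandbox/project.git" log --oneline feature/search
```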

When a feature reaches the stage where the developer is ready to have it merged back into the master branch, the developer issues a pull request. A pull request identifies the commit at which the changes start, which would likely be the common ancestor of the two branches. It also identifies the developer’s repo (the source repo) and the end commit, which if omitted defaults to the end of the current commit history. A pull request is really a request for a discussion about the feature in the branch. Developers can review the changes included in the branch. The changes would then either be accepted as is, require modification before being accepted, or be denied.

This type of workflow works very well in a larger environment where a separate change management group controls what code makes it into the master branch. It also is very advantageous when features require more than one developer to implement. Also by pushing feature commits back to the server a backup of commits is effectively made in case a system failure takes out a developer’s system.

Gitflow Workflow - Gitflow is a workflow proposed by Vincent Driessen. It uses no Git features beyond those of the Feature Branch Workflow. It does, however, prescribe a strict branching model that follows the project release cycle. This model works well in a large project because of the predictable and repeatable way it handles releases while further development of the system continues. Even though the workflow relies on no new Git features, there are a number of open source projects, such as nvie/gitflow, provided by Vincent Driessen on github.com, and the Git Flow Integration plugin for RubyMine (http://plugins.jetbrains.com/plugin/7315), that provide helper scripts and menus to guide developers in following the structure the workflow prescribes.

In the Gitflow Workflow there are a number of persistent and intermittent branches. Specifically there are two persistent branches, master and develop (or development). When you initialize a Gitflow Workflow you will always have these two branches. Intermittently there will exist a release branch and a hotfix branch as well as many feature branches. The feature branches will be named something that indicates what feature will be developed.

The process in a nutshell is as follows. The master branch is initialized. For a new project this branch will be tagged as a beta version or maybe even an alpha version; when migrating from another version control system, it may hold the last stable release of the product. The development branch is branched from master, and all developers then branch their feature branches from the development branch. No real development occurs on the development branch itself. Like the master branch, it should be kept as pristine as possible at all times; it is simply a staging branch for holding completed features. All developer implementation occurs on feature branches. When a developer completes a feature, they make a pull request just as in the Feature Branch Workflow.

Over time features build up in the development branch, and at some point the decision is made to release a version of the software. At this point a release branch is created from the development branch and used for final testing. No new features are added to it. Only fixes to existing features are allowed, in a process often called release hardening. As issues are discovered they are fixed in the release branch, and once all tests pass the code is deemed a release. Versions from the release branch may be major or minor releases.

The release branch is then merged into the master branch to be built and released to production, and it is also merged back into the development branch so that the fixes made during hardening are visible to future development. Once both merges have occurred, the release branch is deleted.
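The release cycle above can be sketched with plain Git commands in a single local repository. This is an approximation rather than the gitflow helper scripts: a direct git merge --no-ff stands in for the pull request, and the branch names develop, feature/login, and release/1.0, the tags, and the files are hypothetical.

```shell
set -e

# Placeholder identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Initialize master and tag the starting point.
echo "v0" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git tag v0.1-beta

# The second persistent branch.
git checkout -q -b develop

# Feature work happens on its own branch, then lands on develop.
git checkout -q -b feature/login develop
echo "login" > login.txt
git add login.txt
git commit -q -m "Add login feature"
git checkout -q develop
git merge -q --no-ff -m "Merge feature/login into develop" feature/login
git branch -q -d feature/login

# Release hardening: fixes only, no new features.
git checkout -q -b release/1.0 develop
echo "login (hardened)" > login.txt
git commit -q -am "Fix login bug found in release testing"

# Finish the release: merge into master, tag, merge back into develop, delete.
git checkout -q master
git merge -q --no-ff -m "Release 1.0" release/1.0
git tag v1.0
git checkout -q develop
git merge -q --no-ff -m "Merge release/1.0 back into develop" release/1.0
git branch -q -d release/1.0

git log --oneline --graph --all
```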

The final piece of this puzzle is the hotfix branch. No matter how extensively software is tested, bugs occasionally make it into a production system. When an issue discovered in production must be fixed immediately, a hotfix branch is made from the master branch. The fix is made and tested. It is then merged back into the master branch and also into the development branch, and the hotfix branch is deleted, as it is no longer needed until the next hotfix. It is possible to keep the release and hotfix branches around between uses, but the advantage of deleting them is that they don’t need to be kept up to date in the meantime.
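A hotfix can be sketched the same way. The example assumes, as in Driessen’s branching model, that the fix is merged into both master and the development branch (named develop here); the branch name hotfix/1.0.1, the tags, and the files are hypothetical.

```shell
set -e

# Placeholder identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# master holds the released code; develop has moved on since the release.
echo "v1" > app.txt
git add app.txt
git commit -q -m "Release 1.0"
git tag v1.0
git checkout -q -b develop
echo "wip" > next.txt
git add next.txt
git commit -q -m "Work toward 1.1"

# The hotfix branches from master, not from develop.
git checkout -q -b hotfix/1.0.1 master
echo "v1 patched" > app.txt
git commit -q -am "Fix production bug"

# Merge the fix into master for release, and into develop so it is not lost.
git checkout -q master
git merge -q --no-ff -m "Hotfix 1.0.1" hotfix/1.0.1
git tag v1.0.1
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix/1.0.1 into develop" hotfix/1.0.1

# The hotfix branch is no longer needed.
git branch -q -d hotfix/1.0.1

git log --oneline --graph --all
```

Because the hotfix starts from master, the unreleased work on develop never reaches production with the fix.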

Forking Workflow - The Forking Workflow is significantly different from the other workflows. In this workflow every developer has their own Git repository on the server. That means every developer has both a private local repository and their own remote, or server-side, repository. Project maintainers control when code is merged into the official repository. Because this workflow puts each developer’s code base in its own silo, it is frequently the workflow of choice for open source projects, where untrusted developers may be contributing code.

When a developer forks a repo, a copy of the official repo is made on the server side for that developer. Forking is nothing more than creating a copy of an existing repository, so no special functionality is necessary in Git; Git does, however, support this type of workflow by allowing a repository to have more than one remote. In the simplest case a developer forks a repository, possibly on a repo hosting service like GitHub, and then clones the fork to their development machine. To keep up to date with changes being made in the official repo, the developer adds a second remote using the git remote add command; by convention this remote is named upstream.

(https://help.github.com/articles/fork-a-repo)

git remote add upstream https://github.com/octocat/Spoon-Knife.git

Once the developer has linked the local repo to the upstream repo, it can be kept up to date by periodically fetching changes from the upstream repo and merging them into the local branch.

(https://help.github.com/articles/syncing-a-fork)

git fetch upstream
git merge upstream/master

Getting the changes into the developer’s server-side repo just requires the developer to execute a push.

git push origin master
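The commands above can be tried end to end without a hosting service. In the sketch below, local bare repositories stand in for the official repo and the fork, and a bare clone plays the role of the server-side fork; the directory names official.git, fork.git, seed, and work, the file contents, and the commit identity are assumptions made for the example.

```shell
set -e

# Placeholder identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"

# A maintainer's working repo, used to populate the official repository.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "one" > file.txt
git add file.txt
git commit -q -m "First official commit"
cd "$sandbox"
git clone -q --bare seed official.git

# Forking is just a server-side copy of the official repo.
git clone -q --bare official.git fork.git

# The developer clones the fork (origin) and adds the official repo as upstream.
git clone -q fork.git work
cd work
git remote add upstream "$sandbox/official.git"

# Meanwhile the official repo moves ahead.
cd "$sandbox/seed"
echo "two" >> file.txt
git commit -q -am "Second official commit"
git push -q "$sandbox/official.git" master

# Sync: fetch from upstream, merge locally, push to the fork.
cd "$sandbox/work"
git fetch -q upstream
git merge -q upstream/master
git push -q origin master

git remote -v
```

Here the merge is a fast forward because the developer has no local commits; with local work present it would produce a merge commit or require conflict resolution.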