<?php include('../HEADER.php'); ?>

<h1>Git version control</h1>

<p>Git is a distributed version control system. Users clone the entire code repository, make any number of changes and commits, and can eventually merge their clone/branch back to the original branch. Git makes branching and merging easy.</p>

<p>Each Git repository has a <code>.git</code> directory, which contains the repo's config, <em>index</em>, and <em>object store</em>. The object store is what gets duplicated when a repo is cloned. The object store contains everything required to reconstruct any version/branch of the project. The index describes the pending commit. Adds, deletes, and edits are recorded and staged in the index before you commit them to the repository. Staging changes in the index allows commits to the repository to be atomic. A commit is just a snapshot of the staged index.</p>

<pre>

        working files
              |
              v
        index/staging
              |
              v
       commit to branch

</pre>
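<p>A minimal sketch of that flow, run in a throwaway directory. The file name <code>notes.txt</code> and the user details are made up for illustration; <code>git status --short</code> is used only to make each stage visible.</p>

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "first draft" > notes.txt
before_add=$(git status --short)      # untracked: exists only in the working files
git add notes.txt
after_add=$(git status --short)       # staged: recorded in the index
git commit -q -m "Add notes"
after_commit=$(git status --short)    # empty: nothing left to stage or commit

printf '%s\n' "$before_add" "$after_add" "[$after_commit]"
```

<p>The three values printed correspond to the three boxes in the diagram above.</p>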

<p>Objects in Git (including commits) are identified by 40-digit hexadecimal numbers like <code>8db571d910c9c4ac93c57ce4359e767f5caf5166</code>. These IDs are SHA-1 hashes of the objects themselves (i.e., of the <em>contents</em> of the objects, not their file names). Any two objects with the same hash are identical to each other, even across different repositories.</p>
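<p>One way to see this for yourself is <code>git hash-object</code>, which prints the ID Git would give a file's contents. The sketch below uses made-up file names in a scratch directory and assumes Git's default SHA-1 object format.</p>

```shell
set -e
dir=$(mktemp -d) && cd "$dir"
printf 'same content\n' > a.txt
printf 'same content\n' > copy-with-another-name.txt
printf 'different content\n' > b.txt

# The ID depends only on the contents, so the file name plays no part.
id_a=$(git hash-object a.txt)
id_copy=$(git hash-object copy-with-another-name.txt)
id_b=$(git hash-object b.txt)

echo "a.txt                       $id_a"
echo "copy-with-another-name.txt  $id_copy"
echo "b.txt                       $id_b"
```

<p>The first two IDs should match each other, and the third should differ.</p>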

<h2>Using Git</h2>

<p>Help:<br />
<code>git help [commit add etc]</code></p>

<p>Also, the <a href="http://git-scm.com/book">Pro Git</a> and <a href="http://shop.oreilly.com/product/9780596520137.do">Version Control With Git</a> books are very helpful.</p>

<p>Creating a new repository from an existing project:<br />
<code>cd myproject
git init
git add .
git commit -m "Initial import of existing source."</code></p>

<p>Adding/staging/caching a file:<br />
<code>git add foo.txt [file2 file3 etc]</code></p>

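<p>Un-staging a newly added file you decided not to commit:<br />
<code>git rm --cached bad.txt</code><br />
(removes from index/cache, but leaves working copy). For a file that is already tracked, <code>git rm --cached</code> would stage its deletion; use <code>git reset HEAD bad.txt</code> to un-stage just the pending edits.</p>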

<p>Delete a file (working copy) and stop tracking it:<br />
<code>git rm foo.txt</code></p>

<p>Check the index for changes staged/cached for next commit:<br />
<code>git status</code></p>

<p>Diff all changed files that are <em>not</em> yet staged/cached:<br />
<code>git diff</code><br />
Diff files that have been staged:<br />
<code>git diff --cached</code></p>
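<p>A small sketch of the difference between the two diffs, using a scratch repository and a hypothetical <code>foo.txt</code> that has one staged edit and one unstaged edit:</p>

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "line one" > foo.txt
git add foo.txt
git commit -q -m "Add foo"

echo "staged change" >> foo.txt
git add foo.txt                       # this edit is now in the index
echo "unstaged change" >> foo.txt     # this edit exists only in the working copy

unstaged=$(git diff)                  # working copy compared with the index
staged=$(git diff --cached)           # index compared with the latest commit

printf '%s\n' "--- git diff ---" "$unstaged" "--- git diff --cached ---" "$staged"
```

<p>Each added line should show up in only one of the two diffs.</p>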

<p>Commit staged changes:<br />
<code>git commit</code></p>

<p>You can provide a brief commit message with <code>git commit -m "My message"</code>, or omit the flag to have git drop you into an editor.</p>

<p>Stage and commit any changed files in one step (ignoring files not under version control):<br />
<code>git commit -a</code></p>

<p>Show commit logs (listed from most recent to oldest):<br />
<code>git log [master]</code><br />
The log starts at the supplied commit name; <code>HEAD</code> (the tip of the current branch) is the default. The <code>--stat</code> flag shows how many changes were made to each file.</p>

<p>To see details of a particular commit:<br />
<code>git show 8ca531d910c9c4ac935b7ce4859e467f5caf5167</code><br />
(Omitting the hash shows details of the most recent commit.)</p>

<p>Compare two commits by hashes:<br />
<code>git diff 8ca531d910c9c4ac935b7ce4859e467f5caf5167 922c38d910c9b4ac935b7ce4859e467f5caf51b4</code></p>

<p>Restore erroneously removed file:<br />
<code>git checkout HEAD -- foo.txt</code><br />
(HEAD is a reference to the latest commit on the current branch. A bare double-dash " -- " avoids ambiguity by separating the options and revision from the list of file paths.)</p>
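<p>As a rough illustration, the sketch below deletes a committed file with <code>git rm</code> and then brings it back. The repository and the file contents are invented for the example.</p>

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "important" > foo.txt
git add foo.txt
git commit -q -m "Add foo"

git rm -q foo.txt                 # oops: removed from the working copy and the index
if [ -e foo.txt ]; then gone=no; else gone=yes; fi

git checkout HEAD -- foo.txt      # restore the version recorded in the latest commit
restored=$(cat foo.txt)
pending=$(git status --short)     # empty again: the staged deletion is undone too

echo "gone=$gone restored=$restored pending=[$pending]"
```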

<p>Rename a file:<br />
<code>git mv foo.txt bar.txt</code></p>

<p>Deletions and renames take effect upon commit.</p>

<p>Clone a repo locally for a quick experiment:<br />
<code>git clone myproject projectcopy</code></p>

<h2>Git config files and initial post-install setup</h2>

<p>List files (or file patterns) you want git to ignore in a <code>.gitignore</code> file. You can use multiple ignore files in a repo; each ignore file affects files in its own directory and that directory's subdirectories (so the <code>.gitignore</code> in the root of the repo affects all files in the repository).</p>

<p><code>myproject/.git/config</code> contains settings specific to this repo, and these settings override options in <code>~/.gitconfig</code> and <code>/etc/gitconfig</code>. <code>git config --local</code> settings end up here (if you're in the repository, you can omit the <code>--local</code> flag, as the local repo is the default for config commands that write settings). These settings are <em>not</em> copied when you clone the repo.</p>

<p><code>~/.gitconfig</code> contains per-user settings. <code>git config --global</code> settings end up here.</p>

<p><code>/etc/gitconfig</code> contains settings for the box. Settings in this file are overridden by settings in the above files. <code>git config --system</code> settings end up here.</p>

<p>Show the effective settings with:<br />
<code>git config -l</code></p>
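<p>A quick sketch of the override order. It points <code>HOME</code> at a scratch directory so your real <code>~/.gitconfig</code> is left alone. The editor names are arbitrary example values.</p>

```shell
set -e
sandbox=$(mktemp -d)
export HOME="$sandbox/home"                # sandboxed home, so --global writes here
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL    # avoid picking up other per-user config files
mkdir -p "$HOME" "$sandbox/myproject"
cd "$sandbox/myproject"
git init -q .

git config --global core.editor nano       # stored in $HOME/.gitconfig
with_global_only=$(git config core.editor)

git config --local core.editor vim         # stored in myproject/.git/config
effective=$(git config core.editor)        # the per-repo value takes precedence
still_global=$(git config --global core.editor)

echo "global only: $with_global_only"
echo "effective:   $effective"
echo "global file: $still_global"
```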

<p>Initial post-install setup:<br />
<code>git config --global user.name "My Name"
git config --global user.email "me@example.com"
git config --global credential.helper 'cache --timeout=3600'
git config --global core.editor vim
git config --global merge.tool vimdiff
</code></p>

<p>Store command aliases in config files. For example:<br />
<code>git config --global alias.show-graph 'log --graph --abbrev-commit --pretty=oneline'</code><br />
so you can thereafter simply type <code>git show-graph</code>.</p>

<h2>Branching and Merging (and Tags)</h2>

<p>The default branch in a repository is named <code>master</code>. Git makes branching and merging easy, so it's not uncommon to casually create branches to tackle a particular bug or develop a new feature.</p>

<p>Branches change all the time. Tags do not. A tag is a signpost in the history of the project, whereas a branch is the road. A tag is a symbolic name for a particular release; a branch is a symbolic name for a line of development. Don't name a branch the same as a tag.</p>

<p><code>git tag</code> shows all the tags in alphabetical order. Create an annotated tag with <code>git tag -a v2.0 -m 'Release version 2.0'</code>, or a lightweight tag like <code>git tag v2.0</code>. Tag a particular past commit by ID with <code>git tag -a v2.0 <i>9fceb02</i></code>. See the details of a tag with <code>git show v2.0</code>.</p>

<p>Show the branches in the project:<br />
<code>git branch</code><br />
(The branch you're working on is highlighted with an asterisk.)</p>

<p>Create a new branch from the current commit:<br />
<code>git branch mynewbranch</code></p>

<p>Switch to the new branch:<br />
<code>git checkout mynewbranch</code><br />
(Have a clean working directory before switching branches.)</p>

<p>Switch back to the <code>master</code> branch, and merge your changes from mynewbranch:<br />
<code>git checkout master
git merge mynewbranch</code><br />
(Have a clean working directory and index before merging.)</p>

<p>If there are no conflicts, it just works. If there are conflicts, you can see them with:<br />
<code>git diff</code><br />
Edit the files with conflicts, resolve the conflicts, and <code>git add</code> the resolved files.<br />
You can run <code>git mergetool</code> to walk through the conflicts instead of editing the files directly.<br />
Run <code>git status</code> again to verify the conflicts are resolved.</p>

<p>After you resolve any conflicts:<br />
<code>git commit</code></p>
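<p>The whole conflict cycle can be sketched in a scratch repository. Everything here is hypothetical: the file <code>settings.txt</code>, its contents, and the choice to keep the branch's version. The <code>git diff --name-only --diff-filter=U</code> call is just one way to list files that are still unmerged.</p>

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "Initial settings"

git checkout -q -b mynewbranch
echo "color = green" > settings.txt
git commit -q -a -m "Use green"

git checkout -q master
echo "color = blue" > settings.txt
git commit -q -a -m "Use blue"

# Both branches changed the same line, so the merge stops and reports a conflict.
git merge mynewbranch >/dev/null 2>&1 || echo "merge stopped with conflicts"
conflicted=$(git diff --name-only --diff-filter=U)

# Resolve by hand (here: keep green), stage the result, and commit the merge.
echo "color = green" > settings.txt
git add settings.txt
git commit -q -m "Merge mynewbranch, keeping green"

second_parent=$(git rev-parse HEAD^2)      # a merge commit has a second parent
branch_tip=$(git rev-parse mynewbranch)
echo "conflicted file: $conflicted"
```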

<p>If you're working on a side branch and need to pull in the latest changes from the master branch:<br />
<code>git merge master</code></p>

<p>To see which branches have or have not yet been merged, run <code>git branch</code> with the <code>--merged</code> and <code>--no-merged</code> flags.</p>
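<p>A short sketch of those two flags, with invented branch names (<code>merged-topic</code> and <code>open-topic</code>) and empty commits standing in for real work:</p>

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

git commit -q --allow-empty -m "Initial commit"
git branch merged-topic                    # no commits of its own, so already contained in master
git checkout -q -b open-topic
git commit -q --allow-empty -m "Work not yet merged"
git checkout -q master

merged=$(git branch --merged)              # branches whose tips are reachable from HEAD
unmerged=$(git branch --no-merged)         # branches with commits HEAD lacks

printf '%s\n' "--merged:" "$merged" "--no-merged:" "$unmerged"
```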

<p>Delete a branch (with due caution):<br />
<code>git branch -d mynewbranch</code></p>

<p>Git has a crap-ton of ways to refer to a commit. The 40-hex-digit SHA1 hash of a commit is authoritative and globally unique. Git also creates a few handy references: <code>HEAD</code> points to the latest commit in the current branch, <code>ORIG_HEAD</code> points to the previous HEAD after a merge or reset, <code>FETCH_HEAD</code> points to the HEAD of a remote repo that was just fetched, and <code>MERGE_HEAD</code> points to the <em>other</em> HEAD while a merge is in progress. Git can also refer to commits with relative names: if <code>master</code> is the latest commit, <code>master~1</code> is the penultimate/parent commit, <code>master~2</code> points to the commit before that (grandparent). If a commit has more than one parent, <code>master^1</code> is parent A, <code>master^2</code> is parent B, and so forth. <code>master^3~2</code>, for example, would be the third parent's grandparent.</p>
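<p>To make the relative names concrete, here is a sketch that builds a tiny history with one merge and resolves a few names with <code>git rev-parse</code>. The commit messages (A, B, C, S) and the <code>side</code> branch are placeholders.</p>

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

git commit -q --allow-empty -m "A"; a=$(git rev-parse HEAD)
git commit -q --allow-empty -m "B"; b=$(git rev-parse HEAD)
git checkout -q -b side
git commit -q --allow-empty -m "S"; s=$(git rev-parse HEAD)
git checkout -q master
git commit -q --allow-empty -m "C"; c=$(git rev-parse HEAD)
git merge -q --no-edit side                # merge commit whose parents are C and S

first_parent=$(git rev-parse master^1)     # C: the branch that was merged into
second_parent=$(git rev-parse master^2)    # S: the branch that was merged in
grandparent=$(git rev-parse master~2)      # B: two steps back along first parents
via_side=$(git rev-parse master^2~1)       # B again: the second parent's parent
root=$(git rev-parse master~3)             # A

echo "master^1=$first_parent"
echo "master^2=$second_parent"
echo "master~2=$grandparent"
```

<p>Note that <code>~</code> always follows first parents, while <code>^N</code> picks the Nth parent of a single commit.</p>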

<p>The <code>gitk</code> tool can, among other things, draw a graph of the repo.</p>

<h2>Remote Repositories (Push/Pull)</h2>

<p>A remote repo is a clone. Specifying a remote repo just creates a handle linking one clone to its fellow. A repository can have multiple remotes. Remotes need not be on a different machine; they can just be in a different directory on the same filesystem.</p>

<p>Development repos have a working directory with checked out files. <em>Bare</em> repositories do not (they basically just have the contents of the <code>.git</code> directory). If a few developers are working on their own cloned repos, and coordinating through a central repository, that central repo should be bare. By convention, bare repositories are named with a <code>.git</code> extension, like <code>myprojectdirectory.git</code>.</p>

<p>Creating a new remote git repository:<br />
<code>local$ ssh example.com
remote$ mkdir /home/me/git/myproject
remote$ cd /home/me/git/myproject
remote$ git init --bare</code></p>

<p>Link your local repo to that remote repository:<br />
<code>local$ cd myproject
local$ git remote add origin ssh://example.com/home/me/git/myproject
local$ git push origin master</code></p>
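<p>Since a remote can simply be another directory, the same setup can be tried without a server. The sketch below is an illustration under assumptions: all paths live in a temporary directory, and <code>colleague</code> is a made-up name for a second clone.</p>

```shell
set -e
sandbox=$(mktemp -d)

git init -q --bare "$sandbox/myproject.git"    # the shared, bare repository
is_bare=$(git --git-dir="$sandbox/myproject.git" rev-parse --is-bare-repository)

git init -q "$sandbox/myproject"               # a developer's working repository
cd "$sandbox/myproject"
git symbolic-ref HEAD refs/heads/master        # make sure the first branch is called master
git config user.name "Example User"            # placeholder identity for the demo
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial import of existing source."

git remote add origin "$sandbox/myproject.git"
git push -q -u origin master                   # -u records origin/master as the upstream
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')

# A second developer clones the bare repository and gets the same content.
git clone -q -b master "$sandbox/myproject.git" "$sandbox/colleague"
cloned=$(cat "$sandbox/colleague/README")

echo "bare=$is_bare upstream=$upstream cloned=$cloned"
```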

<p>Push local changes to remote repository:<br />
<code>cd myproject
git push ssh://example.com/home/me/git/myproject master</code></p>

<p>Update your existing local repository with any changes from remote:<br />
<code>git pull ssh://paulgorman.org/home/paulgorman/git/blelo master</code></p>

<p>Grab a fresh local copy from remote server:<br />
<code>git clone ssh://example.com/home/me/git/myproject</code></p>

<p>Push your new branch upstream:<br />
<code>git push -u origin mynewbranch</code><br />
(The -u flag sets origin as the <em>upstream default</em> for mynewbranch.)</p>

<p>Pull a new branch from upstream:<br />
<code>git fetch origin anewbranch</code></p>

<p>If git complains about <br />
<code>There is no tracking information for the current branch.
Please specify which branch you want to merge with.</code><br />
run <br />
<code>git pull origin master
git push -u origin master</code></p>

<h2>Finding Problems with blame, pickaxe, or bisect</h2>

<p>To find who last modified a particular line and when it was modified:<br />
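<code>git blame -L 1427,1427 mydir/foo.py</code><br />
(The <code>-L</code> flag takes a <em>start,end</em> line range; leaving off the end, as in <code>-L 1427,</code>, blames everything from that line to the end of the file.)</p>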

<p>To find when a string changed (was added or deleted) in a file:<br />
<code>git log -Smystring mydir/foo.py</code><br />
(The <code>-S</code> flag is known as <em>pickaxe</em>. Note that there is an edge case where pickaxe will not work: if a commit had exactly the <em>same number</em> of additions and deletions of the string in the file.)</p>
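<p>A small pickaxe sketch, assuming a scratch repository in which <code>mystring</code> is added in one commit, left alone in the next, and removed in a third. The commit messages are invented so the result is easy to read.</p>

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
mkdir mydir

printf 'x = 1\n' > mydir/foo.py
git add mydir/foo.py
git commit -q -m "Add foo"

printf 'x = 1\nmystring = "hi"\n' > mydir/foo.py
git commit -q -a -m "Introduce mystring"

printf 'x = 2\nmystring = "hi"\n' > mydir/foo.py
git commit -q -a -m "Change x"             # touches the file, but not the string count

printf 'x = 2\n' > mydir/foo.py
git commit -q -a -m "Drop mystring"

# Only commits that change how many times the string occurs are listed.
hits=$(git log --format=%s -Smystring -- mydir/foo.py)
printf '%s\n' "$hits"
```

<p>The "Change x" commit should be absent from the output, because the number of occurrences of the string is the same before and after it.</p>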

<p><code>git bisect</code> is a way to find a faulty commit or regression. To use it, you need to know a bad commit (often your current <code>master</code>, where you noticed the problem) and a good commit free of the problem (perhaps the last major release version).</p>

<p>Be sure to start with a clean working directory.</p>

<pre>cd myproject
git bisect start
git bisect bad        <i># The default commit, HEAD, is the bad end of the range</i>
git bisect good v3.0-release        <i># Define a known good start of the range</i>
</pre>

<p>Test to see if the fault is present in this version. For each version that is free from the fault, tell git:<br />
<code>git bisect good</code><br />
or, if this version is still faulty:<br />
<code>git bisect bad</code></p>

<p>Each time you tell git <code>good</code> or <code>bad</code>, git halves the remaining range of commits and checks out one near the middle, narrowing in on where the problem was introduced. It keeps a log with:<br />
<code>git bisect log</code></p>

<p>When you've found the problem, tell git you're finished by resetting to your original branch:<br />
<code>git bisect reset</code><br />
(You should now see <code>* master</code> if you run <code>git branch</code>.)</p>
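<p>When the check for the fault can be scripted, the good/bad answers can be automated with <code>git bisect run</code>, which is not covered above. The sketch below is an illustration only: it fabricates eight commits, plants a hypothetical <code>bug.txt</code> in the fifth, and treats "the file exists" as the fault.</p>

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

for n in 1 2 3 4 5 6 7 8; do
    echo "$n" > version.txt
    if [ "$n" -eq 5 ]; then echo "oops" > bug.txt; fi    # the fault sneaks in here
    git add .
    git commit -q -m "Commit $n"
    if [ "$n" -eq 1 ]; then git tag v3.0-release; fi     # known good release
    if [ "$n" -eq 5 ]; then culprit=$(git rev-parse HEAD); fi
done

# Bad end of the range first, then the known good commit.
git bisect start HEAD v3.0-release >/dev/null
# The command exits 0 for a good commit and non-zero for a bad one.
git bisect run sh -c 'test ! -f bug.txt' >/dev/null
found=$(git rev-parse refs/bisect/bad)     # the first bad commit bisect settled on
git bisect reset >/dev/null 2>&1
branch=$(git symbolic-ref --short HEAD)

echo "first bad commit: $(git log -1 --format=%s "$found")"
```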

<h2>Links</h2>

<ul>
    <li><a href="http://book.git-scm.com/">Git Community Book</a></li>
    <li><a href="http://www.spheredev.org/wiki/Git_for_the_lazy">Git for the lazy</a></li>
    <li><a href="http://toolmantim.com/articles/setting_up_a_new_remote_git_repository">Setting up a new remote git repository</a></li>
    <li><a href="http://www.kernel.org/pub/software/scm/git/docs/everyday.html">Everyday GIT With 20 Commands Or So</a></li>
    <li><a href="http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html">gittutorial(7) man page</a></li>
</ul>

<?php include('../FOOTER.php'); ?>
