Thomas Gläßle edited this page May 19, 2017 · 16 revisions

For users not familiar with git, we provide some references and a cheat sheet that will cover the most common use cases:

Concepts

Git is a distributed VCS (version control system), meaning that it has a very flexible approach to collaborating with one, many or even no other parties (remotes). It is conceptually very different from the likes of CVS or SVN. Please forget everything that you know about SVN and carefully read the following table to understand the key concepts in git.

| Concept | Meaning |
|---|---|
| work-tree | the files actually present in your filesystem (at the top level where also the .git folder is) |
| repository | storage of objects and references to objects (stored in .git/ folder) |
| reference | SHA-1 hash of the object payload |
| object | can be file, tree or commit |
| file | payload: contents of a file |
| tree | payload: list of names, plus references to the corresponding files and subtrees |
| commit | payload: message, author, date, reference to tree, reference to parent(s), other metadata |
| history | set of all commits that are ancestors of a commit (DAG = directed acyclic graph) |
| branch | reference to commit |
| index (staging area) | virtual tree that will become the tree of the next commit |
| stash | temporary storage for changes, can be stacked |
| remote | a related repository located elsewhere on the network or filesystem |
| merge | joining multiple branches |

At this point, take a few minutes and think about the following questions:

  • what is a repository and how does it differ from a work-tree?
  • will all changes in the work-tree automatically be part of the next commit?
  • is it an expensive operation to create a branch?
  • do all clones of a repository have equal rights or is there an upstream repository that plays a special role?
  • is it possible to have remote repositories that have no commits in common?
  • why should the history be a DAG and how is this property enforced on a technical level?
  • is a commit represented internally as a snapshot of the files or as a diff?
  • if a file with the same data is present in two commits, is it stored twice on disk?

You can check your understanding by referring to the section about Git internals.

Cheat sheet

Rather than enforcing one opinionated work-flow, git has a lot of commands that can be used in a very modular fashion. While this may be confusing at first, you will find that most of it is quite natural to learn. We list the most common ones here.

Note: Most git commands are safe, i.e. revertible. This means that they will either keep your changes in the working directory or fail, depending on the command and whether there will be conflicts between the local changes and the performed action.

| Command | Purpose |
|---|---|
| Basic commands for working with a single repository: | |
| init | create empty repository (.git/) in current directory |
| add | copy changes: work-tree → index |
| rm | remove file from index + work-tree |
| commit | create commit: index → commit |
| branch | manage branches (create, delete, rename, move) |
| tag | manage tags (releases) |
| merge | merge two or more branches: branch+ → commit |
| revert | create a commit to cancel the diff of a previous commit (safe) |
| config | change settings |
| Informational commands: | |
| help | get help about a topic or command |
| status | show general status |
| log | show history |
| diff | show diff between work-tree, index, commits |
| show | show info about a commit |
| blame | show who committed what, when |
| bisect | find the commit that introduced a bug |
| Commands that can be dangerous: | |
| reset | usually: clear index, move current branch pointer (safe) OR: |
| reset --hard | also: overwrite local changes (DANGEROUS) |
| checkout COMMIT | move HEAD and checkout files (safe!) OR: |
| checkout BRANCH | also: switch branches (safe!) OR: |
| checkout -- FILE | also: overwrite local changes (DANGEROUS) |
| stash | save local changes + reset --hard |
| stash drop | delete an item from the stash (DANGEROUS) |
| Working with remotes (network access): | |
| fetch | download data: remote → .git/ |
| push | upload data: .git/ → remote |
| pull | fetch + merge |
| clone | init + remote add + fetch + checkout |
| Interact with foreign repositories: | |
| remote | manage remotes (add, rm, rename, set-url) |
| submodule | manage submodules (add, rm, update) |
| subtree | manage subtrees |
| Editing history: | |
| commit --amend | index → previous commit |
| cherry-pick | apply existing commit at current location |
| rebase | rewrite history (reorder, modify, merge, insert or drop commits) |
| filter-branch | apply changes to all existing commits |

Note: These commands are examples of so-called porcelain commands, meaning they are comfortable for regular use. There is another set of so-called plumbing commands that allow lower-level access to git internals. These can be useful in certain uncommon situations, for example when doing automated history rewrites or when writing your own subcommands.
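
To get a feeling for the difference between a safe and a dangerous command, the following sketch can be run in a throwaway repository. The file name, the commit messages and the identity are made up for illustration. It first cancels a commit with revert, which only adds history, and then drops commits with reset --hard, which moves the branch pointer and overwrites the work-tree:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Full Name"                 # hypothetical identity
git config user.email "full.name@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "Add file.txt"
echo "v2" > file.txt
git commit -q -a -m "Change file.txt to v2"
bad=$(git rev-parse HEAD)

# revert: history grows by one commit that cancels the change (safe)
git revert --no-edit "$bad" >/dev/null
after_revert=$(cat file.txt)
echo "file.txt after revert: $after_revert"
git log --format=%s

# reset --hard: move the branch pointer back and overwrite the work-tree
git reset -q --hard HEAD~2
git log --format=%s

# the dropped commits are still in the object store for a while,
# and the reflog remembers their hashes
git reflog | head -n 3
```

Even after the reset, the dropped commit can still be inspected with git show and its hash, as long as the garbage collector has not removed it.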

Workflow (MAD-X)

We employ the following workflow for MAD-X:

  1. Fork the official MAD-X repository onto your own github username
  2. Create a new branch for your feature/bugfix
  3. Work on your branch until the feature/bugfix is ready for inclusion
  4. Push your changes to your github
  5. Create a pull-request for inclusion into the official MAD-X repository

Fork the official MAD-X repository

As a very first step, configure your name and email:

git config --global user.name "Full Name"
git config --global user.email "YOUR_EMAIL_ADDRESS"

Now fork the MAD-X repository onto your own github username. If you're lost, see Forking.

Clone your fork and add the official MAD-X repository as upstream:

git clone git@github.com:USERNAME/MAD-X
cd MAD-X
git remote add upstream git@github.com:MethodicalAcceleratorDesign/MAD-X

You're now ready to work on your local copy of the MAD-X repository.

Create a new branch

First, make sure that your upstream is up-to-date:

git fetch upstream

(Note this command just fetches data to be stored into the .git folder but never interferes with your local checkout.)

Now create and checkout a new branch for your new feature/bugfix. Most of the time you will want to branch off the upstream master:

git checkout -b my_feature upstream/master

Associate the branch to a branch on your github fork:

git push -u origin my_feature
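
If you would like to rehearse these steps without touching github, here is a minimal offline sketch. It assumes that two local bare repositories (upstream.git and fork.git, both names made up here) stand in for the official repository and your fork; everything else follows the commands above:

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# stand-in for the official repository, seeded with one commit on master
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Full Name"                 # hypothetical identity
git config user.email "full.name@example.com"
echo "hello" > README
git add README
git commit -q -m "Add README"
cd ..
git clone -q --bare seed upstream.git

# "fork" it: on github this is a button, here it is another bare clone
git clone -q --bare upstream.git fork.git

# clone the fork and register the official repository as upstream
git clone -q fork.git work
cd work
git remote add upstream "$sandbox/upstream.git"
git fetch -q upstream

# branch off upstream/master and associate the branch with the fork
git checkout -q -b my_feature upstream/master
git push -q -u origin my_feature

tracking=$(git rev-parse --abbrev-ref "my_feature@{upstream}")
echo "my_feature tracks: $tracking"
git remote -v
```

After the push, my_feature tracks the branch of the same name on origin, so a plain git push is enough from then on.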

Work on your branch

After making a coherent unit of changes, build and test your changes, and then add relevant changes to the INDEX:

git add FILES...

Note that the -p and -e options give you more fine-grained control over what goes into the index.

Commit changes to your current branch in the local repository:

git commit -m "Implement awesome feature XYZ"

The commit message describes the change made by the commit and starts with a capitalized verb. The first line (the subject) should be no more than 80 characters. If you want to provide more information (which is very welcome!), leave an empty line after the subject line, e.g.:

Fuse twcpgo and twchgo routines

Tracking *common* and *chromatic* optical functions independently was more
error prone (no "single source of truth") and meant that many computations
had to be performed twice.

Resolves #735

The Resolves #NUM line can be used to automatically close issues on github upon merge.
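
A message with subject and body can also be written directly on the command line. The following sketch (throwaway repository, made-up file and message texts, NUM left as a placeholder) relies on the fact that each -m option becomes its own paragraph, separated from the previous one by an empty line:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Full Name"                 # hypothetical identity
git config user.email "full.name@example.com"
echo "demo" > notes.txt
git add notes.txt

# each -m adds one paragraph: subject, body, issue reference
git commit -q -m "Add demo notes" \
    -m "Explain in the body why the change was needed." \
    -m "Resolves #NUM"

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "subject (${#subject} characters): $subject"
echo "$body"
```

For longer messages it is usually more comfortable to run git commit without -m and write the message in your editor.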

Important: Regularly inspect your status, diff, log and last commit to check that they contain exactly the changes you intended, see Inspection.

Push your changes

If/when you want, publish your changes to github:

git push

Create a pull-request

Go to the github website and create a pull request. This creates a thread where further review and discussion can take place.

In case of merge conflicts, see Conflict resolution.

For history modifications, see History alteration.

Common operations

Getting help

git is very well documented. The man page for every subcommand can be accessed by typing:

git help COMMAND

Note that git also has great official online resources such as tutorials and comprehensive documentation that you should absolutely refer to in case of problems.

Inspection

There are multiple ways to query information about the current branch/work-tree/commit/history and so on. This is especially important directly before and after committing:

| command | what it shows |
|---|---|
| git status | general status |
| git diff | differences between local files and index |
| git diff --cached | differences between index and previous commit |
| git show [COMMIT] | commit message + diff |
| git log [BRANCH] | commit history of branch |
| git log --all | commit history of all branches |
| git log --graph | graph structure of history |
| git blame FILE | which line was changed last when, in which commit and by whom |
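
The difference between git diff and git diff --cached is easiest to see by trying it out. In this sketch (throwaway repository, made-up file names) one file is modified and staged while another is modified but not staged:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Full Name"                 # hypothetical identity
git config user.email "full.name@example.com"
printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "Add a.txt and b.txt"

printf 'two\n' >> a.txt
printf 'two\n' >> b.txt
git add a.txt                                # only a.txt goes to the index

unstaged=$(git diff --name-only)             # work-tree vs. index
staged=$(git diff --cached --name-only)      # index vs. previous commit
echo "not staged: $unstaged"
echo "staged:     $staged"
git status --short
```

Only the staged change to a.txt would become part of the next commit; the change to b.txt stays in the work-tree.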

Cloning

For users without github account or SSH key:

git clone https://github.com/MethodicalAcceleratorDesign/MAD-X

For users with github account and SSH key:

git clone git@github.com:MethodicalAcceleratorDesign/MAD-X

Note that (unlike SVN) cloning the repository fetches the entire history and creates a fully-fledged repository on which you can work and commit locally without having to push your changes to the upstream repository.

If you plan to contribute, you should fork the repository:

Forking

Log in to your github account and fork the MAD-X repository to your own account (using the button somewhere in the upper right). I recommend accessing your fork via SSH (see Adding an SSH key).

Now clone your fork using:

git clone git@github.com:USERNAME/MAD-X

If you had already cloned the upstream repository, you can add your own fork as an additional remote:

git remote add myfork git@github.com:USERNAME/MAD-X

Remotes

As a distributed version control system, git features a very flexible approach to working with multiple parties. This is captured by the concept of remotes, which specify the URLs (or filesystem paths) of remote repositories.

List your remotes:

git remote -v           # -v stands for verbose

Add a new remote and fetch data into your .git folder (does not change the current working directory):

git remote add NAME URL
git fetch NAME

Rename an existing remote:

git remote rename OLD NEW

Delete a remote:

git remote rm NAME

For example, if you have cloned the upstream repository, created a few commits, then forked the repository and you now want to add your fork in order to push commits there, you can add it as a new remote like this:

git remote add myfork git@github.com:USERNAME/MAD-X
git fetch myfork

OR:

git remote rename origin upstream
git remote add origin git@github.com:USERNAME/MAD-X
git fetch origin
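
The second variant can be tried out offline. This sketch assumes two empty local bare repositories (upstream.git and fork.git, names made up here) in place of the github URLs and only covers the rename and add steps, not the fetch:

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare upstream.git     # stand-in for the official repository
git init -q --bare fork.git         # stand-in for your fork

# a repository whose origin still points to the official repository
git init -q work
cd work
git remote add origin "$sandbox/upstream.git"

# rename the old origin to upstream, then make the fork the new origin
git remote rename origin upstream
git remote add origin "$sandbox/fork.git"
git remote -v
```

Remotes are only named URLs in the repository configuration, which is why renaming them does not touch any commits.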

Branching

While copying a directory on the server side is cheap in SVN as well, the concept of branches is not natively supported by the VCS.

Branching and tagging in git are extremely lightweight operations (creating a pointer).
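
As a small illustration of how lightweight this is, the following sketch (throwaway repository, made-up branch and tag names) counts the loose objects before and after creating a branch and a tag. No new objects should be needed, since both are just names for an existing commit:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Full Name"                 # hypothetical identity
git config user.email "full.name@example.com"
echo "demo" > file.txt
git add file.txt
git commit -q -m "Add file.txt"

objects_before=$(git count-objects | cut -d' ' -f1)
git branch my_feature
git tag v0.1
objects_after=$(git count-objects | cut -d' ' -f1)
echo "objects before: $objects_before, after: $objects_after"

# all three names resolve to the same commit hash
git rev-parse HEAD my_feature v0.1
```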

Conflict resolution

# if you have uncommitted changes (or commit!):
git stash

# pull in the upstream master:
git pull upstream master

# manually resolve conflicts, relevant sections enclosed by
# "<<<<<<<", "=======", ">>>>>>>" lines:
vim src/twiss.f90

# add and commit conflict resolution:
git add src/twiss.f90
git status
git commit
git push
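
To practise this without risk, you can provoke a conflict in a throwaway repository. The sketch below uses a made-up file twiss.txt and merges a local master branch instead of pulling from upstream; the resolution steps (edit, add, commit) are the same:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Full Name"                 # hypothetical identity
git config user.email "full.name@example.com"
echo "beam energy = 1" > twiss.txt
git add twiss.txt
git commit -q -m "Add twiss.txt"

# two branches change the same line in different ways
git checkout -q -b my_feature
echo "beam energy = 2" > twiss.txt
git commit -q -a -m "Set beam energy to 2"
git checkout -q master
echo "beam energy = 3" > twiss.txt
git commit -q -a -m "Set beam energy to 3"

# the merge stops with a conflict and a non-zero exit code
git checkout -q my_feature
git merge master >/dev/null 2>&1 || true
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted files: $conflicted"
cat twiss.txt                       # shows the conflict markers

# resolve by writing the intended content, then add and commit
echo "beam energy = 3" > twiss.txt
git add twiss.txt
git commit -q -m "Merge master into my_feature"
git log --oneline --graph
```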

History alteration

  • if you want to reorder/join/drop/fuse/change commits or you have a branch that is based on an old version of master and you want to move (reapply) the commits onto the new master (or another branch):

    git fetch upstream
    git rebase -i upstream/master
    

    In certain situations, you might need:

    git rebase -i --onto upstream/master BASE_POINT BRANCH_NAME
    

    Where BASE_POINT specifies the parent of the first commit to reapply onto the upstream master.
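
    The --onto form is easier to understand with a small experiment. In this sketch (throwaway repository; the branch names old_base and my_feature and the helper commit_file are made up, and a local master replaces upstream/master) only the commit after old_base is reapplied onto master. The -i flag is left out so that no editor opens:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Full Name"                 # hypothetical identity
git config user.email "full.name@example.com"

# hypothetical helper: create a file named after its argument and commit it
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "Add $1"
}

commit_file A                       # master:     A
git checkout -q -b old_base
commit_file X                       # old_base:   A - X
git checkout -q -b my_feature
commit_file Y                       # my_feature: A - X - Y
git checkout -q master
commit_file B                       # master:     A - B

# reapply only the commits after old_base (just Y) onto master
git rebase -q --onto master old_base my_feature
git log --format=%s my_feature
```

    Afterwards my_feature consists of A, B and a new copy of Y, while X stays behind on old_base.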

Contributing changes

Related topics

Git internals

The following is a very brief summary of some of the git internals that may help your understanding of git:

  • git repositories are a collection of objects stored inside the .git folder.
  • git objects consist of some blob of data and are referenced by their SHA-1 hash. The most important objects and their data blobs are:
    • file: the file content
    • tree: list of filenames + references to the corresponding objects
    • commit:
      • commit message, author, date and other metadata
      • reference to the file tree
      • references to parent commit(s)
  • note that this has the following important implications:
    • commits are snapshots that have direct knowledge of the entire working directory; they are not implemented as changesets.
    • since hashes are deterministic, commit IDs are determined by contents, parents and other metadata, and:
    • tree/file IDs are determined by contents, which means that identical files will only be stored once on disk
    • using a cryptographic hash (more or less):
      • ensures that it is computationally infeasible to generate cyclic histories
      • can be used to verify data integrity
      • makes it possible to detect tampering attempts
  • git branches are just pointers (references) to commits. As such, branching is an extremely lightweight operation. (Compare this to SVN, where a branch is a copy of a directory. Here, while creating and switching branches are cheap operations as well, branches can lead to significant overhead when checking out the entire repository including all branches and tags.)
  • git has a garbage collector that can delete objects if they remain unreferenced for too long (by default, unreachable objects older than two weeks are pruned). Objects referenced directly or indirectly by a branch or tag will never be deleted by git.
  • As long as objects are not deleted they can always be checked out or queried otherwise. git reflog is a useful tool to get the hashes of commits that were deleted by accident.
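
The object model described above can be inspected with a few plumbing commands. This sketch (throwaway repository, made-up file names) commits two files with identical content and then looks up the commit, its tree and the file objects. Note that git itself calls the file object a blob:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Full Name"                 # hypothetical identity
git config user.email "full.name@example.com"
echo "same content" > first.txt
cp first.txt second.txt
git add first.txt second.txt
git commit -q -m "Add two identical files"

commit=$(git rev-parse HEAD)
tree=$(git rev-parse "HEAD^{tree}")
blob_first=$(git rev-parse HEAD:first.txt)
blob_second=$(git rev-parse HEAD:second.txt)

git cat-file -t "$commit"           # commit
git cat-file -t "$tree"             # tree
git cat-file -t "$blob_first"       # blob
git cat-file -p "$commit"           # tree reference, author, message
git cat-file -p "$tree"             # both names point to the same blob
echo "first.txt:  $blob_first"
echo "second.txt: $blob_second"
```

Both files resolve to the same hash, so their content is stored only once.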

Comparison with SVN

The most important differences between git and SVN are summarized by the following table:

| Aspect | SVN | git |
|---|---|---|
| history structure | linear | DAG |
| commit id | revision number | commit hash |
| tags/branches | copy of a directory | pointer to a commit |
| main branch | trunk | master |
| staging area | no | yes |
| creating commits | on server | locally |
| conflict resolution | on every commit | only when merging |
| server / remotes | exactly one SVN server | 0+, fully dynamic |
| local data | checkout (only files, only one branch) | fully fledged clone of the entire repository |
| network access | most operations | only for synchronizing |
| history alteration | mostly static | yes, fully dynamic |

Adding an SSH key

If you plan to contribute, I recommend accessing the repository via SSH rather than HTTPS. This means that you will not have to enter your password on every push. If you haven't already, create an SSH key:

ssh-keygen -t rsa -b 4096

You can leave the password blank to store the key unencrypted (!) in your home folder. In this case, you won't need to enter a password when pushing (in my opinion, if someone gets access to your hard drive, all your data is compromised anyway). If you dislike storing the key without further protection, you can enter a password and consider

Copy the SSH public key (cat ~/.ssh/id_rsa.pub) and add it on your github settings page.

Settings and aliases

You may find it convenient to add a few basic aliases and settings to your ~/.gitconfig. I encourage you to try and find out how these aliases work and what they do:

[core]
    editor = vim
    excludesfile = ~/.gitignore_global
    filemode = true

[alias]
    co = checkout
    cp = cherry-pick
    st = status
    re = remote -v

    # fixup current index to the previous commit
    amend = commit --amend
    amenda = commit --amend -a

    # merge and create a structural merge-commit (no fast-forward)
    mm = merge --no-ff

    # useful diffs:
    cdiff = diff --cached
    wdiff = diff --word-diff=color
    wcdiff = diff --cached --word-diff=color

    # log with graph structure:
    alog = log  --graph --all --format=cmedium

    # unstage (remove from index) some files:
    unstage = reset HEAD --

    # checkout pull request by issue number (and remote name):
    copr = "!f() { git fetch -fu ${2:-origin} refs/pull/$1/head:pr/$1 && git checkout pr/$1; }; f"

# enable/specify colors:
[color]
    ui = true
[color "branch"]
    current = yellow reverse
    local = yellow
    remote = green
[color "diff"]
    meta = yellow bold
    frag = magenta bold
    old = red bold
    new = green bold
[color "status"]
    added = yellow
    updated = green
    changed = magenta
    untracked = cyan
    branch = green bold

[diff]
    tool = vimdiff
[difftool]
    prompt = false

# required for the nice graph log:
[pretty]
    cmedium =\
%C(yellow)%h%C(cyan)% an %C(green)(%ar)%C(red)%d%n\
%C(white)%s%n\
