Thomas Gläßle edited this page May 19, 2017 · 16 revisions

For users not familiar with git, we provide some references and a cheat sheet that will cover the most common use cases:

Getting started

Concepts

Cheat sheet

Setup

Before starting to commit, configure your name and email:

git config --global user.name "Full Name"
git config --global user.email "your.name@example.com"
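
If you prefer a different identity for a single repository, the same settings can be stored per repository by leaving out --global. The following is only a sketch in a throwaway directory; the name and the example.org address are placeholders, not values from this project:

```shell
set -eu
demo=$(mktemp -d)          # throwaway directory for the demonstration
cd "$demo"
git init -q

# without --global, the values are written to .git/config of this repository only
git config user.name "Full Name"
git config user.email "full.name@example.org"

# read the effective values back (repository settings override global ones)
name=$(git config --get user.name)
email=$(git config --get user.email)
echo "$name <$email>"
```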

Do the following once:

Workflow

In order to contribute patches, proceed as follows:

  • make sure your upstream is up-to-date:

    git fetch upstream
    

    (Note this command just fetches data to be stored into the .git folder but never interferes with your local checkout.)

  • create a branch for your new feature/bugfix. Most of the time you will want to branch off the upstream master:

    git checkout -b my_feature upstream/master
    git push -u origin my_feature
    
  • edit files, build and test, e.g.:

    vim src/twiss.f90
    make madx-linux64-gnu
    make tests
    
  • important: check differences between local files and index:

    git diff
    
  • add relevant changes to the INDEX:

    git add -p src/twiss.f90
    

    (with -p, you can select individual chunks from the file to be added)

  • important: inspect the staged changes and the status before committing:

    git diff --cached
    git status
    
  • commit:

    git commit -m "Implement awesome feature XYZ"
    
  • important: inspect your commit:

    git show
    
  • if needed, remove the previous commit from the history, but keep all the changes in the working directory:

    git reset HEAD~1
    
  • check your history:

    git log                         # commits of current branch
    git log --all --graph           # all branches

  • if/when you want, publish your changes to github:

    git push
    
  • go to the github website and create a pull request

  • in case of merge conflicts, see Conflict resolution

  • for history modifications, see History alteration
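
The workflow above can be tried out without touching the real repositories. The following is a minimal sketch, assuming two local bare repositories as stand-ins for upstream and for your fork; the file notes.txt, the directory names and the demo identity are made up for illustration, and a plain git add replaces the interactive git add -p:

```shell
set -eu
# demo identity, so that the sketch does not depend on your global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.org"
work=$(mktemp -d)
cd "$work"

# local bare repositories standing in for upstream and for your fork
git init -q --bare upstream.git
git init -q --bare fork.git

# give the stand-in upstream one commit on master
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "initial" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git push -q "$work/upstream.git" master
cd "$work"

# your checkout with the two remotes used in the workflow
git init -q checkout
cd checkout
git remote add origin "$work/fork.git"
git remote add upstream "$work/upstream.git"

git fetch -q upstream
git checkout -q -b my_feature upstream/master
git push -q -u origin my_feature

echo "new feature" >> notes.txt
git diff --stat                   # working directory vs. index
git add notes.txt
git diff --cached --stat          # index vs. last commit
git commit -q -m "Implement awesome feature XYZ"

git reset -q HEAD~1               # undo the commit, keep the edit in the working directory
git status --short                # notes.txt shows up as modified again
git add notes.txt
git commit -q -m "Implement awesome feature XYZ"

git show --stat
git push -q
pushed=$(git --git-dir="$work/fork.git" rev-parse my_feature)
local_head=$(git rev-parse HEAD)
```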

Comparison with SVN

The following is a non-comprehensive list of the most important concepts in git, mentioning a few differences to SVN. Taking a few minutes to understand this will make your life a lot easier down the road:

  • in git, history is structured as a DAG (directed acyclic graph) of commits rather than a linear series of revisions. This means that each commit can have zero or more parents, but no commit is a direct or indirect ancestor of itself.
  • git commits are labeled by their SHA1 hash rather than a revision number.
  • a branch or tag in git is a pointer to a commit. In comparison, SVN doesn't have a true builtin concept of branches but relies on conventions upheld by the user instead. Branches and tags in SVN are created as copies of a directory.
  • the main branch that corresponds to SVN's trunk is conventionally called master in git.
  • git has an index or staging area that determines what goes into the next commit. Changes are added to the index using the git add command. The index defines the tree of the next commit.
  • in git, everyone operates on their own repository and commits are created in the local repository (git commit), rather than everyone working and creating commits simultaneously on the same repository. This means that conflict resolution is only necessary when merging branches, not on every commit.
  • git is a distributed VCS (version control system), meaning that it has a very flexible approach to collaborating with one, many or even no other parties (see Remotes) – as opposed to SVN where you are usually bound to exactly one upstream SVN server.
  • in git users clone the repository (git clone) which means that they obtain not only the checked out source code, but fully-fledged copies of the entire repository including its history and all branches. These clones can function independently of the original repository. In SVN, users checkout (svn co) a source code directory. Accessing the log or switching branches requires network access.
  • You only access the network when you want to synchronize your work, i.e. either publish your commits (git push) or retrieve new commits from the server (git fetch or git pull). This means that in git most operations are very fast compared to SVN.
  • git can be used without any local or remote server. You can turn arbitrary directories into git repositories by typing git init and later add and remove remotes fully dynamically.
  • git has very powerful mechanisms to rewrite history, i.e. one can freely insert, drop, modify and even fuse commits before publishing them. However, you should never do this with commits that have been seen or created by others. Most importantly, never alter commits that have been merged to master.

This is summarized in the following table:

| Aspect | SVN | git |
| --- | --- | --- |
| history structure | linear | DAG |
| commit id | revision number | commit hash |
| tags/branches | copy of a directory | pointer to a commit |
| main branch | trunk | master |
| staging area | no | yes |
| creating commits | on server | locally |
| conflict resolution | on every commit | only when merging |
| server / remotes | exactly one SVN server | 0+, fully dynamic |
| local data | checkout (only files, only one branch) | fully fledged clone of the entire repository |
| network access | most operations | only for synchronizing |
| history alteration | static | yes, fully dynamic |
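
To make the staging area and local commits from the comparison above more tangible, here is a small sketch in a throwaway repository. The file names and the demo identity are invented for the example; only the change that was added to the index ends up in the commit:

```shell
set -eu
# demo identity, so that the sketch does not depend on your global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.org"
repo=$(mktemp -d)
cd "$repo"
git init -q

echo one > a.txt
echo two > b.txt
git add a.txt b.txt
git commit -q -m "Add two files"

# modify both files, but stage only one of them
echo more >> a.txt
echo more >> b.txt
git add a.txt

staged=$(git diff --cached --name-only)     # what the next commit will contain
unstaged=$(git diff --name-only)            # what stays behind in the working directory
git commit -q -m "Change a.txt only"        # created locally, no server involved

remaining=$(git diff --name-only)
echo "staged: $staged, left out: $unstaged, commit: $(git rev-parse HEAD)"
```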

You are also strongly encouraged to read more about Git internals.

Details

Getting help

git is very well documented. The man page for every subcommand can be accessed by typing:

git help COMMAND

# e.g.:
git help commit

Note that git also has great official online resources, such as tutorials and comprehensive documentation, that you should absolutely refer to in case of problems.

Cloning

For users without github account or SSH key:

git clone https://github.com/MethodicalAcceleratorDesign/MAD-X

For users with github account and SSH key:

git clone git@github.com:MethodicalAcceleratorDesign/MAD-X

Note that (unlike SVN) cloning the repository fetches the entire history and creates a fully-fledged repository on which you can work and commit locally without having to push your changes to the upstream repository.
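
A quick way to convince yourself of this is to clone a small local repository. The sketch below uses made-up directory names (source, copy) and a demo identity; it only illustrates that the clone carries the full history and accepts commits independently of where it was cloned from:

```shell
set -eu
# demo identity, so that the sketch does not depend on your global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.org"
work=$(mktemp -d)
cd "$work"

# a small repository with two commits, standing in for the upstream repository
git init -q source
cd source
echo v1 > file.txt
git add file.txt
git commit -q -m "First"
echo v2 > file.txt
git commit -q -a -m "Second"
cd "$work"

# the clone contains the whole history, not just the checked out files
git clone -q source copy
cd copy
count=$(git rev-list --count HEAD)

# committing in the clone does not touch the repository it was cloned from
echo v3 > file.txt
git commit -q -a -m "Third (local only)"
local_count=$(git rev-list --count HEAD)
source_count=$(git -C "$work/source" rev-list --count HEAD)
echo "clone: $local_count commits, source: $source_count commits"
```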

If you plan to contribute, you should fork the repository:

Forking

Log in to your github account and fork the MAD-X repository to your own account (using the button somewhere in the upper right). I recommend accessing your fork via SSH (see Adding an SSH key).

Now clone your fork using:

git clone git@github.com:USERNAME/MAD-X

If you had already cloned the upstream repository, you can add your own fork as an additional remote:

git remote add myfork git@github.com:USERNAME/MAD-X

Remotes

As a distributed version control system, git features a very flexible approach to working with multiple parties. This is captured by the concept of remotes, which specify the URLs (or filesystem paths) of remote repositories.

List your remotes:

git remote -v           # -v stands for verbose

Add a new remote and fetch data into your .git folder (does not change the current working directory):

git remote add NAME URL
git fetch NAME

Rename an existing remote:

git remote rename OLD NEW

Delete a remote:

git remote rm NAME

For example, if you have cloned the upstream repository, created a few commits, then forked the repository and you now want to add your fork in order to push commits there, you can add it as a new remote like this:

git remote add myfork git@github.com:USERNAME/MAD-X
git fetch myfork

OR:

git remote rename origin upstream
git remote add origin git@github.com:USERNAME/MAD-X
git fetch origin
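
Since remotes can also be filesystem paths, the commands of this section can be rehearsed locally. The following sketch assumes two empty bare repositories (upstream.git and fork.git, names chosen for the example) in a temporary directory:

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# empty bare repositories standing in for the upstream repository and your fork
git init -q --bare upstream.git
git init -q --bare fork.git

git init -q checkout
cd checkout
git remote add origin "$work/upstream.git"

# turn the original remote into "upstream" and make the fork the new "origin"
git remote rename origin upstream
git remote add origin "$work/fork.git"
git fetch -q origin

git remote -v
upstream_url=$(git config --get remote.upstream.url)
origin_url=$(git config --get remote.origin.url)

# remotes can be removed again at any time
git remote rm origin
after_rm=$(git remote)
```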

Branching

While copying a directory on the server side is cheap in SVN as well, SVN itself does not natively support the concept of branches.

Branching and tagging in git are extremely lightweight operations (creating a pointer).
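
The pointer nature of branches and tags can be observed directly. This is a sketch in a throwaway repository; the branch name my_feature, the tag v0.1 and the demo identity are arbitrary choices for the example:

```shell
set -eu
# demo identity, so that the sketch does not depend on your global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.org"
repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello > file.txt
git add file.txt
git commit -q -m "Initial commit"

# creating a branch or a tag only records the ID of an existing commit
git branch my_feature
git tag v0.1
head_id=$(git rev-parse HEAD)
branch_id=$(git rev-parse my_feature)
tag_id=$(git rev-parse v0.1)

# a new commit moves the checked out branch forward, the tag stays where it was
git checkout -q my_feature
echo more >> file.txt
git commit -q -a -m "Work on feature"
new_branch_id=$(git rev-parse my_feature)
new_tag_id=$(git rev-parse v0.1)
echo "tag: $new_tag_id, branch: $new_branch_id"
```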

Conflict resolution

# if you have uncommitted changes (or commit!):
git stash

# pull in the upstream master:
git pull upstream master

# manually resolve conflicts, relevant sections enclosed by
# "<<<<<<<", "=======", ">>>>>>>" lines:
vim src/twiss.f90

# add and commit conflict resolution:
git add src/twiss.f90
git status
git commit
git push

# if you stashed changes at the beginning, restore them:
git stash pop
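
If you have never seen a merge conflict, it may help to provoke one on purpose. The sketch below is an assumption-laden stand-in for the real situation: it uses a local git merge master instead of git pull upstream master, a made-up file params.txt, and it resolves the conflict by overwriting the file rather than editing it in vim:

```shell
set -eu
# demo identity, so that the sketch does not depend on your global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.org"
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "beam energy = 1" > params.txt
git add params.txt
git commit -q -m "Initial"

# the same line is changed on the feature branch and on master
git checkout -q -b my_feature
echo "beam energy = 2" > params.txt
git commit -q -a -m "Feature change"
git checkout -q master
echo "beam energy = 3" > params.txt
git commit -q -a -m "Upstream change"

# merging master into the feature branch stops with a conflict
git checkout -q my_feature
if git merge master >/dev/null 2>&1; then merge_status=clean; else merge_status=conflict; fi
markers=$(grep -c '^<<<<<<<' params.txt)
cat params.txt                       # shows both versions between the marker lines

# resolve by writing the intended content, then add and commit
echo "beam energy = 3" > params.txt
git add params.txt
git commit -q -m "Merge master into my_feature"
second_parent=$(git rev-parse 'HEAD^2')
master_id=$(git rev-parse master)
```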

History alteration

  • if you want to reorder/join/drop/fuse/change commits or you have a branch that is based on an old version of master and you want to move (reapply) the commits onto the new master (or another branch):

    git fetch upstream
    git rebase -i upstream/master
    

    In certain situations, you might need:

    git rebase -i --onto upstream/master BASE_POINT BRANCH_NAME
    

    Where BASE_POINT specifies the parent of the first commit to reapply onto the upstream master.
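
    The --onto form is easier to understand with a concrete case. The following sketch (non-interactive, i.e. without -i; branch name topic, file names and demo identity are invented) moves only the last commit of a branch onto the new master and leaves the commit named by BASE_POINT behind:

```shell
set -eu
# demo identity, so that the sketch does not depend on your global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.org"
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "Base"

# a topic branch with one commit to drop and one commit to keep
git checkout -q -b topic
echo a > a.txt
git add a.txt
git commit -q -m "Topic: experiment to leave behind"
base_point=$(git rev-parse HEAD)     # parent of the first commit to reapply
echo b > b.txt
git add b.txt
git commit -q -m "Topic: real feature"

# meanwhile master moves on
git checkout -q master
echo new > new.txt
git add new.txt
git commit -q -m "Master moves on"

# reapply the commits after base_point on top of master
git rebase -q --onto master "$base_point" topic
subjects=$(git log --format=%s master..topic)
git log --graph --all --format=%s
```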

Contributing changes

Settings and aliases

You may find it convenient to add a few basic aliases and settings to your ~/.gitconfig. I encourage you to try and find out how these aliases work and what they do:

[core]
    editor = vim
    excludesfile = ~/.gitignore_global
    filemode = true

[alias]
    co = checkout
    cp = cherry-pick
    st = status
    re = remote -v

    # fixup current index to the previous commit
    amend = commit --amend
    amenda = commit --amend -a

    # merge and create a structural merge-commit (no fast-forward)
    mm = merge --no-ff

    # useful diffs:
    cdiff = diff --cached
    wdiff = diff --word-diff=color
    wcdiff = diff --cached --word-diff=color

    # log with graph structure:
    alog = log --graph --all --format=cmedium

    # unstage (remove from index) some files:
    unstage = reset HEAD --

    # checkout pull request by issue number (and remote name):
    copr = "!f() { git fetch -fu ${2:-origin} refs/pull/$1/head:pr/$1 && git checkout pr/$1; }; f"

# enable/specify colors:
[color]
    ui = true
[color "branch"]
    current = yellow reverse
    local = yellow
    remote = green
[color "diff"]
    meta = yellow bold
    frag = magenta bold
    old = red bold
    new = green bold
[color "status"]
    added = yellow
    updated = green
    changed = magenta
    untracked = cyan
    branch = green bold

[diff]
    tool = vimdiff
[difftool]
    prompt = false

# required for the nice graph log:
[pretty]
    cmedium =\
%C(yellow)%h%C(cyan)% an %C(green)(%ar)%C(red)%d%n\
%C(white)%s%n\
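
Before copying aliases into your ~/.gitconfig, you can try a few of them in a throwaway repository. This sketch defines three of the aliases above with git config in the repository's own configuration (so nothing global is changed) and uses a made-up file a.txt and a demo identity:

```shell
set -eu
# demo identity, so that the sketch does not depend on your global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.org"
repo=$(mktemp -d)
cd "$repo"
git init -q

# same definitions as in the [alias] section, stored in .git/config
git config alias.st status
git config alias.cdiff "diff --cached"
git config alias.unstage "reset HEAD --"

echo one > a.txt
git add a.txt
git commit -q -m "Initial"

# stage a change, look at it, then take it out of the index again
echo two >> a.txt
git add a.txt
staged_before=$(git cdiff --name-only)
git unstage a.txt >/dev/null
staged_after=$(git cdiff --name-only)

git st --short                       # a.txt is still modified, just no longer staged
modified=$(git diff --name-only)
```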

Advanced

Git internals

The following is a very brief summary of some of the git internals that may help your understanding of git:

  • git repositories are a collection of objects stored inside the .git directory.
  • git objects consist of some blob of data and are referenced by their SHA-1 hash. The most important objects and their data blobs are:
    • file (called blob in git): the file content
    • tree: list of filenames + references to the corresponding objects
    • commit:
      • commit message, author, date and other metadata
      • reference to the file tree
      • references to parent commit(s)
  • note that this has the following important implications:
    • commits are snapshots that have direct knowledge of the entire working directory; they are not implemented as changesets.
    • since hashes are deterministic, commit IDs are determined by contents, parents and other metadata, and:
    • tree/file IDs are determined by contents, which means that identical files will only be stored once on disk
    • using a cryptographic hash (more or less):
      • ensures that it is computationally infeasible to generate cyclic histories
      • can be used to verify data integrity
      • allows detecting attempts at tampering
  • git branches are just pointers (references) to commits. As such, branching is an extremely lightweight operation. (Compare this to SVN, where a branch is a copy of a directory. Here, while creating and switching branches are cheap operations as well, branches can lead to significant overhead when checking out the entire repository including all branches and tags.)
  • git has a garbage collector that can delete objects if they become unreferenced for too long (e.g. more than a week). Objects referenced directly or indirectly by a branch or tag will never be deleted by git.
  • As long as objects are not deleted they can always be checked out or queried otherwise. git reflog is a useful tool to get the hashes of commits that were deleted by accident.
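
The object model described above can be inspected with git cat-file and git rev-parse. The following sketch uses a throwaway repository with two made-up files of identical content and a demo identity; it shows the three object types and that identical content results in the same object ID:

```shell
set -eu
# demo identity, so that the sketch does not depend on your global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.org"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.org"
repo=$(mktemp -d)
cd "$repo"
git init -q

# two files with identical content
echo "same content" > one.txt
cp one.txt two.txt
git add one.txt two.txt
git commit -q -m "Two identical files"

# the commit references a tree, the tree references the file objects
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
file_type=$(git cat-file -t HEAD:one.txt)

# identical content is stored as one and the same object
id_one=$(git rev-parse HEAD:one.txt)
id_two=$(git rev-parse HEAD:two.txt)

git cat-file -p HEAD                 # tree reference, author, committer, message
git cat-file -p 'HEAD^{tree}'        # file names with the IDs of their objects
```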

Related topics

Adding an SSH key

If you plan to contribute, I recommend accessing the repository via SSH rather than HTTPS. This means that you will not have to enter your password on every push. If you haven't already, create an SSH key:

ssh-keygen -t rsa -b 4096

You can leave the password blank to store the key unencrypted (!) in your home folder. In this case, you won't need to enter a password when pushing (in my opinion, if someone gets access to your hard drive, all your data is compromised anyway). If you dislike storing the key without further protection, you can enter a password and consider

Copy the SSH public key (cat ~/.ssh/id_rsa.pub) and add it on your github settings page.
