Five Key Git Concepts Explained the Hard Way

If you’ve ever read a git man page, you’ll know that trying to understand git can be an intimidating experience.

There’s even a git man page generator that produces joke git pages:

If <upstream> is not specified, the upstream configured in
branch.<name>.remote and branch.<name>.merge options will be used 
(see git-config(1) for details) and the --fork-point option is 
assumed. If you are currently not on any branch or if the current
branch does not have a configured upstream, the rebase will abort.

git-land-remote lands some applied remotes over the packed applied branches, and it is in various cases a possibility that a filter-branched error must prevent staged cleaning of some named stages.

One of the above extracts is a joke, one is real…

So here are five core git concepts explained.

Hopefully after reading this the man pages will start to make more sense. If you’re confused by one I’ve missed, contact me to write it up for you (@ianmiell or LinkedIn).

This post uses the ‘hard way’ method to teach the concepts by having you type out the commands and think through what’s going on, without having to worry about breaking anything.

I use the same method to teach git in my book Learn Git the Hard Way.    


1) Reference

Many will know this already, but I need to make sure you know it because it’s so fundamental.

A ‘reference’ is a string that points to a commit.

There are four main types of reference: HEAD, Tag, Branch, and Remote Reference.

HEAD

HEAD is a special reference that always points to where the git repository currently is: the commit (usually reached via a branch) that you have checked out.

If you checked out a branch, it’s pointed to the last commit in that branch. If you checked out a specific commit, it’s pointed to that commit. If you checked out a tag, it’s pointed to the commit of that tag.

Every time you commit, the HEAD reference/pointer is moved from the old to the new commit. This happens automatically, under the hood.
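If you want to watch HEAD move, here is a minimal sketch you can run as a script. It works in a throwaway temporary directory, and the demo identity and the forced branch name master are my assumptions so that it behaves the same on any machine; they are not part of the walkthrough below.

```shell
#!/bin/sh
set -eu

# Scratch repository in a temporary directory (removed when the script exits).
scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
cd "$scratch"

# Placeholder identity so commits work even with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q .
# Force the branch name to 'master', whatever the local default is.
git symbolic-ref HEAD refs/heads/master

echo 1 > afile
git add afile
git commit -q -m firstcommit

head_target=$(git symbolic-ref HEAD)   # the branch HEAD is attached to
first_id=$(git rev-parse HEAD)         # the commit HEAD resolves to

echo 2 >> afile
git commit -q -am secondcommit
second_id=$(git rev-parse HEAD)        # HEAD now resolves to the new commit

echo "HEAD is attached to:       $head_target"
echo "HEAD before second commit: $first_id"
echo "HEAD after second commit:  $second_id"
```

git symbolic-ref tells you which branch HEAD is attached to, and git rev-parse tells you which commit that currently resolves to.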

Tag

A tag is a reference that points to a specific commit. Whatever else happens (and unlike the HEAD), that tag will stay pointed at the commit it was originally pointed at.

Branch

A branch is like a tag, but it moves: when you commit while on a branch, the branch moves to the new commit along with the HEAD.

You can only be on one branch at a time.

Type out these commands and explain what’s going on. Take your time:

$ mkdir lgthw_origin
$ cd lgthw_origin
$ git init
$ echo 1 > afile
$ git add afile
$ git commit -m firstcommit
$ git log --oneline --decorate --all --graph
$ git branch otherbranch
$ git tag firstcommittag
$ git log --oneline --decorate --all --graph
$ echo 2 >> afile
$ git commit -am secondcommit
$ git checkout otherbranch
$ git log --oneline --decorate --all --graph
$ echo 3 >> afile
$ git commit -am thirdcommit
$ git log --oneline --decorate --all --graph

Now do it again and explain to someone else what’s going on.
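If you want to check your explanation, one way is to ask git which commit each reference resolves to at each step. The script below is a sketch along the lines of the listing above, run in a throwaway directory; the demo identity and the forced master branch name are assumptions of mine.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
cd "$scratch"

# Placeholder identity so commits work even with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name

echo 1 > afile
git add afile
git commit -q -m firstcommit
git branch otherbranch
git tag firstcommittag
first_id=$(git rev-parse HEAD)

# Commit on master: only master (and HEAD) should move.
echo 2 >> afile
git commit -q -am secondcommit
master_id=$(git rev-parse master)
other_id=$(git rev-parse otherbranch)
tag_id=$(git rev-parse firstcommittag)

# Commit on otherbranch: now otherbranch moves and master stays put.
git checkout -q otherbranch
echo 3 >> afile
git commit -q -am thirdcommit
master_after=$(git rev-parse master)
other_after=$(git rev-parse otherbranch)
tag_after=$(git rev-parse firstcommittag)

echo "firstcommit:  $first_id"
echo "tag:          $tag_id -> $tag_after"
echo "master:       $master_id -> $master_after"
echo "otherbranch:  $other_id -> $other_after"
```

The tag should print the same id twice, while each branch only changes when you commit while on it.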

Remote Reference

A remote reference is a reference to code that’s from another repository. See below for more on that…

2) ‘Detached Head’

Now that you know what HEAD is, understanding what a ‘detached head’ is will be much easier.

A ‘detached head’ is the state a git repository is in when a commit is checked out but the HEAD has no branch associated with it.

Continuing from the above listing, type this in:

$ git checkout firstcommittag

You get that scary message:

Note: checking out 'firstcommittag'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

git checkout -b <new-branch-name>

HEAD is now at 1b1499c... firstcommit

but if you follow the instructions:

$ git log --oneline --decorate --all --graph
$ git checkout -b firstcommitbranch
$ git log --oneline --decorate --all --graph

you can figure out what’s going on. There was a tag, but no branch at that commit, so the HEAD was detached from a branch.
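You can also ask git directly whether the HEAD is detached. This sketch relies on git symbolic-ref -q HEAD exiting with a non-zero status when HEAD is not attached to a branch. It runs in a throwaway directory, and the demo identity and forced master branch name are assumptions.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
cd "$scratch"

# Placeholder identity so commits work even with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name

echo 1 > afile
git add afile
git commit -q -m firstcommit
git tag firstcommittag
echo 2 >> afile
git commit -q -am secondcommit            # master moves on, the tag does not

# Report whether HEAD is attached to a branch or detached.
head_state() {
    if git symbolic-ref -q HEAD >/dev/null; then
        echo attached
    else
        echo detached
    fi
}

git checkout -q firstcommittag            # a tag, not a branch
state_on_tag=$(head_state)

git checkout -q -b firstcommitbranch      # create a branch at this commit
state_on_branch=$(head_state)
current_branch=$(git symbolic-ref --short HEAD)

echo "after checking out the tag: $state_on_tag"
echo "after checkout -b:          $state_on_branch ($current_branch)"
```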

3) Remote Reference

A remote reference is a reference to a commit on another git repository.

$ cd ..
$ git clone lgthw_origin lgthw_cloned
$ cd lgthw_cloned
$ git remote -v
$ git log --oneline --decorate --all --graph

The log graph looks different, doesn’t it?

Compare that to the git log output in the other folder and think about how they differ. What word do you see multiple times in the output that you didn’t see before?

The cloned repo has its own copy of the branch (firstcommitbranch) and tag (firstcommittag) because that’s where the repository’s HEAD was when you cloned it.

$ git branch -a

shows all the branches visible in this repository, both local and remote.

Compare that to the output of the same command in the original folder. How does it differ?

Now check out your local master:

$ git checkout master

and you get a message saying:

Branch master set up to track remote branch master from origin.
Switched to a new branch 'master'

So you’ve got a local reference master which ‘tracks’ the master in the remote repository. The local reference is master, and the remote reference is origin/master. Git assumed you meant your local master to track the remote master.

The two branches look the same, but they are linked only by the configuration of this repository.
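To see that configuration for yourself, here is a sketch that clones a small repository and prints the settings that link the local master to origin/master. It uses its own throwaway origin and clone (with the same folder names as above, but inside a temporary directory), and the demo identity and forced master branch name are assumptions.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
cd "$scratch"

# Placeholder identity so commits work even with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny origin repository with one commit on master.
git init -q lgthw_origin
cd lgthw_origin
git symbolic-ref HEAD refs/heads/master   # force the branch name
echo 1 > afile
git add afile
git commit -q -m firstcommit
cd ..

git clone -q lgthw_origin lgthw_cloned
cd lgthw_cloned

# The 'tracking' link is just two config entries in the clone.
tracked_remote=$(git config branch.master.remote)
tracked_ref=$(git config branch.master.merge)
# The same information, resolved to a remote reference name.
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')

echo "branch.master.remote = $tracked_remote"
echo "branch.master.merge  = $tracked_ref"
echo "upstream of master   = $upstream"
```

These are the same branch.<name>.remote and branch.<name>.merge options that the real man page extract at the top of this post refers to.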

$ cd ../lgthw_origin
$ git checkout master
$ echo origin_change >> afile
$ git commit -am 'Change on the origin'

Then go back to the cloned repository and fetch the changes from the origin:

$ cd ../lgthw_cloned
$ git fetch origin
$ git log --oneline --decorate --all --graph

Can you see what happened to your local master branch, and what happened to the origin’s? Why are they now separate?

Note that you didn’t git pull the change. git pull does a fetch and a merge, and we don’t want to confuse things here by skipping steps and making it look like magic.

In fact, git pull is best avoided when you are learning git…
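If you want to confirm what a fetch does and doesn’t touch, this sketch records where the local master is before and after a fetch, and counts how many commits origin/master is ahead. It builds its own throwaway origin and clone, and the demo identity and forced master branch name are assumptions.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
cd "$scratch"

# Placeholder identity so commits work even with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Origin with one commit, then a clone of it.
git init -q lgthw_origin
cd lgthw_origin
git symbolic-ref HEAD refs/heads/master   # force the branch name
echo 1 > afile
git add afile
git commit -q -m firstcommit
cd ..
git clone -q lgthw_origin lgthw_cloned

# A new commit that exists only on the origin.
cd lgthw_origin
echo origin_change >> afile
git commit -q -am 'Change on the origin'
cd ../lgthw_cloned

local_before=$(git rev-parse master)
remote_before=$(git rev-parse origin/master)
git fetch -q origin
local_after=$(git rev-parse master)
remote_after=$(git rev-parse origin/master)

# Commits reachable from origin/master but not from local master.
ahead=$(git rev-list --count master..origin/master)

echo "local master:  $local_before -> $local_after"
echo "origin/master: $remote_before -> $remote_after"
echo "origin/master is $ahead commit(s) ahead of master"
```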


If you like this post, you’ll like my book Learn Git the Hard Way

It covers all this and much more in a similar style.


4) Fast Forward

Your git log graph should have looked like this:

* 90694b9 (origin/master) Change on the origin
* d20fc9a (HEAD -> master) secondcommit
| * 2e7ae21 (origin/otherbranch) thirdcommit
|/ 
* 6c14f2f (tag: firstcommittag, origin/firstcommitbranch, origin/HEAD, firstcommitbranch) firstcommit

(Your ids may differ from the above – otherwise it should be the same.)

Now, do you see how the Change on the origin commit is not on a separate branch from your local HEAD/master commit secondcommit – it sits directly on top of it, in a ‘straight line’ from the firstcommittag tag?

That means that if you ‘merge’ origin/master into your local master, git can figure out that all it needs to do is move the HEAD and master reference to where the origin/master branch is and its ‘merge’ job is done.

$ git merge origin/master
$ git log --oneline --decorate --all --graph

This is all a ‘fast forward’ is: git saw that there’s no need to do any merging, it can just ‘fast forward’ the references to the point you are merging to. Or if you prefer, it just moves the pointers along rather than creating a new merge commit.

We just did a git pull, by the way. A git pull consists of a git fetch and a git merge. Breaking it down into these two steps helps reduce the mystery of why things can go wrong.

As an exercise, after finishing this article do the whole exercise again, but make a change to both origin/master and master and then do the fetch and merge to see what happens when a fast-forward is not possible.
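Before you try that, here is a sketch of how you might ask git whether a fast-forward is possible at all. The idea is that a fast-forward works only when your local master is an ancestor of origin/master, which git merge-base --is-ancestor can test. The script uses its own throwaway origin and clone; the demo identity, the forced master branch name, and the commit messages in the second half are assumptions.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
cd "$scratch"

# Placeholder identity so commits work even with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Origin with one commit, then a clone of it.
git init -q lgthw_origin
cd lgthw_origin
git symbolic-ref HEAD refs/heads/master   # force the branch name
echo 1 > afile
git add afile
git commit -q -m firstcommit
cd ..
git clone -q lgthw_origin lgthw_cloned

# Report whether local master can be fast-forwarded to origin/master.
can_fast_forward() {
    if git merge-base --is-ancestor master origin/master; then
        echo yes
    else
        echo no
    fi
}

# Case 1: only the origin has a new commit.
cd lgthw_origin
echo origin_change >> afile
git commit -q -am 'Change on the origin'
cd ../lgthw_cloned
git fetch -q origin
ff_straight_line=$(can_fast_forward)
git merge -q --ff-only origin/master      # refuses unless a fast-forward works
merge_commits=$(git rev-list --merges --count master)

# Case 2: both sides have a new commit, so the histories have diverged.
echo clone_side > anewfile
git add anewfile
git commit -q -m 'Change on the clone'
cd ../lgthw_origin
echo origin_side >> afile
git commit -q -am 'Another change on the origin'
cd ../lgthw_cloned
git fetch -q origin
ff_diverged=$(can_fast_forward)

echo "fast-forward possible (origin ahead only): $ff_straight_line"
echo "merge commits created by the fast-forward: $merge_commits"
echo "fast-forward possible (both sides changed): $ff_diverged"
```

The script stops short of doing the second merge, so the exercise above is still yours to do.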

5) Rebase

master and origin/master are now in sync, so now run these commands to see what a rebase is:

$ cd ../lgthw_origin 
$ git status
$ echo origin_change_rebase >> afile 
$ git commit -am 'origin change rebase' 
$ git log --oneline --decorate --all --graph 

OK so far? You’ve made a change on master on the origin repo. Now make a separate change in the cloned repo, then fetch and rebase:

$ cd ../lgthw_cloned 
$ echo cloned_change_rebase >> anewfile 
$ git add anewfile 
$ git commit -m 'cloned change rebase in anewfile' 
$ git log --oneline --decorate --all --graph 
$ git fetch origin 
$ git log --oneline --decorate --all --graph 
$ git rebase origin/master 
$ git log --oneline --decorate --all --graph

Can you see what’s happened?

If not, have a close look at the last two git log outputs.

That’s what a rebase is – it takes a set of commits and moves (or ‘re-bases’) them to another commit. The moved commits are re-created rather than literally moved, so they get new ids.
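If the two log outputs are hard to compare by eye, this sketch does the comparison for you: it records the id of the local commit before and after the rebase, and checks what its parent is afterwards. It builds its own throwaway origin and clone, and the demo identity and forced master branch name are assumptions.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
cd "$scratch"

# Placeholder identity so commits work even with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Origin with one commit, then a clone of it.
git init -q lgthw_origin
cd lgthw_origin
git symbolic-ref HEAD refs/heads/master   # force the branch name
echo 1 > afile
git add afile
git commit -q -m firstcommit
cd ..
git clone -q lgthw_origin lgthw_cloned

# One new commit on each side, touching different files.
cd lgthw_origin
echo origin_change_rebase >> afile
git commit -q -am 'origin change rebase'
cd ../lgthw_cloned
echo cloned_change_rebase >> anewfile
git add anewfile
git commit -q -m 'cloned change rebase in anewfile'

id_before=$(git rev-parse HEAD)
git fetch -q origin
git rebase -q origin/master
id_after=$(git rev-parse HEAD)

parent_after=$(git rev-parse HEAD~1)       # what the commit now sits on
origin_master=$(git rev-parse origin/master)
subject_after=$(git log -1 --format=%s)    # same message, different commit

echo "local commit before rebase: $id_before"
echo "local commit after rebase:  $id_after ($subject_after)"
echo "its parent is now:          $parent_after"
echo "origin/master is:           $origin_master"
```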

 





14 Replies to “Five Key Git Concepts Explained the Hard Way”

  1. Solid write-up.

    My only big nit so far is how the git docs and tutorials commonly mix terms: “git repository” is the database of commits, and the actual directory with some revision of checked out files is usually a “workspace”. They do not necessarily map 1:1 (e.g. repos without a workspace, or many workspaces referring to the same index to download once and then check out several different branches). It really helps students when we name different concepts with different words.
