@fpl9000
Last active April 22, 2025 23:12
Git Overview

Copyright © 2025 Francis Litterio, [email protected]

This is a technical overview of Git concepts and commands. It is structured as a tutorial, where each section builds on previous sections.

This document is under active development. The following topics remain to be covered:

  • Monorepos vs submodules.
  • Sparse-checkout.
  • Detached HEAD.
  • Reflogs.
  • Rebasing.
  • Squashing.

Background

Git History

Git was invented in 2005 by Linus Torvalds, who created the Linux kernel. Git was developed to replace BitKeeper, the previous revision control system for the Linux kernel.

Git is an open source project under active development. It is the world's most widely-used revision control system. As of 2022, nearly 95% of developers report using it.

Git is available as both a command-line tool and a graphical tool. This document focuses on the command-line tool, as it is the most widely used.

Some older Git commands have been superseded by newer commands, usually for the purpose of simplifying complex commands, but the older commands continue to work. Where there are multiple commands to do the same task, this document describes all of them.

Git Documentation

The Git documentation includes a tutorial, reference manual, and videos. You can find all of these at https://git-scm.com/doc.

If you're new to Git, a good place to start is the book Pro Git, which can be read online at https://git-scm.com/book.

If you already know how to use Git, the Git reference manual is at https://git-scm.com/docs. Beware that the reference manual is very terse and does not present information in a tutorial format. It assumes you know Git already and just need to look up the details you don't remember.

The Git reference manual is also available interactively using command git help ..., as follows:

  • git help shows a brief usage summary of the git command.
  • git help git shows a high-level overview of Git.
  • git help -a shows a one-line summary of every Git command.
  • git help COMMAND shows help for any of the commands listed by git help -a.
  • git help -g shows a list of Git concept guides.
  • git help CONCEPT shows help for any of the concepts listed by git help -g.

On UNIX/Linux systems, every Git command has a man page named git-COMMAND, where COMMAND is the Git command name.

  • For example, enter command man git-status to read the man page for Git's status command.
  • The man page for git-COMMAND is identical to the help shown by git help COMMAND, so use whichever you prefer.

File Names in Git Commands

Many Git commands accept one or more file names as arguments at the end of the command line. If the names of the files look like Git commands, command-line switches, commit identifiers, or any other special Git command syntax, Git may misinterpret those arguments instead of treating them as file names.

The solution is to add the argument -- before the file names. For example, git diff -- myfile.cpp. The -- tells Git that all subsequent arguments are file names.

For clarity, this document will use -- to indicate the start of file names, but you only need to use -- if Git fails to recognize the file names you provide.
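
The effect of -- can be tried in a scratch repo. This is only a sketch: it assumes Git 2.28 or newer (for git init -b), and the file name master and the identity settings are contrived for the demonstration.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"            # scratch directory
git init -q -b master
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

# A tracked file whose name collides with the branch name.
echo "notes" > master
git add -- master
git commit -q -m "Add a file named master"

# With --, Git treats 'master' as a file name and lists the commits touching it.
git log --oneline -- master
```

Without the --, Git would have to guess whether master means the branch or the file.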

Basic Concepts

This section covers basic concepts about Git and its most commonly used commands. Following this section is the Advanced Concepts section, which covers more complex topics that require an understanding of these basics.

The Repository

A Git repository (commonly called a repo) is a directory tree containing a mixture of revision controlled files and other files. This directory can be located anywhere on your hard drive. You can even move it after it has been created.

The revision controlled files in a repo are tracked by Git. All other files in a repo are untracked.

  • Tracked and untracked files exist alongside each other in the repo.

  • You can't tell a tracked file from an untracked file just by looking at it. Use command git status FILENAME to see if the specified file is tracked or untracked.

  • Git generally ignores untracked files, but see section Ignored Files for situations where Git operates on untracked files and how you can prevent this.
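
As a rough illustration of telling tracked from untracked files, the following sketch builds a scratch repo with one file of each kind. The file names are hypothetical, and the --short switch (a compact output format) is not otherwise covered in this document.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"            # scratch directory
git init -q -b master                      # -b needs Git 2.28 or newer
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "tracked" > tracked.txt
git add -- tracked.txt
git commit -q -m "Add tracked.txt"
echo "scratch" > untracked.txt

git status --short -- untracked.txt   # prints "?? untracked.txt"
git status --short -- tracked.txt     # prints nothing: tracked and unchanged
git ls-files                          # lists tracked files only
```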

Directory .git in the root of the repo contains Git metadata about the repo. Very rarely, you might look in the .git directory, but most of the time you should ignore it. Never change anything in the .git directory!

A repo contains one or more branches. Each branch contains a different history of changes to the tracked files in the repo. A branch can contain tracked files and directories that don't exist in other branches.

  • Every branch has a name. The main branch is named master by default.

  • You can rename a branch at any time, but section Renaming a Branch will show that doing so can make work for your collaborators and should be avoided.

  • Unlike some other revision control systems, Git branches are not stored in separate directories. Instead, you only see the tracked files from one branch at a time. You must explicitly switch between branches to see and work on files in a different branch.

Branches are discussed in detail in section Branches.

How Git Stores Tracked Files

In each branch, Git stores the tracked files in the following places:

WORKING FILES  <->  INDEX (or STAGING AREA)  <->  LOCAL REPO  <->  REMOTE REPO

Above, the bidirectional arrows show how Git's workflow copies your changes between these places.

Important

A given file can have different contents in each of these places. It can even exist in some of these places but not in others. All of these places exist in each branch, so it's important to understand this workflow.

Let's look at how each of these places is used and when your changes are copied between them.

Working Files

Working files are tracked files that you can work on. You create, modify, and delete working files.

  • Only one branch's tracked files are visible at a time.

  • To switch to a different branch, you checkout the other branch with command git checkout BRANCHNAME or the newer command git switch BRANCHNAME.

A checkout changes the working files to the other branch's tracked files, except for any modified working files, which Git carries over unchanged. So be careful not to have any when you do a checkout, otherwise you'll be left with a mixture of tracked files from different branches. Git lists the carried-over files when this happens, and it refuses to switch branches at all if the checkout would overwrite your modifications. The solution is described in the next section, The Index.

Note

Git's use of the term checkout differs from what most revision control systems mean by that word. Other revision control systems use checkout to mean obtain permission to modify a file, but Git is a decentralized revision control system, so there is no central authority to grant that permission. Anyone can modify any file in their Git repo.

Switching to another branch can cause any of the following things to happen:

  • The contents of some working files change.
  • Some working files disappear.
  • Some new working files appear.
  • Directories containing working files may also appear and disappear.
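
The effects listed above can be observed in a scratch repo. This sketch assumes Git 2.28 or newer. The branch name feature and the file names are made up for the example, and it uses git add and git commit, which are described in later sections.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"            # scratch directory
git init -q -b master
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "v1" > app.txt
git add -- app.txt
git commit -q -m "Add app.txt"

git switch -q -c feature                   # create a branch and switch to it
echo "v2" > app.txt
echo "extra" > feature-only.txt
git add -- app.txt feature-only.txt
git commit -q -m "Work on feature"

git switch -q master
cat app.txt                                # v1
ls                                         # feature-only.txt has disappeared
git switch -q feature
cat app.txt                                # v2
ls                                         # feature-only.txt is back
```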

Git cannot track an empty directory. It only tracks files. If you want to track an empty directory, create a tracked file in it as a placeholder (section The Index describes how to create tracked files).
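
A small sketch of the placeholder approach follows. The directory name logs and the file name .gitkeep are assumptions: .gitkeep is only a common convention, not something Git treats specially.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory
git init -q -b master              # -b needs Git 2.28 or newer

mkdir logs
git add -- logs                    # nothing is staged: the directory is empty
git status --short                 # prints nothing

touch logs/.gitkeep                # placeholder file
git add -- logs/.gitkeep
git status --short                 # the placeholder is now staged
```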

If you switch branches frequently, you can avoid switching between branches by having a separate copy of the repo for each branch. All branches still exist in all copies of the repo, but each copy of the repo has a different branch checked out at all times.

The Index

The index (also known as the staging area) is a compressed copy of all tracked files in the current branch. When you checkout another branch, the index changes just as the working files do.

  • The index seems redundant. Why have two copies of the tracked files: the working files and the index?

  • Because the working files are for you to create, modify, and delete. The index is Git's copy of the current branch's tracked files. Also, the index is where you gather related changes so they appear as a single change in the branch's history.

  • The index is stored in the .git directory, along with the rest of the metadata for your repo.

As you make changes to the working files, you will eventually stage the modified, new, and deleted working files into the index. You give your changes to Git by staging them into the index, as follows:

  • To stage new or modified files, use command git add -- FILE1 FILE2 ....

  • To stage the deletion of tracked files, use command git rm -- FILE1 FILE2 ..., which removes both the working files and the versions in the index. This will permanently destroy your changes in those files, so be careful!

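  • To rename a tracked file, use command git mv OLDNAME NEWNAME, which renames the working file and stages the rename in the index. This is equivalent to renaming the working file yourself, then doing git rm OLDNAME followed by git add NEWNAME. Git does not record renames explicitly. It detects them afterwards by comparing file contents, so the file's history can be followed either way.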
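
Here is a brief sketch of git mv and git rm in a scratch repo, assuming Git 2.28 or newer. The file names are hypothetical, and the commits use git commit, which is described in a later section.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"            # scratch directory
git init -q -b master
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "hello" > old.txt
git add -- old.txt
git commit -q -m "Add old.txt"

git mv old.txt new.txt                     # rename the working file, stage the rename
git status --short                         # the rename is reported as staged
git commit -q -m "Rename old.txt to new.txt"

git rm -q -- new.txt                       # delete the working file, stage the deletion
git status --short                         # the deletion is reported as staged
```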

Staged changes wait in the index to be part of a future commit into the current branch in the repo (commits are described in section Committing to the Local Repo). The typical workflow is to stage a series of related changes into the index, then later commit them all at once to the local repo.

  • A file can be staged repeatedly. A more recently staged file replaces the previously staged copy in the index.

  • After a file is staged, the copy in the working files and the copy in the index are identical, until you make more changes to the working file.
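
To see a restaged file replace the earlier staged copy, consider this sketch. The file name is hypothetical, and git show :FILE (which prints the copy of FILE held in the index) is a command not otherwise covered in this document.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory
git init -q -b master              # -b needs Git 2.28 or newer

echo "first draft" > notes.txt
git add -- notes.txt               # the index now holds "first draft"

echo "second draft" > notes.txt    # the working file moves ahead of the index
git status --short                 # shows both a staged and an unstaged change
git show :notes.txt                # the index still holds "first draft"

git add -- notes.txt               # restage: replaces the earlier staged copy
git show :notes.txt                # the index now holds "second draft"
```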

For convenience, files can be staged and committed in one operation. However, it is recommended that you stage many related changes and then make a single commit to keep the repo's commit history uncluttered. This also simplifies reverting changes.

Note

The index is often called the staging area, but don't interpret this to mean the index contains only staged files. It contains a compressed copy of all your tracked files, including the staged changes.

Earlier, you learned that if you have modified working files in branch A and you checkout branch B, the modified files do not change, leaving you with a mixture of tracked files from both branches. Git warns you if this happens. Staging the modified files is not enough, because staged changes are carried across a checkout just as unstaged changes are. The solution is to checkout branch A, stage the modified files into the index, and commit them (see section Committing to the Local Repo). After this, when you checkout branch B, all of the working files will be from branch B. If you're not ready to commit your changes in branch A, see section Stashing for an alternative solution.

The Local Repo

The local repo is a compressed copy of all commits in the branch along with the history of all changes to each file.

You commit your staged changes with command git commit. This copies all the staged changes to the local repo. See section Committing to the Local Repo for more details about committing changes.

  • This two-step process of staging changes to the index then committing staged changes to the local repo is different from most other revision control systems, where changes are submitted using a single check-in operation.

  • The local repo exists on the computer where you do your work. Git is a distributed revision control system, so each collaborator's local repo contains a copy of every commit in every branch.

  • The local repo (and the index) are accessed through the filesystem, which means you can stage and commit changes offline. You only need to be connected to a network to copy your commits to the remote repo, which is covered in the next section.

As contributors do work in parallel, their local repos accumulate new commits that don't yet exist in others' local repos. Sections Pushing Commits and Fetching/Pulling Commits describe how collaborators' local repos are synchronized to contain the same set of commits.

Caution

The local repo is stored in the .git directory, but it is common (including in this document) to use the term local repo to mean the entire directory containing all of the working files plus the .git directory where the index and local repo reside. Usually you can intuit which meaning of local repo is intended from the context in which it is used.

Relationships Between Repositories

The previous sections describe the internal structure of your repo. This section describes the relationship between your local repo and a remote repository.

The Remote Repo

The remote repo, often called a remote, is used to collaborate with other people. To do this, a group of people each connect their local repo to the same remote repo.

You push your commits to the remote repo to share them with others. You pull other people's commits from the remote repo to get them into your local repo. Section Git Workflow discusses pushing and pulling in detail.

The working files, the index, and the local repo are all located on your computer under the same directory, but the remote repo usually exists on a Git server, so it can be accessed by multiple people. The remote repo can be in these places:

  • On a server within your company.

  • On a server hosted by an organization such as GitHub, which hosts many thousands of remote repos for some of the biggest projects in the world (e.g., Linux, .NET, etc.). See section GitHub for details.

  • Alternatively, the remote repo can simply be in a directory on a file server, in which case there is no Git server process to handle network communication with Git clients. This may be less efficient than running a Git server within your organization, but it is simpler to manage.

Typically you do day-to-day work in a local repo, not in a remote repo. You do this by cloning the remote repo to create your local repo (and its index and working files). Details about cloning are in section Creating a Local Repo.

A single-user project doesn't usually need a remote repo, but many single-user projects have remote repos on GitHub, even though only one person works on them.

How to Refer to a Remote Repo

A remote repo is typically identified by a URL and a short name. The URLs are global: everyone uses the same URLs. The short name is chosen by you, and only you use it.

Here are the URLs for some interesting remote repos:

The short name of a remote is a convenient name chosen by you, the owner of the local repo.

  • The default short name of the remote is origin, but this can be changed at any time.

  • The scope of a short name is a local repo, so different local repos can use the same short names.

Most of the above URLs start with https:, which provides encrypted, authenticated, read/write access to the remote. Some remote repos also provide access via URLs starting with git:, which is neither encrypted nor authenticated. A git: URL gives you anonymous read-only access to the remote. Also, a git: URL uses TCP port 9418 to access the remote, which corporate firewalls may block.

The owner of a remote repo should document the Git URL(s) to use when cloning the repo, both for read-only and read/write access.

The full set of supported URL formats is described in section GIT URLS in the output of git help pull.

Git supports a variety of authentication mechanisms. See the output of git help gitcredentials for details.
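
As a sketch of how short names map to URLs, the commands below attach a remote to a scratch repo and then rename it. The URL is hypothetical and is never contacted. The commands git remote -v, git remote rename, and git remote get-url are not otherwise covered in this document.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory
git init -q -b master              # -b needs Git 2.28 or newer

# Hypothetical URL: no network access happens until you fetch or push.
git remote add origin https://example.com/team/project.git
git remote -v                      # lists each short name with its URL

git remote rename origin upstream  # the short name is yours to change
git remote get-url upstream        # prints the URL behind the short name
```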

GitHub

GitHub has its own documentation, so this section contains just a brief overview.

GitHub is a company that hosts remote repos so that people can collaborate. It is one of the most widely-used remote repo hosting companies, but there are others. It's free for personal use.

GitHub users can fork someone else's remote repo hosted on GitHub to create their own personal copy of the remote. The copy is called a fork of the original remote repo. The fork is also a remote repo hosted on GitHub that can be independently forked. Some GitHub remote repos, such as Linux, have over ten thousand forks!

In this diagram, the remote repos on the left are forked from the remote on the right, and the bidirectional arrows show how commits are copied between the repos:

LOCAL REPOs <-> REMOTE REPO <->\
 LOCAL REPOs <-> REMOTE REPO <->\
  LOCAL REPOs <-> REMOTE REPO <->\
                                  + <-> REMOTE REPO
  LOCAL REPOs <-> REMOTE REPO <->/
 LOCAL REPOs <-> REMOTE REPO <->/
LOCAL REPOs <-> REMOTE REPO <->/

All of the above remote repos are hosted at GitHub.

  • The remote on the right is the official repo and is writable only by its maintainers.

  • The remotes on the left belong to contributors, but they are also hosted at GitHub.

  • The local repos exist on the contributors' own computers.

When a contributor produces a commit of value, they push it to their personal remote (more on this in section Pushing Commits) and submit a GitHub pull request to the maintainers asking them to evaluate the commit. The maintainers can choose to pull the commit into the official repo (more on this in section Fetching/Pulling Commits).

Forking and pull requests are not part of Git. GitHub invented these things as part of their added layer of functionality.

Git Workflow

This section describes how to create and manipulate local repos, create and manage commits, and share commits with others.

Creating a Local Repo

To use Git, you need a repository. There are three ways to create one:

  1. Initialize an existing directory. Do this if you already have the files you want to put under revision control, or if you have no files but plan to create them.

  2. Clone a remote repo. Do this if you need to collaborate with others on files that are already under revision control in a remote repo.

  3. Copy an existing local repo.

Let's look at each method.

Initializing an Existing Directory

You initialize an existing directory by executing command git init in that directory.

  • Command git init creates the .git directory, puts some Git metadata in the .git directory, and leaves you with no tracked files, no working files, an empty index, and an empty local repo.

  • After you initialize a local repo, you are ready to stage and commit new files.

  • If the directory was not empty when you initialized it, the files in it remain as untracked files until you choose to stage some of them.

A newly initialized local repo contains just one branch named master by default. To choose a different name for the master branch, use git init -b BRANCHNAME.

A newly initialized local repo has no associated remote repo, but it can be given one at any time with command git remote add SHORTNAME URL, where SHORTNAME is your chosen short name and URL references an existing remote repo.
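
A minimal sketch of initializing a directory that already contains a file. The branch name trunk and the file main.c are arbitrary choices for the example, and git symbolic-ref --short HEAD is used here only as one way to print the current branch name.

```shell
set -e
dir=$(mktemp -d) && cd "$dir"                      # stand-in for an existing directory
echo "int main(void) { return 0; }" > main.c      # a file that predates the repo

git init -q -b trunk             # -b needs Git 2.28 or newer
git status --short               # main.c is reported as untracked
git symbolic-ref --short HEAD    # prints the name of the initial branch
```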

But if you know you will be working with a remote repo, it's more likely you will clone that remote repo.

Cloning a Remote Repo

You clone a remote repo with command git clone REMOTEURL, where REMOTEURL identifies a Git remote repo on a server. If the remote repo is just a directory on a file server, REMOTEURL would be the absolute pathname of that directory, which by convention has a name ending in .git.

Cloning a remote repo creates a new directory in your current working directory named after the remote repo. If you want to choose your own name for the new directory, use git clone REMOTEURL DIRNAME, where DIRNAME is the name of the directory you want Git to create.

  • After you clone a remote repo, the new local repo contains a copy of every branch in the remote, with the working files and index matching the remote's current branch (usually the master branch).

  • Cloning connects the local repo to the remote repo. This allows you to share work with collaborators who have cloned the same remote repo by pushing and pulling commits to and from the remote.

  • After you clone a remote repo, the remote is called the origin of the local repo. The local repo remembers which remote is its origin.
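
Cloning can be tried without a server by using a directory as the remote. In this sketch, the directory names seed, project.git, and work are hypothetical, and the --bare and --allow-empty switches (used only to build the stand-in remote) are not otherwise covered in this document.

```shell
set -e
base=$(mktemp -d) && cd "$base"            # scratch directory

# Build a stand-in remote: a repo with one commit, stored as project.git.
git init -q -b master seed                 # -b needs Git 2.28 or newer
git -C seed -c user.name="Example User" -c user.email="user@example.com" \
    commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed project.git

git clone -q "$base/project.git"           # creates directory ./project
git clone -q "$base/project.git" work      # creates directory ./work instead

git -C work remote -v                      # the remote's short name is origin
git -C work log --oneline                  # the history came along with the clone
```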

Copying a Local Repo

A less common way to create a local repo is simply to copy someone else's local repo.

  • This works best if the other person's local repo has no staged files in the index and no modified working files, otherwise your copy will have them too, which is probably not what you want (unless you are taking over their work).

  • This works because, by default, there is no per-user information in a local repo. Your personal Git configuration data is stored in file .gitconfig in your home directory (on Windows, this is your user profile folder).

You can make copies of your own local repos. This allows you to do any of the following:

  • Make a backup of your local repo.

  • Have different branches checked out at the same time, one in each copy of your repo, so you don't have to do git checkout BRANCHNAME to switch between branches.

  • Work on multiple computers by having a copy of a local repo on both your personal computer and your work computer. If you change the same file in both repos, you will eventually have to merge those changes (see sections Fetching/Pulling Commits and Merges for details about merging).

  • Migrate a local repo to another computer: just copy it to the other computer, delete the original (eventually), and pick up where you left off.
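
Copying a local repo is an ordinary recursive directory copy, as this sketch suggests. The directory names original and backup are hypothetical.

```shell
set -e
base=$(mktemp -d) && cd "$base"            # scratch directory
git init -q -b master original             # -b needs Git 2.28 or newer
cd original
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
echo "hello" > README
git add -- README
git commit -q -m "Initial commit"
cd ..

cp -R original backup                      # the copy includes the .git directory
git -C backup log --oneline                # same history as the original
git -C backup status --short               # prints nothing: the copy is clean
```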

Committing to the Local Repo

When your staged changes are ready, you commit them to the current branch in the local repo with command git commit. This creates a new commit in the history of the current branch.

  • Command git commit launches an editor for you to enter a message that describes your changes in the commit. To change the default editor (usually vi/vim), set Git configuration variable core.editor to the executable name of the editor you wish to use (see section Git Configuration Variables).

  • You can give the commit message on the command line using git commit -m "MESSAGE". The quotes around MESSAGE are necessary if it contains whitespace or shell metacharacters, which is likely.

You can stage and commit individual working files in one operation using command git commit -- FILE1 FILE2 ..., where the specified files are modified working files.

You can stage and commit all modified working files with command git commit -a. Note that if you omit the -a switch, nothing is staged — it just commits all currently staged files.

Staging and committing in one operation is not recommended. The preferred workflow is to stage multiple changes over time, then commit once for a group of related changes.
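
The three forms of git commit described above can be compared side by side. This is a sketch in a scratch repo with hypothetical file names and commit messages.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"            # scratch directory
git init -q -b master                      # -b needs Git 2.28 or newer
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "one" > a.txt
echo "one" > b.txt
git add -- a.txt b.txt
git commit -q -m "Add a.txt and b.txt"           # commit what is staged

echo "two" > a.txt
echo "two" > b.txt
git commit -q -m "Update a.txt only" -- a.txt    # stage and commit just a.txt
git status --short                               # b.txt is still modified

git commit -q -a -m "Update b.txt"               # stage and commit all modified files
git log --oneline                                # three commits, newest first
```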

What's in a Commit?

Each commit contains the following data:

  1. A 160-bit ID (a SHA-1 hash) that uniquely identifies the commit. The hash contains 40 hex digits, but is commonly abbreviated to the leftmost 7 hex digits.

  2. The SHA-1 hash of its parent commit, which is the commit chronologically preceding it in the branch. This is how commits are chained in historical order. Two exceptions to this rule are:

    • The very first commit in a repo has no parent.
    • A merge commit has two parents. Details about merge commits are in section Merges.

  3. A short message from the author describing the commit.

  4. A compressed copy of all tracked files in the branch at the time of the commit.

  5. Miscellaneous metadata, such as the commit time, the author's name and email, etc.

  6. An optional digital signature by the author.
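
Much of this data can be inspected directly. The sketch below uses git rev-parse and git cat-file, two low-level commands not otherwise covered in this document. The file name and commit messages are hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"            # scratch directory
git init -q -b master                      # -b needs Git 2.28 or newer
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "one" > file.txt
git add -- file.txt
git commit -q -m "First commit"
echo "two" > file.txt
git commit -q -a -m "Second commit"

git rev-parse HEAD            # the full ID of the latest commit
git rev-parse --short HEAD    # the abbreviated ID
git cat-file -p HEAD          # raw commit: tree, parent, author, committer, message
```

In the raw output, the tree line refers to the snapshot of the tracked files, and the parent line holds the ID of the first commit.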

Amending a Commit

Sometimes you make a commit, but then you realize you left a change out of the commit or included a change that you didn't intend to include. This is a good use case for amending a commit with git commit --amend.

When you amend a commit, you replace the latest commit in the current branch with a new commit. This is known as re-writing history, and it is not always appropriate.

For instance, you should not amend a commit if you have already pushed the commit to a remote repo, because your collaborators may already have begun making changes based on your pushed commit (see section Pushing Commits). Amending a pushed commit will confuse them, and likely cause merge conflicts when they next pull from the remote. In this use case, you should simply make a new commit and inform your collaborators.

If you make a commit you want to amend, and you have not yet pushed that commit to a remote, and you have not yet made a newer commit, simply change your working files to have the changes that were missing from the commit, stage them with git add ..., and amend the commit with git commit --amend -m 'MESSAGE'. Alternatively, you can stage and commit in one operation with git commit -a --amend or git commit --amend -- FILE1 FILE2 ....

When you amend a commit, don't even mention the commit being amended in the new commit message, because it will be replaced by the new commit. If the commit message doesn't need to change, add switch --no-edit to keep the commit message from the commit being amended.
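
A sketch of amending an un-pushed commit to add a forgotten file. The file names and commit message are hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"            # scratch directory
git init -q -b master                      # -b needs Git 2.28 or newer
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "part 1" > feature.txt
git add -- feature.txt
git commit -q -m "Add feature"
before=$(git rev-parse HEAD)

echo "part 2" > forgotten.txt              # the change left out of the commit
git add -- forgotten.txt
git commit -q --amend --no-edit            # replace the latest commit, keep its message
after=$(git rev-parse HEAD)

git log --oneline                          # still a single commit
echo "$before -> $after"                   # but its ID has changed
```

The changed ID is why amending counts as re-writing history: the original commit is replaced by a new one, not edited in place.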

Note

You can only amend the latest commit in the current branch. If you make commit A then commit B, but you forget that you made commit B and try to amend commit A, you will actually be amending commit B: your newly staged changes are folded into commit B, and commit B's message is replaced if you supply a new one. Git cannot detect this situation for you! See section Viewing the Commit History for how to check which commit is the latest in the current branch.

Pushing Commits

A group of people collaborate on a project by pushing commits from their local repos to a common remote repo. Use command git push to push commits from the current branch to the remote repo.

  • A push only copies commits from the current branch that do not yet exist in the remote.

  • A push is how you share your changes with other people. Staging changes to the index (with git add ...) and committing changes to your local repo (with git commit ...) do not share anything, because only you have access to your index and local repo.

  • See the output of git help push for how to push commits from every branch at once, but this is not advised, because it can cause other developers to do merges at unexpected times.
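
Pushing can be tried locally by using a directory as the remote. In this sketch, the directory names seed, remote.git, and work are hypothetical, and the --bare and --allow-empty switches are used only to build the stand-in remote.

```shell
set -e
base=$(mktemp -d) && cd "$base"            # scratch directory

# Stand-in remote with one commit, then a clone to work in.
git init -q -b master seed                 # -b needs Git 2.28 or newer
git -C seed -c user.name="Example User" -c user.email="user@example.com" \
    commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed remote.git
git clone -q remote.git work

cd work
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
echo "hello" > README
git add -- README
git commit -q -m "Add README"              # exists only in the local repo so far

git push -q                                # copy the new commit to the remote
git status                                 # reports the branch is up to date
```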

Of course, collaborating with others also requires getting their commits into your local repo. Next, we'll see how to do that.

Fetching/Pulling Commits

To collaborate with others, you need to get their commits from the remote repo. This is done with a fetch or pull operation.

This diagram shows four users (Joe, Sue, Pat, and Bob) each with their own local repos. They collaborate by sharing a common remote. This is a hub-and-spoke topology, where the remote is the hub and local repos are the spokes:

Joe:  WORKING FILES <-> INDEX <-> LOCAL REPO <->\
Sue:  WORKING FILES <-> INDEX <-> LOCAL REPO <-->\
                                                  + <-> REMOTE REPO
Pat:  WORKING FILES <-> INDEX <-> LOCAL REPO <-->/
Bob:  WORKING FILES <-> INDEX <-> LOCAL REPO <->/

The command git fetch will fetch all commits that you do not yet have in the local repo from all branches in the remote. If new branches have been created in the remote, it fetches them too.

  • To fetch just one branch, use git fetch REMOTE BRANCHNAME, where REMOTE is the short name of the remote, and BRANCHNAME is the name of the local and remote branch (assuming they have the same name).

  • If the local and remote branch names differ, use git fetch REMOTE RBRANCHNAME:LBRANCHNAME, where RBRANCHNAME is the name of the remote branch (the source), and LBRANCHNAME is the name of the local branch (the destination).

By default, git fetch prints only a brief summary of the branches it updated, and it prints nothing when there is nothing new to fetch. To see more detail, use git fetch --verbose. To see what would be fetched without actually fetching anything, use git fetch --dry-run.

What fetch Does Not Do

A fetch retrieves new commits from the remote repo, including new branches and the commits in them, but it does not alter your local branches, index, or working files. So you can't immediately see the changes from the commits that were fetched, nor can you see any new branches that were fetched. See section Remote-tracking Branches for information about where git fetch hides the commits it downloads.

  • To see the changes from fetched commits in your current branch, you must do a merge with command git merge. Details about merging are in section Merges.

  • The pair of operations fetch-then-merge is so common that command git pull does both in one operation. By default, a git pull is equivalent to git fetch followed immediately by git merge.

  • Note that git fetch fetches commits in all branches, but git merge only merges changes in the current branch. If you need to merge in other branches, you need to switch to each branch and do a git merge (or git pull) in each one.

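  • Keep in mind that Git rejects a push unless your local branch already contains every commit in the corresponding branch in the remote. If a collaborator has pushed since you last pulled, you must pull (fetch and merge) before you can push. Thus, all merges happen in someone's local repo and never in a remote repo.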
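
The difference between fetching and merging can be observed with two clones of a stand-in remote. This is a sketch: the directory names and identities are hypothetical, and origin/master is the name under which Git stores the fetched copy of the remote's master branch.

```shell
set -e
base=$(mktemp -d) && cd "$base"            # scratch directory
git init -q -b master seed                 # -b needs Git 2.28 or newer
git -C seed -c user.name="Example User" -c user.email="user@example.com" \
    commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed remote.git
git clone -q remote.git joe
git clone -q remote.git sue

# Joe shares a commit.
cd joe
git config user.name "Joe"
git config user.email "joe@example.com"
echo "from joe" > joe.txt
git add -- joe.txt
git commit -q -m "Add joe.txt"
git push -q

# Sue fetches: origin/master moves, but her master and working files do not.
cd ../sue
sue_before=$(git rev-parse master)
git fetch -q
sue_after_fetch=$(git rev-parse master)
git log --oneline master..origin/master    # the fetched commit, not yet merged
ls                                         # joe.txt is not here yet

git merge -q origin/master                 # bring the fetched commit into master
ls                                         # now joe.txt appears
```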

Another thing git fetch does not do is create a local branch when there is a new branch in the remote repo. So how do you see new branches after doing a fetch?

  • First, your collaborators should announce when they create new branches in the remote. If they don't, you can do git branch -r to list the remote branch names obtained by the most recent git fetch.

  • Then you can use git switch -c BRANCHNAME REMOTE/BRANCHNAME to create new local branch BRANCHNAME corresponding to the remote branch of the same name (and also switch to the new local branch). Here, REMOTE is the short name of the remote repo.

  • If you want to use a different name for the branch in your local repo than in the remote, do git switch -c LOCALNAME REMOTE/BRANCHNAME, where LOCALNAME is the name of the local branch to create.
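
The steps above can be sketched with two clones of a stand-in remote. The branch name feature, the directory names, and the identities are hypothetical.

```shell
set -e
base=$(mktemp -d) && cd "$base"            # scratch directory
git init -q -b master seed                 # -b needs Git 2.28 or newer
git -C seed -c user.name="Example User" -c user.email="user@example.com" \
    commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed remote.git
git clone -q remote.git joe
git clone -q remote.git sue

# Joe creates a branch and pushes it to the remote.
cd joe
git config user.name "Joe"
git config user.email "joe@example.com"
git switch -q -c feature
echo "wip" > feature.txt
git add -- feature.txt
git commit -q -m "Start feature"
git push -q origin feature

# Sue fetches, finds the new branch, and creates a matching local branch.
cd ../sue
git fetch -q
git branch                                 # local branches: only master
git branch -r                              # remote branches: includes origin/feature
git switch -q -c feature origin/feature
ls                                         # feature.txt is now a working file
```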

Git maintains a shared consensus among all contributors about the contents and shape of the commit graph. Some contributors may not see the most recent commits in a branch if they haven't pulled that branch in a while, but if all contributors were to do git pull in every branch, everyone's branches would contain the same commits.

When to Pull?

When to do a git pull depends on how much stability you need while you work in your repo. Each pull potentially changes the contents of your repo in a way that can interfere with your work.

  • If you work in a personal branch that you have not yet pushed to the remote, then the branch doesn't exist in the remote, so there's nothing to pull, because your collaborators can't push commits to that branch.

  • If you work in a branch that has previously been pushed to the remote, but your collaborators have agreed not to work in that branch, again there's nothing to pull.

  • Lastly, if both you and your collaborators are pushing commits to the same branch in the remote, you must decide when you want to pull their commits into your local branch. Each pull potentially introduces some degree of change (at best) or instability (at worst).

The status Command

There's no doubt that Git is complicated. As the above diagram shows, there are four places in each branch where your files can exist, and the files can differ in each of those places. You may have ...

  • Modified some working files.
  • Staged other changes to the index.
  • Committed other changes to the local repo.
  • Pushed still other commits to the remote repo.

How can you keep track of all this?

The git status command shows you the status of your changes in each of these places. Let's take the case where you have just cloned a remote repo, but you have not made any changes. Your working files, index, local repo, and the remote repo are identical.

In this case, command git status will output this:

$ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

This means you have no modified working files, no staged changes in the index, and no un-pushed commits in your local repo.

If you create a new file but do not stage it with git add -- FILE, it is an untracked file. The status command shows untracked files, as follows:

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        newfile.cpp

nothing added to commit but untracked files present (use "git add" to track)

If you don't want to see untracked files, use git status -uno. This can be cumbersome if you commonly have untracked files, such as compiler-generated binaries. A better solution is to use a .gitignore file to tell Git to ignore files with certain names. See section Ignored Files for how to configure a .gitignore file.

If you were to modify a working file, such as somefile.c, then git status will show your modified working files:

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   somefile.c

no changes added to commit (use "git add" and/or "git commit -a")

If there were multiple modified working files, all of their names would be displayed.

Next, if you were to stage the modified file somefile.c, the output of git status would be:

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   somefile.c

This tells you that you have staged changes in somefile.c that are yet to be committed to your local repo. It also helpfully tells you how to unstage the change.

Next, if you were to commit your staged changes, git status would show this:

$ git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

This tells you that you have one commit in your local repo ready to be pushed to the remote.

Some or all of the above outputs can appear at the same time. For instance, if you have modified working files, uncommitted changes in the index, and un-pushed commits, you would see all three sections: Changes not staged for commit:, Changes to be committed:, and Your branch is ahead of 'origin/master', which is exactly what you want to know.
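
As a sketch of how these states combine, the following script builds a throwaway repo and leaves it with a staged change, an unstaged change, and an untracked file at the same time. The file names and the scratch directory from mktemp are illustrative assumptions. It uses git status --short, a compact output format not otherwise covered in this section, in which the first column describes the index and the second column describes the working files.

```shell
# Create a scratch repo with one commit containing two files.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo one > staged.c
echo two > modified.c
git add -- staged.c modified.c
git commit -q -m "Initial commit"

# Put each file into a different state.
echo change >> staged.c
git add -- staged.c          # Staged: the index differs from the last commit.
echo change >> modified.c    # Not staged: the working file differs from the index.
echo new > untracked.c       # Untracked: Git has never been told about this file.

# Show all three states at once in the compact format.
status_output=$(git status --short)
printf '%s\n' "$status_output"
```

In the compact output, an M in the first column marks a staged change, an M in the second column marks an unstaged change, and ?? marks an untracked file.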

Important

The git status command does not perform any network operations, so it does not ask the remote repo if there are new commits that you have not yet pulled. The advantage here is that git status works when you are disconnected from the network.

If you want to see if your local repo is missing new commits that exist in the remote, you must do a git fetch to fetch any new commits from the remote — but not merge them into your local repo. If there are new commits in the remote, you'll see this:

$ git fetch
$ git status
On branch master
Your branch is behind 'origin/master' by 5 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)

nothing to commit, working tree clean

One annoying behavior is that when your current working directory is not the root of the repo, you may see some pathnames starting with .., ../.., etc., because git status shows pathnames relative to your current working directory. If configuration variable status.relativePaths is set to false, then all paths will be relative to the root of the repository (see section Git Configuration Variables).
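
To see the difference, here is a minimal sketch in a scratch repo (the names top.txt and sub are hypothetical). It runs git status from a sub-directory twice: once with the default setting, and once with status.relativePaths overridden for a single command using the -c switch.

```shell
# Create a scratch repo with an untracked file in the root and an empty sub-directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo notes > top.txt
mkdir sub
cd sub

# Default: paths are shown relative to the current working directory.
relative=$(git status --short)

# Override for one command: paths are shown relative to the root of the repo.
from_root=$(git -c status.relativePaths=false status --short)

printf 'default: %s\nfalse:   %s\n' "$relative" "$from_root"

# To make the setting permanent for this repo, you could run:
# git config status.relativePaths false
```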

Deleting Untracked Files

Sometimes you want to delete some or all of your untracked files. This is common in software development when you want to delete all compiled binaries to re-build them from scratch.

You delete untracked files with command git clean. This deletes untracked files in the current working directory and in sub-directories containing tracked files (so-called tracked directories).

Beware: if you create a new (untracked) file that you plan to stage but have not yet staged, git clean will delete it. You should stage new files immediately upon creation if you plan to keep them.

However, git clean does not delete any files in untracked directories, which are directories containing only untracked files. To recursively delete untracked directories and the files in them use the -d switch: git clean -d.

Often, it's helpful to see what would be deleted without actually deleting anything. Do this with the -n switch:

$ git clean -n
Would remove temp.txt
Would remove bin/myapp

You can interactively choose what to delete with the -i switch, as follows:

$ git clean -i
Would remove the following items:
  newfeature/temp.txt  other.txt
*** Commands ***
    1: clean                2: filter by pattern    3: select by numbers    4: ask each
    5: quit                 6: help
What now>

Switches -n, -d, and -i can be used together in any combination.

By default, Git configuration variable clean.requireForce is true, which means git clean won't delete anything unless you give at least one of the -n, -i, or -f (force) switches. See section Git Configuration Variables for details about viewing and changing configuration variables.

For safety, git clean will not delete ignored files (see the next section for details about ignored files). To force Git to delete ignored files use git clean -x.

Lastly, if you want to delete everything that is not a tracked file, even ignored files, use git clean -x -d -f.
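
The following sketch walks through these switches in a scratch repo, so nothing of value can be lost. The file and directory names are hypothetical, and the -q switch is used only to keep git clean quiet.

```shell
# Create a scratch repo that tracks main.c and ignores object files.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo '*.o' > .gitignore
echo 'int main(void) { return 0; }' > main.c
git add -- .gitignore main.c
git commit -q -m "Initial commit"

# Create an untracked file, an ignored file, and an untracked directory.
touch temp.txt main.o
mkdir scratch
touch scratch/notes.txt

dry_run=$(git clean -n)      # Dry run: reports temp.txt but deletes nothing.
echo "$dry_run"

git clean -f -q              # Deletes temp.txt only.
survivors=$(ls)              # main.c, main.o, and scratch are still present.

git clean -d -f -q           # Also deletes the untracked directory scratch.
git clean -x -f -q           # Also deletes the ignored file main.o.
ls -A                        # Only .git, .gitignore, and main.c remain.
```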

Ignored Files

Above, you learned that Git usually ignores untracked files, but it knows they exist, and there are times when Git does something unwanted with an untracked file. For instance:

  • Command git clean deletes all untracked files, except those in untracked directories.

  • Command git status shows all untracked files. Although you can use git status -uno to hide untracked files, it's cumbersome to type -uno every time. Perhaps you want to see most untracked files but hide a few that you expect to exist.

  • Command git add * stages multiple untracked files at once due to the wildcard *, and git add subdir stages all untracked files in directory subdir.

If there are certain untracked files you don't want to see with git status, you don't want to delete with git clean, or you don't want to stage with git add ..., you can tell Git to completely ignore certain untracked files.

You do this by creating a text file named .gitignore that specifies a set of patterns (also called ignore patterns). These patterns match the names of files and directories that various Git commands will completely ignore. For example, the pattern *.so tells Git to completely ignore files and directories with names ending with .so: git clean will never delete them, git status will never show them, and git add ... will never stage them.

The .gitignore file is typically located in the root of your repo, but we'll see that you can have .gitignore files in multiple directories within a repo.

By default, ignore patterns are matched against both files and directories, so this document will refer to them collectively as paths.

Precedence of Patterns

Normally, all patterns in a .gitignore file are cumulative. They all take effect together. But some patterns undo the effect of other patterns, such as the pattern !mylib.so, which tells Git not to ignore paths named mylib.so. When two patterns contradict each other, the pattern that occurs later in a .gitignore file takes precedence over the earlier one.

Many Git commands allow you to specify these patterns on the command line. When ignore patterns appear on the command line, they are also cumulative with the patterns in .gitignore files, but if there is a contradiction, the command line patterns take precedence over all .gitignore files.

Multiple .gitignore Files

You can create a .gitignore file in multiple directories within your repo. Git chooses the paths to ignore by reading the .gitignore file in a given directory plus all the .gitignore files above that directory in the hierarchy, all the way to the root of the repo.

  • If there is a contradiction between patterns at different directory levels, the patterns in a lower-level directory take precedence over the patterns in higher-level directories.

  • Thus, you should put more general patterns in .gitignore files in higher-level directories, and put more specific patterns in .gitignore files in sub-directories.
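
Here is a minimal sketch of that precedence rule, using a scratch repo with hypothetical file names. It relies on git check-ignore, a command not otherwise covered here, which exits with status 0 if a path is ignored and status 1 if it is not. The paths do not need to exist for the check to work.

```shell
# Create a scratch repo with a general rule at the root and an exception in sub/.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir sub
echo '*.log' > .gitignore          # General rule: ignore all .log files.
echo '!keep.log' > sub/.gitignore  # Exception that applies only under sub/.

# Ask Git whether it would ignore each path.
for path in build.log sub/build.log sub/keep.log keep.log; do
    if git check-ignore -q -- "$path"; then
        echo "ignored:     $path"
    else
        echo "not ignored: $path"
    fi
done
```

Only sub/keep.log escapes the *.log rule, because the exception lives in sub/.gitignore and so does not apply to keep.log in the root.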

As well, you can set Git configuration variable core.excludesFile to the pathname of a global .gitignore file that applies to all of your repos (see section Git Configuration Variables). Typically, this configuration variable specifies a .gitignore file in your home directory.

It's common to put patterns like *.bak and *~ in your global .gitignore file. These patterns match the backup files created by editors, so that git clean will not delete them in any of your repos.

Ignore patterns that need to be shared with your collaborators should be in a .gitignore file that's committed to the repo, so everyone gets a copy in their local repo.

Important

If you create a .gitignore file in your local repo, but you do not commit it to the repo for some reason, be sure to add the pattern .gitignore to the file, otherwise the next git clean command will delete your untracked .gitignore file.

Ignore Pattern Syntax

The pattern syntax is flexible but somewhat complex, so it's worth learning the rules below before you rely on it.

First, ignore patterns can contain directory separators, but they always use forward slashes (/), even on Windows. This allows ignore patterns to be shared across operating systems without needing to convert between slashes and backslashes.

Importantly, the presence of a directory separator (/) in a pattern changes the matching behavior of the pattern, as follows:

  • A pattern without any slashes, such as *.tmp, test*.lib, and todo.txt, is matched in the directory containing the .gitignore file and recursively in every sub-directory below it. If this pattern appears on the command line, the recursive matching starts in your current working directory.

  • A pattern with a / at the beginning or in the middle (or both), such as /subdir, subdir/todo.txt, and subdir/*.tmp, is only matched relative to the directory that contains the .gitignore file. This disables recursive pattern matching.

  • A pattern with a trailing /, such as output/, subdir/*tmp/, and build*/, matches only directories. When a directory is ignored, everything under it is also ignored. Whether or not this pattern is matched recursively depends on the presence/absence of slashes at the beginning or in the middle of the pattern (as described above).

  • The absence of a trailing / makes the pattern match both files and directories. There is no way to make a single pattern match only files.

Also, you can mix the various pattern matching syntaxes in a single pattern. For example, the pattern subdir/*.tmp matches any path whose name ends with .tmp in directory subdir in the same directory as the .gitignore file.

If that pattern were to end with a slash (subdir/*.tmp/), it would match only directories whose names end with .tmp in directory subdir in the same directory as the .gitignore file.

Note

A good way to test that a pattern is correct is to create test files or directories with names that the pattern should ignore, then execute command git clean -n -d to show what untracked files and directories would be deleted (without actually deleting them). If it shows that any of your test files or directories would be deleted, your pattern is not correct.

Example .gitignore File

Here is an example of a .gitignore file with comments describing some common patterns and how Git interprets each one. Please read the comments in this example, as they explain some important rules.

# Lines starting with '#' are comments.  Git ignores these lines as well as blank
# lines.  Each set of patterns below has a comment above it explaining how it works.

# These patterns ignore all paths (either files or directories) with names ending
# in .so and .dll in the same directory as this .gitignore file and recursively in
# any sub-directory under it:

*.so
*.dll

# Use a leading '!' to tell Git NOT to ignore a pattern.  These patterns tell Git
# _not_ to ignore paths with names matching test*.so and test*.dll recursively
# under the directory containing this .gitignore file, even though Git is ignoring
# paths ending with .so and .dll due to the above patterns:

!test*.so
!test*.dll

# This pattern ignores todo.txt only in the directory containing this .gitignore
# file (no recursive matching happens):

/todo.txt

# This pattern ignores all paths under any directory named build in the directory
# containing this .gitignore file and all sub-directories under it:

build/

# NOTE: Once a directory has been ignored, it is not possible to use a leading '!'
# to stop ignoring a path under that directory.  Thus, the '!...' syntax never
# takes precedence over ignoring a directory, regardless of the ordering of the
# patterns.

# This pattern ignores paths matching *.txt in directory doc in the same directory
# as this .gitignore file (no recursive matching happens):

doc/*.txt

# Use '/**/' to match an arbitrary number of sub-directories (including zero) in a
# pathname.  This pattern ignores all paths ending with .pdf under the doc/
# directory in the same directory as this `.gitignore` file:

doc/**/*.pdf

# Use a leading '**/' to match a name at any directory depth relative to this
# .gitignore file.  This pattern ignores paths matching *.tmp in any directory
# named doc in any sub-directory under the directory containing this `.gitignore`
# file:

**/doc/*.tmp

# This pattern ignores directories named output in any sub-directory under the
# directory containing this .gitignore file:

**/output/

# NOTE: The pattern `**/*.tmp` is the same as `*.tmp`, because they both match
# `*.tmp` in any sub-directory under the directory containing this `.gitignore`
# file.  But `**/subdir/*.tmp` is not the same as `subdir/*.tmp`, so this pattern
# syntax can be tricky to use correctly.

The full pattern syntax is documented in git help gitignore. See https://github.com/github/gitignore for examples of ignore patterns that are useful with a wide variety of programming languages.
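
If you want to check your understanding of these rules, one option is git check-ignore, which reports whether Git would ignore a given path. The sketch below writes a few of the patterns from the example above into a scratch repo and tests them. The helper function report and all of the path names are hypothetical.

```shell
# Create a scratch repo with a handful of ignore patterns.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p sub/build doc/sub
printf '%s\n' '*.so' '!test*.so' '/todo.txt' 'build/' 'doc/*.txt' > .gitignore

# Print whether Git would ignore a path.  git check-ignore exits with
# status 0 if the path is ignored and status 1 if it is not.
report() {
    if git check-ignore -q -- "$1"; then
        echo "ignored:     $1"
    else
        echo "not ignored: $1"
    fi
}

report lib.so            # Ignored by *.so.
report sub/lib.so        # Ignored: *.so has no slash, so it matches recursively.
report sub/testlib.so    # Not ignored: !test*.so overrides *.so.
report todo.txt          # Ignored by /todo.txt.
report sub/todo.txt      # Not ignored: the leading slash disables recursion.
report sub/build/x.o     # Ignored: build/ matches a directory at any depth.
report doc/a.txt         # Ignored by doc/*.txt.
report doc/sub/a.txt     # Not ignored: * does not match across a slash.

# The -v switch shows which file, line, and pattern made the decision.
git check-ignore -v sub/lib.so
```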

Branches

A repo contains one or more branches. A branch is a named, time-ordered sequence of commits. A commit contains a compressed copy of a branch's files at one point in time.

Every commit in your repo, except the first, contains a pointer to one or more parent commits, forming the historical graph of all commits in all branches. A commit has two (or more) parents when it's the result of a merge of two (or more) branches. Merges are described in detail in section Merges (below).

A branch's name is a reference to the latest commit in the branch. Think of a reference as a label that points to a commit. References are described in detail in section References and Tags.

The below diagram shows a repo with two branches. Branch master references commit C5. It's the current branch and contains six commits (C0 through C5). Branch bugfix references commit C7, and it diverged from the master branch at the common ancestor commit C2:

C0 <- C1 <- C2 <- C3 <- C4 <- C5 <- master (current branch)
            ^
            |
            +-- C6 <- C7 <- bugfix

Note

For readability, I've used the names C0 through C7 to represent commits that are actually identified by 40-digit hexadecimal hash values. This is not a standard Git naming convention.

In the above diagram, arrows point to each commit's parent, forming the chronological history of commits, where time flows from left to right. C0 is the only commit with no parent, because it is the first commit in the repo.

How to Organize Branches

Branches isolate a series of commits from each other. There are many ways to organize branches. Choose the one that works for you. Here are some options:

  • The master branch contains the latest stable version of the code. Unstable versions that are under development (or that have been abandoned) reside in other branches that are merged into the master branch when ready.

  • The master branch contains the latest unstable version of the code. New branches are created from the master branch for the purpose of testing, fixing bugs, and producing a stable release. The bug fixes are merged back into the master branch (unless there's a compelling reason not to do so).

  • Bug fix branches are commonly used with both of the above branching models. A bug fix is done in a topic branch, a short-lived branch that is later merged into a long-lived branch.

Types of Branches

There are several types of branches. This section describes each kind of branch.

  • A local branch is a branch in your local repo. You do your day-to-day work in a local branch.

    • Use command git branch to list the local branches in your repo, one per line. The current branch is indicated with an asterisk (*).

    • If your repo has a lot of branches, you can give a wildcard pattern to limit the output to branch names matching the pattern. For example, command git branch --list 'bugfix*' outputs all branch names starting with bugfix. In this case, switch --list must be given.

  • A remote branch is a branch in a remote repo.

    • A remote branch exists for sharing commits with others, as described above in sections Pushing Commits and Fetching/Pulling Commits.

    • You don't do day-to-day work in remote repos, so remote repos typically have no working files.

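    • To list the branches in a remote repo, use git ls-remote --heads REMOTE, which contacts the remote. Command git branch -r instead lists the remote-tracking branches in your local repo (described below), which reflect the state of the remote as of your last fetch.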

  • A tracking branch is a local branch that has an associated remote branch. The remote branch for a tracking branch is called the upstream branch or simply the upstream.

    • When you push and pull, you synchronize a tracking branch with its upstream branch. Pushes send commits from a tracking branch to its upstream branch. Pulls do the opposite.

    • You can view, clear, and set the upstream branch of a local branch at any time.

      • Use git branch -vv to see the name of the upstream branch for each local branch. This also shows the SHA-1 hash and commit message for the most recent commit in the branch.

      • Use git branch --unset-upstream to remove the upstream branch for a local branch. There is rarely a good reason to do this.

      • Use git branch --set-upstream-to=REMOTE/BRANCHNAME to set the upstream branch to BRANCHNAME in the remote repo named REMOTE. This is not needed if you have cloned the remote repo, because cloning sets all the upstream branches for you.

      • Don't go crazy. If you have a local branch containing the .NET source, and you set its upstream to be a remote branch containing the Linux source, and then you do a pull, Git will happily merge Linux into .NET, creating a mess.

  • A topic branch is a short-lived local branch used to isolate work. It needs no upstream branch, because eventually it will be merged into another local branch.

  • A remote-tracking branch is a read-only branch in your local repo that exactly matches the state of a remote branch. This may seem like unnecessary duplication of data, but remote tracking branches exist for a reason. The next section discusses remote-tracking branches in detail.
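
The listing commands above can be tried safely with the following sketch. It assumes a scratch setup in which a bare repo on the local disk stands in for the remote; the directory names, branch names, and README file are hypothetical. It also uses git branch -r, which lists remote-tracking branches.

```shell
# Create a bare repo to act as the remote, then clone it.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"
git symbolic-ref HEAD refs/heads/master   # Use the branch name from this document.
git config user.name "Example User"
git config user.email "user@example.com"
echo hello > README
git add -- README
git commit -q -m "Initial commit"
git push -q -u origin master              # master becomes a tracking branch.
git branch bugfix1                        # Plain local branches with no upstream.
git branch bugfix2

git branch                      # List local branches; * marks the current one.
git branch --list 'bugfix*'     # List only branches whose names match the pattern.
git branch -vv                  # Show each branch's latest commit and upstream.
git branch -r                   # List remote-tracking branches.

upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')
echo "Upstream of master: $upstream"
```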

Remote-tracking Branches

It's tempting to think that a pull merges an upstream branch directly into the corresponding local branch, but the reality is more subtle, even if the effect is the same.

  • Recall that a pull is a fetch followed by a merge.

  • Fetches are networked, because they communicate with a remote Git server. Merges are always local and non-networked.

You can do git fetch, disconnect from the network, then do git merge, and it works! So where does the fetch put the fetched commits so they are available to be merged later when there is no network?

  • A fetch makes a read-only copy in your local repo of each remote branch. This local read-only copy of a remote branch is a remote-tracking branch.

  • A remote-tracking branch exists in your local repo. It is a local cache of the state of a remote branch at the time of the last fetch.

  • So the answer to the above question is: Fetched commits are put into remote-tracking branches, where they wait to be merged into your local branches.

When you enter command git merge with no branch name (presumably after doing git fetch), Git merges into the current branch from the remote-tracking branch for the current local branch.

A remote-tracking branch has a name of the form REMOTE/BRANCH, such as origin/master or doc/update1.

  • Your local tracking branch names do not need to match the names of their upstream branch names, but it keeps things simple when you have lots of branches.

  • It's easy to see the remote-tracking branch name origin/bugfix42 and think "That's a branch named bugfix42 in the remote repo named origin", but not so! It's a remote-tracking branch in your local repo named origin/bugfix42 — and it may not be up-to-date with the remote branch.

The purpose of a remote-tracking branch is to give you local read-only access to the state of a remote branch, so you can examine it and merge commits from it. All you can do with it is diff against it, merge from it, and update it by fetching the latest state from the remote.
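
The sketch below makes the fetch-then-merge sequence visible. It assumes a scratch setup with a bare repo on the local disk acting as the remote and two clones belonging to hypothetical users Alice and Bob. After Bob fetches, his remote-tracking branch origin/master has moved but his local branch master has not; the merge then catches master up using only local data.

```shell
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# Alice creates the first commit and pushes it to the remote.
git clone -q "$work/remote.git" "$work/alice" 2>/dev/null
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo v1 > file.txt
git add -- file.txt
git commit -q -m "C0"
git push -q -u origin master

# Bob clones the remote, then Alice pushes a second commit.
git clone -q -b master "$work/remote.git" "$work/bob"
echo v2 >> file.txt
git commit -q -a -m "C1"
git push -q

# Bob fetches.  Only his remote-tracking branch origin/master moves.
cd "$work/bob"
git fetch -q
before=$(git rev-parse master)
tracking=$(git rev-parse origin/master)
echo "After fetch: master=$before origin/master=$tracking"

# With no branch name, git merge merges from the remote-tracking branch.
# It reads only local data, so it would also work with no network.
git merge -q
after=$(git rev-parse master)
echo "After merge: master=$after"
```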

Creating Branches

You create a new branch with command git branch BRANCHNAME. This creates branch BRANCHNAME so that it references the latest commit in the current branch, but it does not check out the new branch, so the current branch does not change.

Alternatively, use command git switch -c BRANCHNAME (or the older command git checkout -b BRANCHNAME) to create a new branch and switch to it immediately. This is equivalent to executing the following two commands:

$ git branch BRANCHNAME   # Create the new branch.
$ git switch BRANCHNAME   # Switch to the new branch.

Git refuses to create a new branch with a name that matches an existing branch. All branch names must be unique within a repo. You can override Git's refusal with command git branch --force BRANCHNAME, but this moves the existing branch name so that it references the latest commit in the current branch and no longer references the commits it referenced before, which is unlikely to be what you intended to do.

In the below example, if the current branch is master, and the most recent commit was C5, then command git branch bugfix creates branch bugfix and causes bugfix and master both to reference commit C5:

                              +- bugfix
                              V
C0 <- C1 <- C2 <- C3 <- C4 <- C5 <- master (current branch)
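
You can confirm this behavior with a small sketch in a scratch repo. The file name and the branch name hotfix are hypothetical. After git branch bugfix, both branch names resolve to the same commit hash and master is still the current branch.

```shell
# Create a scratch repo with a single commit on master.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo v1 > file.txt
git add -- file.txt
git commit -q -m "C5"

git branch bugfix                          # Create bugfix; master stays current.
git rev-parse master bugfix                # Prints the same commit hash twice.
current=$(git symbolic-ref --short HEAD)   # Name of the current branch.
echo "Current branch: $current"

git switch -q -c hotfix                    # Create hotfix and switch to it in one step.
git symbolic-ref --short HEAD
```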

Restrictions on Branch Names

There are restrictions on the spelling of branch names. These same rules apply to tag names. A branch or tag name ...

  • Must not contain a space, a tilde (~), a caret (^), a colon (:), a question mark (?), an asterisk (*), an open bracket ([), or a backslash (\).
  • Must not contain the sequence .. or @{.
  • Must not start with a dash (-) or a dot (.).
  • Must not begin or end with a slash (/), and must not end with a dot (.) or the sequence .lock.
  • Must not contain multiple consecutive slashes (//, ///, etc.).
  • Must not be the single character @.
  • Must not contain an ASCII control character (a byte with a value below 32) or the delete character (value 127).

Also, branch names are case sensitive in Git but case-insensitive in some filesystems, so it's best to avoid branch names that differ only in the case of their characters.

You can use command git check-ref-format --branch BRANCHNAME to check if BRANCHNAME is a valid name for a branch. If it is a valid branch name, this command outputs the specified branch name, otherwise it outputs an error message. In this example, bugfix and release/v2.1 are valid branch names, but release//v2.1 and doc:notes are invalid:

$ git check-ref-format --branch bugfix
bugfix

$ git check-ref-format --branch release/v2.1
release/v2.1

$ git check-ref-format --branch release//v2.1
fatal: 'release//v2.1' is not a valid branch name

$ git check-ref-format --branch doc:notes
fatal: 'doc:notes' is not a valid branch name

Note

Branch names can contain Unicode characters, but you should verify that all your tools that interact with Git support Unicode branch names. It is especially important that your terminal supports Unicode characters. Absent such support for Unicode, it's best to use only single-byte ANSI (ISO Latin-1) characters in branch names, tags, commit messages, and other strings given to Git.

Pushing a New Branch

If you create a new branch in your local repo, Git does not automatically create an upstream branch for it in the remote repo. You cannot simply do git push to push the new branch to the remote. You must switch to the new branch and use the following command:

$ git push -u REMOTE BRANCHNAME

where REMOTE is the short name of the remote and BRANCHNAME is the name of the new local branch. The -u switch tells Git to set the new remote branch as the upstream of the local branch, so you push to it and pull from it in the future.
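
As a sketch of the whole sequence, the script below uses a bare repo on the local disk as a stand-in for the remote. The directory names and the branch name newfeature are hypothetical. It pushes a new branch with -u and then confirms that the branch has an upstream and exists in the remote.

```shell
# Create a bare repo to act as the remote, clone it, and push master.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo hello > README
git add -- README
git commit -q -m "Initial commit"
git push -q -u origin master

# Create a new local branch.  It has no upstream yet.
git switch -q -c newfeature
git branch -vv

# Push the new branch and set its upstream in one step.
git push -q -u origin newfeature
upstream=$(git rev-parse --abbrev-ref 'newfeature@{upstream}')
echo "Upstream of newfeature: $upstream"

# The remote now has both master and newfeature.
git ls-remote --heads origin
```

After this, a plain git push or git pull in branch newfeature uses origin/newfeature without further arguments.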

Slashes in Branch Names

A branch name can contain one or more forward slash characters (e.g., release/v2.1 or release/v2.0/tests), but there is no special significance to the slash character in a branch name. This allows branch names to have a hierarchical structure, which makes it easier to keep track of them.

However, no slash-separated component of a branch name can start with a dot (.), so the names doc/.update1 and .doc/update1 are both invalid.

Switching Between Branches

Let's look at an example of what happens when you switch branches. We'll start with the case shown above, where you have just created branch bugfix.

If you then do git switch bugfix (or the older command git checkout bugfix) and make a new commit (C6), the new commit is part of bugfix instead of master:

                              +- C6 <- bugfix (current branch)
                              V
C0 <- C1 <- C2 <- C3 <- C4 <- C5 <- master

If you then do git switch master and make a new commit (C7), the new commit goes in the master branch:

                              +- C6 <- bugfix
                              V
C0 <- C1 <- C2 <- C3 <- C4 <- C5 <- C7 <- master (current branch)

Above, branch bugfix and branch master diverge at commit C5, the common ancestor. If your repo has more than one branch, every branch shares a common ancestor with some other branch.
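
The following sketch reproduces this scenario in a scratch repo with hypothetical file names. It then asks Git for the common ancestor with git merge-base, a command not otherwise covered in this section, and prints the commit graph.

```shell
# Create a scratch repo whose first commit plays the role of C5.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo base > file.txt
git add -- file.txt
git commit -q -m "C5"
fork_point=$(git rev-parse HEAD)

git branch bugfix
git switch -q bugfix
echo fix > fix.txt
git add -- fix.txt
git commit -q -m "C6"            # This commit goes in bugfix only.

git switch -q master             # fix.txt disappears from the working files.
echo more > more.txt
git add -- more.txt
git commit -q -m "C7"            # This commit goes in master only.

# Find the commit where the two branches diverged.
ancestor=$(git merge-base master bugfix)
echo "Common ancestor: $ancestor"
git log --oneline --graph --all
```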

Deleting Branches

Branches are not usually deleted, because doing so will destroy commits! But sometimes you want to delete a branch, because it's no longer needed. For instance, you may have created branch bugfix to fix a bug, but later you determined there is no bug. Or you may have created a branch to develop a new product feature, but the planned feature was cancelled.

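You delete a branch with command git branch -d BRANCHNAME. For safety, -d refuses to delete a branch containing commits that have not been merged into its upstream branch or into the current branch; use -D instead of -d to force the deletion. Either way, only the branch's name is deleted. Every commit continues to exist as long as it is referenced by a child commit, a branch name, or a tag (a very lightweight label that points to a commit).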

Git garbage collects commits that have no references. In the below diagram, if you were to delete branch newfeature, the only reference to commit C52 would be removed, causing commit C52 to be garbage collected. But then commit C51 is unreferenced, so it would also be garbage collected. Remember, deleting a branch destroys commits!

              +- C51 <- C52 <- newfeature
              V
... <- C47 <- C48 <- C49 <- C50 <- master

Note

By default, Git does not garbage collect an unreferenced commit that's less than two weeks old. This gives you time to change your mind and recover unreferenced commits.

You may have used other revision control systems where the rule was "Nothing is ever really deleted". In those systems, deleting changes or branches simply makes them invisible, but they still exist in the central database. Git is not like that. Git lets you permanently delete data!
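
The mechanics of deletion can be seen in the sketch below, which uses a scratch repo with hypothetical file and branch names. It shows a safety check (git branch -d refuses to delete a branch with unmerged commits, and -D forces the deletion), and it shows that the commit still exists after its branch name is gone, until garbage collection removes it.

```shell
# Create a scratch repo with one commit on master.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo base > file.txt
git add -- file.txt
git commit -q -m "C48"

# Add a commit that exists only in branch newfeature.
git switch -q -c newfeature
echo idea > idea.txt
git add -- idea.txt
git commit -q -m "C51"
orphan=$(git rev-parse HEAD)
git switch -q master

# -d refuses, because C51 has not been merged into master.
if git branch -d newfeature 2>/dev/null; then
    safe_delete="allowed"
else
    safe_delete="refused"
fi
echo "git branch -d was $safe_delete"

git branch -D newfeature          # -D forces the deletion of the branch name.

# The commit object still exists until Git garbage collects it.
git cat-file -t "$orphan"         # Prints "commit".

# If you know the hash, you can attach a new branch name to the commit.
git branch recovered "$orphan"
```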

When you delete a branch in your local repo, you should also delete the branch in the remote with git push REMOTE --delete BRANCHNAME, where REMOTE is the name of the remote repo.

As well, Git does not automatically delete the branch in your collaborators' local repos when they next do git fetch or git pull. Your collaborators must prune the deleted remote-tracking branch and then delete the local branch with this pair of commands:

$ git fetch --prune         # Prune the remote-tracking branch.
$ git branch -d BRANCHNAME  # Delete the local branch.

Thus, deleting a branch makes work for your collaborators. Plus, it's work they don't know they need to do unless you tell them you deleted a branch.

Given the previous point and given that deleting a branch (eventually) deletes the commits it contains, many organizations forbid deleting branches.

Renaming a Branch

You can rename a branch with git branch --move OLDNAME NEWNAME, where OLDNAME is the name of an existing branch, and NEWNAME is the new name for the branch. After this command, NEWNAME references the same commit that OLDNAME previously referenced.

When you rename a branch, Git does not automatically rename the branch in the remote repo. You must do that manually by executing these commands with the renamed branch as the current branch:

$ git push REMOTE --delete OLDNAME   # Delete branch OLDNAME in the remote repo.
$ git push -u REMOTE NEWNAME         # Push branch NEWNAME to the remote.

The above commands delete branch OLDNAME in the specified remote repo and push branch NEWNAME to the remote. Switch -u tells Git to set remote branch NEWNAME as the upstream branch for the current local branch, which then becomes a tracking branch.

If you omit the -u switch, the remote branch is still renamed from OLDNAME to NEWNAME, but it does not become the upstream for the current local branch, so you cannot push to it and fetch/pull from it. If you make this mistake, the fix is to do: git branch --set-upstream-to=REMOTE/NEWNAME.
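
Here is a sketch of the full rename sequence, run against a bare repo on the local disk that stands in for the remote. The directory names and the branch names oldname and newname are hypothetical.

```shell
# Create a bare repo to act as the remote, clone it, and push two branches.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo hello > README
git add -- README
git commit -q -m "Initial commit"
git push -q -u origin master
git switch -q -c oldname
git push -q -u origin oldname

# Rename the branch locally, then mirror the rename in the remote.
git branch --move oldname newname      # The renamed branch is still current.
git push -q origin --delete oldname    # Delete branch oldname in the remote.
git push -q -u origin newname          # Push newname and make it the upstream.

remote_heads=$(git ls-remote --heads origin)
upstream=$(git rev-parse --abbrev-ref 'newname@{upstream}')
echo "$remote_heads"
echo "Upstream of newname: $upstream"
```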

After you do this, your collaborators also have work to do to obtain your changes. They must do the following:

$ git fetch --prune                       # Update the set of remote-tracking branches.
$ git branch -d OLDNAME                   # Delete renamed local branch OLDNAME.
$ git checkout -b NEWNAME REMOTE/NEWNAME  # Create new local branch NEWNAME with upstream REMOTE/NEWNAME.

As with deleting a branch, renaming a branch makes work for your collaborators, and it's work they don't know they have to do unless you tell them. For this reason, many organizations forbid renaming branches.

Merges

A branch can be merged into any other branch. This combines all the changes in the merged branches into a single branch. There are two general situations where merging happens:

  1. A remote-tracking branch is merged into its corresponding local branch. This happens when you do git fetch followed by git merge (or equivalently git pull). See section Remote-tracking Branches for details about remote-tracking branches.

  2. One or more of your local branches are merged into another local branch. This is common when you are done working on a branch and need to merge it into a longer-lived branch, such as when you fix a bug in a topic branch.

In the first case above, git merge takes no arguments, which tells Git to merge into the current branch from the corresponding remote-tracking branch. This is only possible when the current branch has a remote-tracking branch, which is only the case when the current branch has an upstream branch in the remote repo.

In the second case above, git merge ... takes one or more arguments, which are the names of the branches to merge into the current branch.

How a Merge Works

When you merge one branch into another, the result is a merge commit in the target branch. The merge commit contains the changes from both branches. Here, branch bugfix has been merged into branch master, creating merge commit C8:

                  +---- C5 <- C6 <--+ <- bugfix
                  V                 |
C0 <- C1 <- C2 <- C3 <- C4 <- C7 <- C8 <- master

A merge commit always has multiple parent commits, one from each of the branches that were merged. Above, the parents of merge commit C8 are commits C6 and C7.
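
To inspect the parents of a merge commit yourself, you could use a sketch like the following. It builds two diverged branches in a scratch repo (the file names are hypothetical), merges bugfix into master, and prints the parent hashes of the merge commit using the %P placeholder of git log.

```shell
# Create a scratch repo with a commit shared by both branches.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo base > file.txt
git add -- file.txt
git commit -q -m "C3"

# Add one commit to bugfix.
git switch -q -c bugfix
echo fix > fix.txt
git add -- fix.txt
git commit -q -m "C6"
bugfix_tip=$(git rev-parse HEAD)

# Add one commit to master, so the branches have diverged.
git switch -q master
echo more > more.txt
git add -- more.txt
git commit -q -m "C7"
master_tip=$(git rev-parse HEAD)

# Merge bugfix into master, creating a merge commit.
git merge -q -m "C8: merge bugfix into master" bugfix

# %P prints the hashes of all parents of the commit.
parents=$(git log -1 --format=%P)
echo "Parents of the merge commit: $parents"
```

The first parent is the previous tip of the branch you merged into (master), and the second is the tip of the branch you merged from (bugfix).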

In the vast majority of cases, only two branches are merged, but Git allows more than two branches to be merged. This is called an octopus merge. See section Octopus Merges for details.

Important

Don't be confused by the directions of the arrows above. The arrows point to the parent commit(s) of a given commit, but the bugfix branch was merged into the master branch in merge commit C8.

A merge combines the changes from all merged branches into a single merge commit.

  • If the branches being merged contain changes to the same file, the version of that file in the merge commit contains the changes from all of the branches.

  • If the changes from the merged branches don't work together correctly, it's your responsibility to figure that out.

  • In this example, a good way to mitigate this problem is to first merge the master branch into the bugfix branch, test the result (and fix any problems in the bugfix branch), then merge the bugfix branch into the master branch.

After a merge, all branches can continue to accumulate commits. Later, they can be merged again. Here, a second merge has created merge commit C13:

                  +---- C5 <- C6 <--+ <-- C10 <- C11 <-+ <- bugfix
                  V                 |                  |
C0 <- C1 <- C2 <- C3 <- C4 <- C7 <- C8 <- C9 <- C12 <- C13 <- C14 <- master

The merge Command

To merge another branch into the current branch, use command git merge BRANCHNAME, where BRANCHNAME is the name of the other branch. For example, the merges shown in the diagram above were done with git merge bugfix (assuming the current branch was master).

Git refuses to do a merge if you have changes staged in the index. You must either commit, un-stage, or stash those changes before doing a merge.

Git will allow a merge if the working files have changes, but this is not recommended, because you may lose some of those changes if merge conflicts happen. Merge conflicts are described in the next section.

If the merge is successful, the git merge ... command is all you need to do. Otherwise, the merge will create merge conflicts that you must resolve. This is described below in section Merge Conflicts.
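As a minimal sketch of a successful merge, the following script builds a throwaway repo, lets master and bugfix diverge, and merges bugfix into master. The temporary directory, file names, commit messages, and identity are assumptions made for illustration. The --no-edit switch simply accepts the default merge message so that no editor opens.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch "master" on any Git version
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base commit"

# Diverge: one commit on bugfix, one commit on master.
git checkout -q -b bugfix
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix on bugfix"

git checkout -q master
echo work > work.txt
git add work.txt
git commit -q -m "Work on master"

# Merge bugfix into the current branch (master).
git merge -q --no-edit bugfix

# Print the merge commit's hash followed by the hashes of its parents.
parents=$(git rev-list --parents -n 1 HEAD)
parent_count=$(( $(echo "$parents" | wc -w) - 1 ))
echo "merge commit has $parent_count parents"
```

The merge commit should list two parents: the previous tip of master and the tip of bugfix.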

Fast-forward Merges

If you are merging another branch into the current branch, and the current branch has no new commits since the other branch diverged from it, Git does a fast-forward merge, because there is no actual merging work to do. A fast-forward merge can also happen due to git pull.

For example, given this branch structure, where branch master is the current branch:

                  +- C4 <- C5 <- bugfix
                  V 
C0 <- C1 <- C2 <- C3 <- master (current branch)

Then command git merge bugfix causes a fast-forward merge resulting in this commit graph:

                              +-- bugfix
                              V
C0 <- C1 <- C2 <- C3 <- C4 <- C5 <- master

It's called a fast-forward merge because the branch reference master is moved forward along the graph until it references the same commit as bugfix. No actual merging work needs to happen, and no merge commit is created.
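The fast-forward case can be reproduced with a small sketch like the one below. The repo location, file names, and identity are assumptions for illustration, and it assumes Git's default merge settings.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "C3"

# Two commits on bugfix; master gets no new commits.
git checkout -q -b bugfix
echo b > b.txt
git add b.txt
git commit -q -m "C4"
echo c > c.txt
git add c.txt
git commit -q -m "C5"

git checkout -q master
git merge -q bugfix

# After a fast-forward, both branch references name the same commit.
master_tip=$(git rev-parse master)
bugfix_tip=$(git rev-parse bugfix)
commit_count=$(git rev-list --count master)
echo "master=$master_tip bugfix=$bugfix_tip commits=$commit_count"
```

The commit count stays at three because the fast-forward only moves the master reference and adds no merge commit.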

Merge Conflicts

A merge may cause merge conflicts in your working files, which are places where the same lines were changed in different branches, so Git cannot know which lines to use in the merge. If there are no merge conflicts, there is nothing more to do after running git merge ...; otherwise, Git will tell you that you must resolve the conflicts (as described in section Resolving Merge Conflicts below).

Merge conflicts are marked with special marker lines around the conflicting lines (<<<<<<<, =======, and >>>>>>>), like this:

 These lines were either unchanged from the common ancestor,
 or cleanly resolved because only one branch changed.

 <<<<<<< yours:somefile.c
 These lines were changed one way
 in one branch.
 =======
 These lines were changed differently
 in the other branch.
 >>>>>>> theirs:somefile.c

 This is another line that was either cleanly resolved or unmodified.

The conflicting changes appear between <<<<<<< and >>>>>>>.

The lines before the ======= are from the current branch, and the lines after it are from the other branch.

You can change the conflict marker style to show a 3-way diff by setting the configuration variable merge.conflictStyle to diff3 (see git help merge). In that case, a conflict appears like this (note the new marker |||||||):

 These lines were either unchanged from the common ancestor
 or cleanly merged because only one branch changed.

 <<<<<<< yours:somefile.c
 These lines were changed one way
 in one branch.
 |||||||
 These lines were the original lines
 from the common ancestor of the two branches.
 =======
 These lines were changed differently
 in the other branch.
 >>>>>>> theirs:somefile.c

 This is another line that was either cleanly merged or unmodified.

The lines between ||||||| and ======= are the original lines from the common ancestor of the branches being merged.

Resolving Merge Conflicts

To resolve merge conflicts, you must manually edit the conflicted files to remove the conflict markers and choose one of the conflicting sets of lines, or manually create a mixture of the two.

After fixing merge conflicts (including removing the conflict marker lines), you must resolve the conflicted file(s) by staging the file(s) with git add -- FILE1 FILE2 .... If you wish, you can stage the resolved files one at a time. The important thing is that all files with conflicts get resolved.

After resolving the conflicts, you complete the merge with command git merge --continue, which commits the staged files in a single merge commit. This is a use case where you must accumulate changes (the resolved merge conflicts) in the index before making a single commit to finish the merge.

After encountering merge conflicts, if you choose not to continue with the merge, use git merge --abort to abort the merge, which leaves the index and working files in their pre-merge state.

Caution

Git does not check that you have removed all the merge conflict markers when you do git merge --continue. Failing to remove the conflict markers can cause build errors in source code and will likely confuse your collaborators, who will see the conflict markers.
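The whole conflict workflow can be sketched end to end as follows. The file name, its contents, and the identity are illustrative assumptions. GIT_EDITOR=true is used only so that git merge --continue accepts the default merge message without opening an editor in a script.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line one\nshared line\nline three\n' > somefile.txt
git add somefile.txt
git commit -q -m "Base commit"

# Change the same line differently in each branch.
git checkout -q -b bugfix
printf 'line one\nchanged on bugfix\nline three\n' > somefile.txt
git commit -q -am "Change on bugfix"

git checkout -q master
printf 'line one\nchanged on master\nline three\n' > somefile.txt
git commit -q -am "Change on master"

# The merge stops on the conflict and exits with a non-zero status.
if git merge bugfix >/dev/null 2>&1; then
    merge_status=clean
else
    merge_status=conflict
fi

# Count the conflict marker lines Git wrote into the working file.
markers=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' somefile.txt)

# Resolve by hand (here: overwrite with the desired content), stage, continue.
printf 'line one\nchanged on master and bugfix\nline three\n' > somefile.txt
git add -- somefile.txt
GIT_EDITOR=true git merge --continue >/dev/null

# Git would not have complained about leftover markers, so check for them yourself.
remaining=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' somefile.txt || true)
parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "status=$merge_status markers=$markers remaining=$remaining parents=$parent_count"
```

A quick grep for marker lines before continuing, as in the last step, is one simple guard against committing unresolved conflicts.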

Octopus Merges

An octopus merge is a merge of more than two branches, which is not a common use case. To merge more than two branches, simply specify all of the branches to merge into the current branch, like this:

$ git merge branch1 branch2 branch3

The above example creates a single merge commit with four parents (the current branch plus the three specified branches). You can specify as many branches as you want.

In practice, octopus merges are rare, because ...

  • They can be complex to resolve if there are conflicts.
  • They can be harder to understand when viewing history.
  • Most merge scenarios can be handled with a series of regular two-parent merges.
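Here is a small sketch of an octopus merge in a throwaway repo. The branch and file names are illustrative assumptions. Each branch touches a different file so that no conflicts arise, and master gets a commit of its own so that it also counts as a parent.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base commit"

# Three branches, each adding its own file on top of the base commit.
for name in branch1 branch2 branch3; do
    git checkout -q -b "$name" master
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Add $name.txt"
done

# Give master its own commit so it has diverged from all three branches.
git checkout -q master
echo work > work.txt
git add work.txt
git commit -q -m "Work on master"

git merge -q --no-edit branch1 branch2 branch3

parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "octopus merge commit has $parent_count parents"
```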

References and Tags

References allow you to use human-readable names instead of SHA-1 hashes to refer to commits. A reference is a symbolic name that points to a commit.

  • A branch name is simply a reference that points to the most recent commit in that branch.

  • A tag is a reference to an arbitrary commit to mark historic milestones, such as a particular release version of a product. See section Using Tags for how to create and view tags.

The command git show-ref shows the references in your local repo:

$ git show-ref
34897afde27cc38baf349ec59d16848935181ee6 refs/heads/master
fee6f1a6ae348935181ee6bfec59d1682bf6a22a refs/remotes/origin/HEAD
fee6f1a6ae348935181ee6bfec59d1682bf6a22a refs/remotes/origin/master
b4f7a8752c312e934bd122b2a3b5d1a81af93764 refs/tags/qa-release-1

Above, the 40-digit hex values are SHA-1 hashes identifying individual commits. The corresponding human-readable references are on the right.

Reference names resemble relative pathnames, and they have a hierarchical structure.

  • Local branches are under refs/heads.

  • Tags are under refs/tags.

  • References to commits in a remote repo are under refs/remotes. Here, origin is the short name of a remote.

The HEAD Reference

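There is a special reference named HEAD. The HEAD reference normally points to the name of the current branch, so it's actually a reference to a reference. The exception is the detached HEAD state, where HEAD points directly to a commit (see section Detached HEAD).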

When you make a commit, the new commit is always added to the branch referenced by HEAD (the current branch).

In the below example, commit C3 is added to branch master, which is the current branch. Reference master changes to point to C3, but reference HEAD continues to point to master, so it now (indirectly) references commit C3 instead of C2:

Before: C1 <- C2 <- master <- HEAD

After:  C1 <- C2 <- C3 <- master <- HEAD

Here is how HEAD changes when you switch from branch master to branch bugfix using command git checkout bugfix:

Before: ... <- C10 <- C11 <- master <- HEAD
        ... <- C26 <- C27 <- bugfix

After:  ... <- C10 <- C11 <- master
        ... <- C26 <- C27 <- bugfix <- HEAD

Oddly, command git show-ref (described in the previous section) does not show the HEAD reference of your local repo by default. It only shows refs/remotes/origin/HEAD, which is the remote-tracking copy of the remote repo's HEAD. To see your local HEAD reference, use command git symbolic-ref HEAD, which prints the reference that HEAD points to (for example, refs/heads/master).

To see which branch is current, use git status. In this output, the current branch is master:

$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean

Note

The notation origin/master connects the short name of a remote (origin) to the name of a branch (master) with a / between them. This is called a remote-tracking branch. Remote-tracking branches are discussed in section Remote-tracking Branches.
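To watch HEAD move between branches, a sketch along these lines can help. The repo and branch names are illustrative assumptions. It reads HEAD with git symbolic-ref before and after switching branches.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch bugfix

head_before=$(git symbolic-ref HEAD)          # reference HEAD points to on master
git checkout -q bugfix
head_after=$(git symbolic-ref HEAD)           # reference HEAD points to on bugfix
short_name=$(git symbolic-ref --short HEAD)   # just the branch name

echo "before: $head_before"
echo "after:  $head_after ($short_name)"
```

The branch references themselves do not change here; only HEAD is re-pointed from refs/heads/master to refs/heads/bugfix.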

Detached HEAD

  • UNDER CONSTRUCTION

Using Tags

A tag is a human-readable name that references a commit. You create tags to mark important points in the history of a branch, such as a product release, a bug fix, or some significant milestone.

Tag names must follow the same rules as branch names (see section Restrictions on Branch Names).

There are two kinds of tags: lightweight tags and annotated tags. Annotated tags contain more information than lightweight tags, as described below.

Creating Lightweight Tags

A lightweight tag is simply a reference to a commit. You can create a lightweight tag that references the latest commit in the current branch with command git tag TAGNAME, where TAGNAME is the name of the tag. For example, git tag release3.0 creates a lightweight tag release3.0.

To make a lightweight tag reference a commit other than the latest commit in the current branch, use git tag TAGNAME COMMIT, where COMMIT is any valid reference to a commit (see section Revision Specification Syntax).

There's nothing else to a lightweight tag. It's just a name and the commit it references. Annotated tags have more to them.

Creating Annotated Tags

An annotated tag is also a reference to a commit, but it has the following additional attributes that a lightweight tag does not have:

  • A creation date and time.
  • The name and email address of the person who created the tag.
  • A message describing the tag.
  • An optional GPG (GNU Privacy Guard) digital signature by the creator.

To create an annotated tag (without a digital signature) that references the latest commit in the current branch, use command git tag -a TAGNAME. To create an annotated tag that references a commit other than the latest one in the current branch, use git tag -a TAGNAME COMMIT.

An annotated tag requires a message. If you don't provide one on the command line with switch -m 'MESSAGE', Git launches an editor for you to enter a message, just as when you make a new commit. In fact, if you give switch -m you can omit switch -a, because the presence of a message implies you are creating an annotated tag.
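One way to see the difference between the two kinds of tags is to ask Git what type of object each tag name refers to. In this sketch the tag names and messages are illustrative assumptions. git cat-file -t prints an object's type, and the suffix ^{commit} follows a tag through to the commit it references.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "Initial commit"

git tag release3.0                          # lightweight tag
git tag -a -m "Release 3.1" release3.1      # annotated tag

light_type=$(git cat-file -t release3.0)    # a lightweight tag names the commit itself
annot_type=$(git cat-file -t release3.1)    # an annotated tag names a separate tag object

head_commit=$(git rev-parse HEAD)
light_commit=$(git rev-parse 'release3.0^{commit}')
annot_commit=$(git rev-parse 'release3.1^{commit}')

echo "release3.0 -> $light_type, release3.1 -> $annot_type"
```

Both tags lead to the same commit; the annotated tag just gets there through an extra object that stores the tagger, date, and message.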

To create a tag that is digitally signed, you need to have a GPG key pair. For instructions on how to set up your GPG key pair, see chapter Git Tools - Signing Your Work in the Pro Git book. The short version is that you can create a key pair and configure Git to use it with these commands, which require that you have GPG installed (from https://gnupg.org):

$ gpg --gen-key
...
$ git config --global user.signingkey KEYID

where KEYID is the key ID of your GPG key pair. You can see your GPG keys (and their IDs) with command gpg --list-keys.

Once you have a GPG key pair, you can create digitally signed tags using git tag -s TAGNAME. If you have more than one key pair, you can select which one to use with git tag -u KEYID TAGNAME, where KEYID identifies the key pair to use. The presence of the -u switch implies the -s switch, so you don't have to specify both.

You can verify the digital signature on a tag created by someone else using git tag -v TAGNAME.

Pushing Tags to the Remote Repo

Tags are not pushed to the remote repo when you push commits with git push. You must explicitly push a tag to the remote repo with command git push REMOTE TAGNAME, where REMOTE is the name of the remote repo and TAGNAME is the name of the tag. This allows you to have personal tags in your local repo that are not visible to your collaborators.

Alternatively, you can push all of your tags at once with command git push --tags REMOTE, where REMOTE is the short name of the remote. This pushes the tags together with any commits they reference that the remote does not yet have, but it does not update any branches on the remote. However, this is not recommended, because it pushes your personal tags as well. It's better to push only the tags you want to share with your collaborators.

Replacing Tags

Git won't allow you to create a tag if there's already one with the same name. You can force Git to create a tag with the same name as an existing tag with the -f switch, but this will cause the existing tag to be deleted and replaced with the new tag.

A big problem with replacing or deleting a tag is that your collaborators still see the old tag. When you replace (or delete) a tag, Git does not change the tag in your collaborators' local repos. The next section describes how your collaborators can update their tags when you replace or delete a tag.

Note

Making your collaborators do extra work because you replaced or deleted a tag is not polite. Many organizations forbid doing so. If you created a tag incorrectly, and you have already pushed the tag to the remote, it's best to just create a new tag. Only when the tag has not yet been pushed to the remote is it acceptable to replace or delete a tag.

Deleting Tags

Lastly, you can delete a tag with git tag -d TAGNAME, where TAGNAME is the name of the tag to delete.

  • As with replacing an existing tag (above), this does not delete the tag from the remote or from your collaborators' local repos.

  • To delete the tag from the remote, do git push REMOTE --delete TAGNAME, where REMOTE is the name of the remote repo.

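Your collaborators will continue to see the tag until they prune their tags during a fetch with command git fetch --prune --prune-tags (switch --prune-tags only takes effect when pruning is enabled). This removes any local tags that do not exist in the remote repo, including personal tags that were never pushed, so it should be used with care. Of course, your collaborators won't know to do this unless you tell them.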
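The push and delete steps can be tried safely against a local bare repo standing in for the remote. In this sketch the directory layout and tag names are illustrative assumptions. git ls-remote --tags lists the tags that exist on the remote.

```shell
# A bare repo plays the role of the remote; all names are made up for this sketch.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q origin master

# Two tags: one to share, one to keep private.
git tag qa-release-1
git tag personal-note
git push -q origin qa-release-1
remote_tags=$(git ls-remote --tags origin)

# Delete the shared tag on the remote, then locally.
git push -q origin --delete qa-release-1
git tag -d qa-release-1 >/dev/null
remote_tags_after=$(git ls-remote --tags origin)
local_tags=$(git tag -l)

echo "remote tags after push:   $remote_tags"
echo "remote tags after delete: $remote_tags_after"
echo "local tags:               $local_tags"
```

The tag personal-note never reaches the remote because it was never pushed explicitly.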

Displaying Tags

You can see all tags in your local repo with command git tag or git tag -l. Switch -l is implied when no arguments are given to the tag command. This shows the names of all tags, one per line, in alphabetical order. This does not show which commits the tags reference.

To list all tags along with the commits they reference, use switch --format like this:

$ git tag -l --format='%(refname:short) %(objectname)'

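For a lightweight tag, %(objectname) is the hash of the commit the tag references. For an annotated tag, it is the hash of the tag object itself; use %(*objectname) to show the hash of the commit that an annotated tag references. See the output of git help tag for details about the format specification syntax. This syntax is hard to remember, so you may want to create a Git alias to run this command (see section Git Aliases for how to create Git aliases).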

You can limit which tags are displayed using switch -l PATTERN, where PATTERN is a wildcard pattern. For example, git tag -l 'release2*' shows all tag names starting with release2. Another example is git tag -l 'qa-*-2025', which shows all tag names starting with qa- and ending with -2025. You can specify more than one * in the pattern.

Note

The patterns are the same as file name wildcard patterns in UNIX/Linux shells, so you can also use ? to match any single character and [a-z] to match a range of characters. Quotes are needed around the patterns, because you want your shell to ignore the wildcard pattern characters so that Git will see them.
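The pattern matching described above can be checked with a sketch like this one, in which all tag names are illustrative assumptions:

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "Initial commit"

for tag in release2.0 release2.1 release3.0 qa-jan-2025 qa-feb-2024; do
    git tag "$tag"
done

# Quote the patterns so the shell passes them to Git unchanged.
release2_tags=$(git tag -l 'release2*')
qa_2025_tags=$(git tag -l 'qa-*-2025')
single_char=$(git tag -l 'release?.0')

echo "$release2_tags"
echo "$qa_2025_tags"
echo "$single_char"
```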

To see the details of a specific tag, use git show -s TAGNAME, where switch -s suppresses displaying the differences for the changes in the commit referenced by TAGNAME. It displays output similar to the following:

$ git show -s bugfix-42
commit ca82a6dff817ec66f44342007202690a93763949
Author: Joe Smith <[email protected]>
Date:   Mon Mar 17 11:52:03 2025 -0500

    Fixed bug #42.

The above example shows the details of the commit referenced by a lightweight tag. The output is similar for an annotated tag, but it also shows the tagger's name and email address, the date/time the tag was created, and the tag message.

Reflogs

  • UNDER CONSTRUCTION

Viewing Commit History and Differences

This section describes how to view the commit history of your repo and how to view differences between commits.

Viewing the Commit History

Use command git log to show the history of commits in the current branch, plus those in any branches that were merged into the current branch. Its output looks like this, showing each commit's hash, author, date/time, and message:

$ git log
commit c61e1dca02487974438af55cb61ee5ab01a0e8d1 (HEAD -> master, origin/master, origin/HEAD)
Author: Joe Smith <[email protected]>
Date:   Mon Mar 17 19:45:06 2025 -0400

    Fixed bug #213 by eliminating a race condition.

commit c9b5966c29b5654d3f7b38ea4d5fbe0e1079d279
Author: Sue Jones <[email protected]>
Date:   Sat Feb 15 13:09:22 2025 -0400

    Added 'Help' button to main screen.

commit e22d5fea417934e983d005a5aa81307efe34a4cb
Author: Joe Smith <[email protected]>
Date:   Thu Jan 9 10:02:13 2025 -0400

    Initial commit.

Above, the current branch's commit history contains three commits by two different authors. Note that the last commit shown is the first one in the branch. It's also the first one in the repo, since this is the master branch. The commits are shown in reverse chronological order (newest to oldest), but this ordering can be reversed using git log --reverse.

You can use git show COMMIT to see the details of a specific commit along with a diff of the changes in that commit, where COMMIT is the SHA-1 hash of the commit (or its first few digits) or any valid commit reference syntax (see section Revision Specification Syntax). Use git show -s COMMIT to suppress the diff part of the output, which can be large.

There are various ways to limit the output of git log so that you don't see the entire commit history. The next sections describe how to do this.

Limiting History Output by Commit References

The git log command takes optional arguments specifying one or more commits. When these arguments are present, git log ... shows only the specified commits and all commits reachable by following the parent links from the specified commits. Thus, it shows all ancestors of the specified commits.

You can specify commits in a variety of ways. For details, see section Revision Specification Syntax and the output of command git help revisions. The most common ways to specify a commit are:

  • Using a branch name. For example: git log bugfix42. Omitting all arguments is equivalent to git log HEAD, which shows the commit history of the current branch.

  • Using a tag name. For example: git log release2.3.

  • Using the SHA-1 hash of the commit. For example: git log 81c4b. The more hex digits you specify, the more likely you are to identify the correct commit.

Keep in mind that git log ... only shows the specified commits and their ancestors, so you will not see commits made after the specified commits.

Limiting History Output by Number of Commits

You can use git log -n NUMBER to limit the output to the specified number of commits. The -n switch can be mixed with commit specifiers. For instance, git log -n 3 release2.1 shows the last 3 commits in the branch named release2.1.

Use git log --skip=NUMBER to skip the first NUMBER commits and show the rest. You can mix this with -n. For instance, to see the last ten commits in the current branch, skipping the first two, use git log -n 10 --skip=2.
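The effect of -n and --skip is easy to check in a throwaway repo with a handful of commits. In this sketch the commit messages are illustrative assumptions, and --format=%s is used only to print each commit's subject line instead of the full log entry.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "Commit $i"
done

last_two=$(git log -n 2 --format=%s)               # the two newest commits
skip_one=$(git log -n 2 --skip=1 --format=%s)      # skip the newest, then show two

echo "$last_two"
echo "$skip_one"
```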

Limiting History Output by Date and Time

You can limit the output based on a date and time range using switches --since=... and --until=..., as follows:

  • Use git log --since='3 days ago' to see commits in the last three days. The quotes are needed because 3 days ago contains whitespace.

  • Use git log --until='last monday at 11:15 am' to see commits before last Monday at 11:15 AM.

  • You can mix --since and --until like this: git log --since='1 month ago' --until='2 days ago at noon'.

  • To be very specific, you can use --since and --until with a date in the format YYYY-MM-DD or a date and time in the format YYYY-MM-DD HH:MM:SS.

  • You can restrict the history to just commits by one or more authors with --author=REGEX, where REGEX is a regular expression matching an author's name or email address. For example:

    • git log --author='Bill Smith' shows only commits by Bill Smith.

    • git log --author=Smith shows all commits by anyone with Smith in their name or email address.

    • git log --author='@microsoft\.com' shows only commits from people with a Microsoft email address. The . is escaped with \ because the string after the = is a regular expression, and an un-escaped . will match any character, not just a dot.

    • git log --author='@.*\.org' shows only commits from people with email addresses ending in .org.

    • Use switch -E to enable extended regular expressions, which allows git log -E --author='(Bill|Sue) Smith' to see only commits by Bill or Sue Smith.

See the output of git help log for the full set of switches to limit history output.
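The --author patterns can be tried out with a sketch like the one below. The author names, the example.com and example.org addresses, and the helper function commit_as are all hypothetical. --format=%an prints only each commit's author name.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: make one commit attributed to the given author.
commit_as() {
    echo "$1" >> file.txt
    git add file.txt
    git commit -q --author="$1 <$2>" -m "Change by $1"
}

commit_as "Bill Smith" "bill@example.com"
commit_as "Sue Smith" "sue@example.org"
commit_as "Ann Jones" "ann@example.org"

smiths=$(git log --author=Smith --format=%an)
org_authors=$(git log --author='@.*\.org' --format=%an)
bill_or_ann=$(git log -E --author='(Bill|Ann) ' --format=%an)

echo "$smiths"
echo "$org_authors"
echo "$bill_or_ann"
```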

Using ^ to Omit Commits

You can prefix one or more arguments with ^ to omit those commits (and all ancestors of those commits) from the displayed commit history.

First, Git builds the set of commits reachable from the arguments without the ^ prefix. Then it builds the set of commits reachable from the arguments with the ^ prefix. Then it subtracts the second set from the first and shows the result.

  • For example, command git log HEAD ^release1.2 shows all commits in the current branch (i.e., reachable from HEAD) except for those reachable from tag release1.2, which presumably is a tag on a commit in the current branch. This will show all commits from the one after release1.2 to HEAD. In this use case, you must specify HEAD, even though it is the default when no arguments are given.

  • As another example, given the commit history shown above, command git log HEAD ^e22d5f shows only the two newest commits in the current branch. The third commit and all commits reachable from it (had there been any) are removed from the output.

  • A shorthand for this syntax is git log COMMIT1..COMMIT2, which is equivalent to git log ^COMMIT1 COMMIT2. Think of this command as showing the commits after COMMIT1 up to and including COMMIT2: everything reachable from COMMIT2 except COMMIT1 and its ancestors.
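The set subtraction described above can be checked with a short sketch. The tag name and commit messages are illustrative assumptions. It compares the ^ form with the .. form, where release1.2..HEAD means the same as HEAD ^release1.2.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "Commit $i"
    if [ "$i" -eq 3 ]; then
        git tag release1.2     # tag the third commit
    fi
done

with_caret=$(git log --format=%s HEAD ^release1.2)
with_dots=$(git log --format=%s release1.2..HEAD)

echo "$with_caret"
```

Both commands should list only the two commits made after the tagged one.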

Displaying the Commit Graph

Use switch --graph to see the commit graph using a text-based graphical display to represent the branches. This helps visualize branches and merges. Here's an example of the output of git log --graph (the ellipses at the top and bottom indicate this is only a subset of the full commit graph):

...
|
* commit 65dcec567d59d78e7b8c94792f777568163a3612
| Author: Joe Smith <[email protected]>
| Date:   Fri Mar 7 18:17:49 2025 -0500
|
|     Fixed bug with saving backups.
|
*   commit fcf3989a099490dc667842639b4a396a8a01f218
|\  Merge: 846032b 77de8cc
| | Author: Joe Smith <[email protected]>
| | Date:   Fri Mar 7 16:57:04 2025 -0500
| |
| |     Merge branch 'bugfix' of github.com:/jsmith/project1
| |
| * commit 77de8cc6a54acef8cc87d615e5af4cf69d60c740
| | Author: Joe Smith <[email protected]>
| | Date:   Thu Mar 6 16:43:59 2025 -0500
| |
| |     Changed width of main window.
| |
* | commit 846032b1cb85e7e87a22d7ab237e9c441ba0bcbd
|/  Author: Joe Smith <[email protected]>
|   Date:   Thu Mar 6 13:49:06 2025 -0500
|
|       Improvements to save dialog.
|
* commit 9c74297f776518363a1662d5ec5c765dd9877e8b
| Author: Joe Smith <[email protected]>
| Date:   Tue Mar 4 09:33:01 2025 -0500
|
|     Fixed a typo in an error message.
...

The asterisks (*) specify commits in the graph and the vertical lines show the branches, merges, and common ancestors. In most terminals, the vertical lines and asterisks are colored differently to help identify the branches.

  • Above, the second commit is a merge commit. The third and fourth commits exist in the branches that were merged.

  • The merge commit contains the line Merge: 846032b 77de8cc, which gives the abbreviated hashes of its two parent commits (the tips of the two branches that were merged).

  • The common ancestor of the two branches is the bottommost commit (9c74297). Note that the branch on the right diverges from the branch on the left on the line below the asterisk for commit 846032b.

Viewing Differences

You can view differences (also called diffs) between versions of your files with command git diff. By default, git diff shows the differences between each working file and its staged version in the index.

  • To see the differences between your staged files and the latest committed versions in the local repo, use git diff --staged or git diff --cached, which are synonymous.

  • To see the differences between your working files and the latest committed versions in the local repo, use git diff HEAD.
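The three comparisons can be told apart with a sketch like this one, in which the file name and contents are illustrative assumptions. It stages one change, leaves a second change unstaged, and counts the added lines each form of git diff reports.

```shell
# Throwaway repo; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "Initial commit"

echo two >> file.txt
git add file.txt          # "two" is now staged in the index
echo three >> file.txt    # "three" exists only in the working file

# Count added lines (a leading + that is not part of the +++ header).
unstaged=$(git diff | grep -c '^+[^+]')             # working file vs. index
staged=$(git diff --staged | grep -c '^+[^+]')      # index vs. latest commit
total=$(git diff HEAD | grep -c '^+[^+]')           # working file vs. latest commit

echo "unstaged=$unstaged staged=$staged total=$total"
```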

The output of git diff can be very long, so the output is piped to a pager application, such as more or less on UNIX/Linux. When the pager displays the diff, press the space key to advance to the next screen of output, press b or backspace to go to the previous screen, and press q to quit the pager. Press ? to see interactive help for the pager.

Here's an example of the output of git diff showing the differences in a single file:

diff --git a/myapp.cpp b/myapp.cpp
index 5a24214..31d8b01 100644
--- a/myapp.cpp
+++ b/myapp.cpp
@@ -405,7 +405,8 @@
     // Terminate the application.

     // First, close all files.
-    closeFiles();
+    if (!closeFiles())
+        returnValue = 42;

     // Next, free allocated memory.
     freeAllMemory();

The first line above indicates that this is the output of Git's diff command for file myapp.cpp. The pseudo-directory names a/ and b/ indicate these are two versions of the same file: before and after the changes shown in the diff. These are not the names of actual directories in your repo.

Next, the line index 5a24214..31d8b01 100644 contains metadata about the file being compared:

  • 5a24214 is the abbreviated SHA-1 hash of the source version (before the changes).
  • 31d8b01 is the abbreviated SHA-1 hash of the target version (after the changes).
  • 100644 is the file mode, which represents the file permissions in octal format.

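Next, the lines starting with --- and +++ name the two versions being compared: --- a/myapp.cpp is the version before the changes and +++ b/myapp.cpp is the version after the changes. They also act as a legend for the marker characters used further down, where a leading - marks lines removed from the old version and a leading + marks lines added in the new version.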

Next, the line @@ -405,7 +405,8 @@ summarizes the range of lines displayed. In this case, the diff shows that the 7 lines starting at line 405 changed to the 8 lines starting at the same line. The second line number can be different from the first line number in the case where earlier changes added or removed lines.

Lastly, the diff shows the lines that were changed with a leading - indicating lines before the change and a leading + indicating lines after the change. For context, three unmodified lines are shown above and below each change. Use switch -U to change the number of context lines. For instance, git diff -U8 shows 8 lines of context instead of 3.

If other lines were changed in the file, additional blocks of diff output follow the one shown above, each starting with a header of the form @@ ... @@. If there are additional files in the diff output, each file's diffs start with the full header shown above.

Note

The terse header lines at the top of each diff block help Git and other programs apply patches programmatically, because patches are commonly represented as diffs. As a human reader, you mostly care about the file names, the line numbers, and the changed lines.

Limiting diff Output to Certain Files

If you want to see the differences for just some files or directories instead of all files in the repo, use git diff -- FILE-OR-DIR .... You can specify multiple files and directories.

Above, you saw that git diff HEAD shows the differences between your working files and the latest committed versions of those files in the current branch. Any branch reference can be specified in place of HEAD. For example, to see the differences between your working files and the latest versions of those files in branch bugfix42, use git diff bugfix42.

Lastly, you can see the difference between two arbitrary commits with git diff COMMIT1 COMMIT2, where COMMIT1 and COMMIT2 are commit hashes, branch names, tag names, or any of the revision specifiers described in section Revision Specification Syntax. For example, to see the differences between the latest committed files in two branches, use git diff BRANCH1 BRANCH2.

Note

Typically, you want to see differences with respect to the latest commit in a branch, but Git doesn't require that. If you specify a commit in the middle of a branch's history, that works too.

In all diff commands, you can append file or directory pathnames to restrict the diff output to just those files and directories.

See git help diff for additional switches and the full set of command syntaxes.

Advanced Concepts

This section describes advanced Git concepts and commands. Read section Basic Concepts first. This section is more concise than the above sections, as it is targeted at advanced users.

Revision Specification Syntax

Many Git commands accept a reference to a commit as an argument. For instance:

  • You create a tag for a specific commit using git tag TAGNAME COMMIT.
  • You see the differences between two commits with git diff COMMIT1 COMMIT2.
  • You see the change log for a range of commits with git log COMMIT1..COMMIT2.

Commits can be specified using a SHA-1 hash (or its first few digits), a tag name, a branch name (which refers to the latest commit in that branch), or the special reference HEAD (which refers both to the current branch and to the latest commit in it).

However, the full syntax for referencing commits is much more powerful. This section describes some common advanced syntaxes for referencing commits.

See the output of git help gitrevisions for the complete revision specification syntax. In the Git documentation, commits are often called revisions.

Note

In the below list, the notations REFNAME and REV mean the following:

REFNAME is any symbolic reference, such as a branch name, a tag name, or HEAD.
REV is a SHA-1 hash (or its first few digits), a branch name, a tag name, or HEAD.

Here are some of the most common advanced syntaxes for referencing commits:

  • @ is a shorthand for HEAD. For example, git log @ is the same as git log HEAD.

  • REV~ refers to the first parent of the commit referenced by REV. Most commits have just one parent, but merge commits have two (or more) parents. For example, git log -n 1 HEAD~ shows the commit that is the first parent of the latest commit in the current branch, and git diff b37015~..b37015 shows the difference between the first parent of commit b37015 and that commit.

  • REV~N refers to the Nth ancestor reached by following only the first parents starting with the commit referenced by REV. Thus, REV~1 is the same as REV~, and REV~2 refers to the first parent of the first parent of the commit referenced by REV.

  • REV^ refers to the first parent of the commit referenced by REV. This is the same as REV~, except when a number follows (see the next item).

  • REV^N refers to the Nth parent of the commit referenced by REV. If there is no Nth parent, Git reports an error. If the commit has just one parent, N can only be 1, but if it's a merge commit, N can be 1 or 2 — or even greater than 2 for an octopus merge commit (see section Octopus Merges).

  • REFNAME@{N} refers to the Nth prior value of REFNAME, where REFNAME@{0} is its current value. For example, git log master@{3} shows the commit history starting with the value that master had three updates ago. This requires that the specified REFNAME has a reflog that includes the needed historical information. See section Reflogs for details about reflogs.

    NOTE: The syntax REFNAME@{N} does not follow the parent pointers in the commit graph. It uses the reflog of REFNAME to access its Nth historical value.

  • REFNAME@{DATETIME} refers to the value of REFNAME at an earlier point in time. For example: master@{yesterday}, HEAD@{5 minutes ago}, master@{last week}, and bugfix@{2025-02-26 18:30:00}. This syntax is not typically used with a tag name, because tags usually don't change over time. It is most useful with branch names, because branch references change as new commits accumulate in the branch.

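  • When REFNAME is omitted to the left of @{...}, Git uses the reflog of the current branch. For example, if master is the current branch, @{N} is the same as master@{N}. This differs from a bare @, which is a shorthand for HEAD.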

These syntaxes can be composed with each other. For instance:

  • @^ refers to the first parent of the latest commit in the current branch. This is the same as HEAD^.

  • @^^ refers to the first parent of the first parent of the latest commit in the current branch. This is the same as HEAD^^.

  • REV~2 is the same as REV~1~1 and REV~~.

  • REV^2~3 refers to the third ancestor commit (following just first parents) before the second parent of the commit referenced by REV.

If you are not sure which commit is referenced by a given revision specification, use command git rev-parse REVISION to see the SHA-1 hash of the commit referenced by REVISION. For example:

$ git rev-parse HEAD~2
aa2a761b8a506eb74b5cc368514592144dca3f74

Then you can git show -s aa2a761 or git log -n 1 aa2a761 to see the details of that commit.
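
The equivalences described in this section can be checked in a scratch repo. The following is a minimal sketch, assuming a POSIX shell and a reasonably recent Git. The branch name topic and the commit messages C1, C2, C3, T1, and M are made up for illustration.

```shell
#!/bin/sh
# Sketch: build a throwaway repo containing one merge commit, then resolve
# several revision specifications to see which commit each one names.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q .
git symbolic-ref HEAD refs/heads/master     # Call the first branch "master".
git commit -q --allow-empty -m C1
git commit -q --allow-empty -m C2
git checkout -q -b topic                    # Hypothetical side branch.
git commit -q --allow-empty -m T1
git checkout -q master
git commit -q --allow-empty -m C3
git merge -q --no-ff -m M topic             # M has parents C3 (first) and T1 (second).

# Print the commit message that each revision specification resolves to.
for rev in @ HEAD~ HEAD^ HEAD~2 HEAD~~ HEAD^2 HEAD^2~1; do
    printf '%-9s %s\n' "$rev" "$(git log -1 --format=%s "$rev")"
done
```

With this layout, HEAD~ and HEAD^ both resolve to C3, HEAD~2 and HEAD~~ both resolve to C2, HEAD^2 resolves to T1, and HEAD^2~1 resolves to C2.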

Reverting Changes

Sometimes you modify working files, stage changes, or commit changes, but then you change your mind. This section describes how to undo changes.

There are two general ways to revert changes in Git: the old way and the new way.

  1. The old way uses commands git checkout ... and git reset ..., which have existed since the beginning.

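  2. The new way uses commands git revert ... and git restore .... The restore command was added in Git 2.23 as a simpler alternative to the older commands. The revert command is much older, but unlike git reset it undoes commits without destroying history.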

It's important to understand both ways to revert changes, so this document will describe both, starting with the new commands.

Warning

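Most Git commands that revert changes permanently delete the changes they revert. This is especially true of uncommitted changes in your working files, which Git cannot recover. Be careful.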

Reverting Changes the New Way

For the majority of use cases, changes can be reverted using commands revert and restore. Here's what these commands do:

  • Command git revert ... creates one or more new commits to undo the changes of one or more earlier commits.

  • Command git restore ... reverts changes to the working files and/or the index.

Let's look at each one.

The revert Command

The command git revert ... creates new commits that undo the changes in existing commits. This has two big advantages:

  1. It preserves the entire commit history, because it does not destroy commits.

  2. It's less likely to confuse your collaborators by destroying commits they have already pulled into their local repos.

To revert a single commit, use git revert COMMIT, where COMMIT is any valid reference to a commit (see section Revision Specification Syntax for details). Git will create a new commit that undoes the changes in the specified commit. Since a commit can contain changes to more than one file, this can revert changes to multiple files.

As with any commit, Git will spawn an editor to allow you to modify the commit message, which defaults to Revert "SUBJECT" followed by the hash of the reverted commit. There is no switch that lets you provide the commit message on the command line, but you can accept the default message without opening an editor by using git revert --no-edit COMMIT.

When you use git revert ..., your working files and index must be clean, meaning they must match the tip of the current branch, otherwise the command will fail. If you have changes in your working files or the index, you must either commit them or stash them before using the revert command.

If you are not reverting the most recent commit in the branch, there's a risk that the changes generated by Git will cause conflicts with more recent changes.

  • If this happens, the operation will stop before generating the new commit and issue a warning.

  • You must then fix the conflicts, use git add FILE1 FILE2 ... to mark the conflicts as resolved, and finally use git revert --continue to tell Git to finish generating the new commit.

  • If you decide the conflicts cannot or should not be fixed, use git revert --abort to abort the revert operation.

A useful switch is --no-commit. Command git revert --no-commit COMMIT changes your working files and index but does not make a new commit. You can review and modify the changes, then manually make the commit with git commit. For example, if your repo contains source code, this gives you a chance to test the changes before committing them.

You can revert multiple commits at once using git revert COMMIT1 COMMIT2 .... For each commit specified on the command line, a new commit will be created that reverts it.

  • For the best results, specify the commits in reverse chronological order, otherwise you face an increased risk of conflicts that you must resolve manually.

  • Git will not automatically revert multiple commits in the proper reverse chronological order. You must specify them in the correct order.

Note

There is no benefit to reverting multiple commits in a single command, except to save typing. If you expect conflicts when reverting multiple commits, it's simpler to revert each commit separately (but still in reverse chronological order).
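
To see that git revert adds a commit rather than removing one, here is a minimal sketch that runs in a throwaway repo. It assumes a POSIX shell. The file name notes.txt and the commit messages are hypothetical.

```shell
#!/bin/sh
# Sketch: revert the latest commit in a throwaway repo and inspect the result.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q .
echo "version 1" > notes.txt                # notes.txt is a hypothetical file.
git add notes.txt && git commit -q -m "Add notes"
echo "version 2" > notes.txt
git commit -q -a -m "Update notes"

git revert --no-edit HEAD > /dev/null       # Adds a third commit; deletes nothing.

commit_count=$(git rev-list --count HEAD)
latest_subject=$(git log -1 --format=%s)
echo "commits: $commit_count"
echo "latest:  $latest_subject"
echo "file:    $(cat notes.txt)"
```

The branch ends up with three commits, and notes.txt again contains its original text. Both earlier commits remain in the history.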

The restore Command

The restore command reverts changes to working files and/or the index.

To revert one or more working files, making them match the versions in the index, use git restore --worktree -- FILE1 FILE2 ....

  • Switch --worktree is the default, so that's the same as git restore -- FILE1 FILE2 ....

  • The restore command also allows you to restore deleted working files, which re-creates them to match the versions in the index.

To revert one or more files in the index, making them match the versions at the tip of the current branch, use git restore --staged -- FILE1 FILE2 .... This does not revert any working files.

To revert one or more files both in the working files and in the index, making them match the versions in the tip of the current branch, use git restore --worktree --staged -- FILE1 FILE2 ....

Regardless of whether you are reverting changes in the working files or the index, you can use switch --source COMMIT to make the reverted files match the versions in an arbitrary COMMIT.
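
The difference between --staged and the default --worktree can be observed with git status --porcelain, which prints a two-column code for each changed file (index state, then working-file state). The following is a sketch only, assuming a POSIX shell and Git 2.23 or later. The file name notes.txt is hypothetical.

```shell
#!/bin/sh
# Sketch: unstage a change with --staged, then discard it from the working file.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q .
echo "committed" > notes.txt
git add notes.txt && git commit -q -m "Add notes"

echo "edited" > notes.txt
git add notes.txt                           # The edit is now staged.

git restore --staged -- notes.txt           # The index matches the branch tip again,
after_unstage=$(git status --porcelain)     # but the working file is still edited.

git restore -- notes.txt                    # The working file matches the index again.
after_restore=$(git status --porcelain)

echo "after --staged:   '$after_unstage'"
echo "after --worktree: '$after_restore'"
```

After the first restore, the status code is " M" (modified only in the working files). After the second, the status output is empty because nothing differs from the last commit.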

Reverting Changes the Old Way

This section describes the old commands to revert changes. If you only plan to use the new commands, you don't need to read this section. If you do read it, you'll see why the new, simpler commands were developed.

Reverting Changes to Working Files

To revert changes to one or more modified working files, use git checkout -- FILE1 FILE2 ..., which makes the files match the versions in the index.

Note

Command git checkout -- FILE overwrites your modifications to FILE, but if you use git checkout BRANCHNAME to switch to a different branch, your modified working files are not overwritten. Yes, this is confusing.

Reverting Changes to the Index and Local Repo

To revert changes to the index or the local repo, use command git reset .... Before looking at the reset command, let's discuss the three types of reset it can perform: soft, mixed, and hard. These types of reset differ as follows:

  • A soft reset reverts changes only in the local repo. It destroys commits.

  • A mixed reset reverts changes in the local repo and the index, but not the working files. It destroys commits and reverts staged changes.

  • A hard reset reverts changes in the local repo, the index, and the working files. It destroys commits, reverts staged changes, and reverts changes to working files!

The default type of reset is a mixed reset. Let's look at each type of reset.

Soft Reset

Command git reset --soft COMMIT performs a soft reset. It takes an existing commit as an argument, and it changes the current branch to reference that commit. So it moves a branch reference to point to a different commit.

  • The specified commit is usually an ancestor of the most recent commit in the branch, so the effect is to revert all commits in the branch that follow the specified commit.

  • The specified commit can be identified by a SHA-1 hash (or its first few digits), a tag name that references the commit, or one of the commit reference syntaxes supported by Git (see section Revision Specification Syntax).

For example, suppose branch master is your current branch and contains these commits:

... <- C21 <- C22 <- C23 <- C24 <- C25 <- C26 <- master <- HEAD

Suppose the SHA-1 hash of C24 starts with f13da7d. The command git reset --soft f13da7d will change the commit graph to look like this:

                              +- C25 <- C26
                              V
... <- C21 <- C22 <- C23 <- C24 <- master <- HEAD

Commits C25 and C26 are now deleted from branch master, which is now a reference to C24. The deleted commits will continue to dangle off the commit graph until their reflog entries expire and Git garbage collects them, which by default takes at least a few weeks.

If you decide you want to un-delete commit C25 before it's garbage collected, use git reset --soft C25COMMIT, where C25COMMIT is a reference to commit C25, such as its hash or a tag. The commit graph will change to this:

                                   +- C26
                                   V
... <- C21 <- C22 <- C23 <- C24 <- C25 <- master <- HEAD

Thus, a soft reset can both revert commits and restore deleted commits (as long as they haven't been garbage collected).

After you delete commits with a soft reset, the changes in those commits are still in the index and your working files. Your index and working files are now out-of-date with respect to the local repo, because they contain changes that have since been reverted in the local repo.

Why do a soft reset? A common reason is that you committed some code that doesn't work correctly or just isn't ready to be shared with your collaborators yet. After you do a soft reset, you can fix the problems and commit those fixes.

Warning

You should not do a soft reset to delete commits that have already been pushed to a remote repo, because other collaborators may have already pulled those commits and started making changes in their local repos based on them.

Instead, use command git revert ... to create a new commit that undoes the changes. This still impacts your collaborators, so you should warn them about what you're doing, but it's not as bad as undoing changes they have already seen.

Mixed Reset

If you want to both revert commits from the local repo and unstage the changed files from the index, use git reset --mixed COMMIT, where COMMIT is any valid reference to a commit (see section Revision Specification Syntax). A mixed reset is the default, so you can omit the --mixed switch.

  • After a mixed reset, the changes to the files in the deleted commits exist only in your working files. The files in the index exactly match the commit at the tip of the branch.

  • A mixed reset is the default, because when most people revert a commit they typically also want to un-stage the associated files, leaving them with only the modified working files. This is like travelling back in time to a point before the changes were given to Git.

Hard Reset

A hard reset is the most dangerous form of reset. It reverts everything that a mixed reset reverts, plus it reverts changes to your working files!

Just to be clear, when you use git reset --hard COMMIT, it does the following:

  1. Changes the current branch in the local repo to reference COMMIT.

  2. Changes the index to match the files in COMMIT.

  3. Changes the working files to match the index and COMMIT, discarding any uncommitted changes to tracked files.
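
The three types of reset can be compared side by side in a throwaway repo. This is a minimal sketch, assuming a POSIX shell. The file name notes.txt and the commit messages C1 and C2 are hypothetical. It uses git status --porcelain, which prints a two-column code for each changed file (index state, then working-file state).

```shell
#!/bin/sh
# Sketch: undo the same commit with a soft, a mixed, and a hard reset, and
# record where the undone change ends up each time.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q .
echo "first" > notes.txt
git add notes.txt && git commit -q -m C1
echo "second version" > notes.txt
git commit -q -a -m C2
tip=$(git rev-parse HEAD)                   # Remember C2 so it can be restored.

git reset -q --soft HEAD~
soft=$(git status --porcelain)              # The change is still staged.
git reset -q --hard "$tip"                  # Put everything back to C2.

git reset -q --mixed HEAD~
mixed=$(git status --porcelain)             # The change is only in the working file.
git reset -q --hard "$tip"

git reset -q --hard HEAD~
hard=$(git status --porcelain)              # The change is gone everywhere.

printf 'soft:  [%s]\nmixed: [%s]\nhard:  [%s]\n' "$soft" "$mixed" "$hard"
```

The soft reset leaves the status code "M " (staged), the mixed reset leaves " M" (modified but not staged), and the hard reset leaves no status output at all.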

Resetting Files

Unlike the soft/mixed/hard resets described above, using git reset with one or more files doesn't move branch references, so it doesn't change the local repo at all. It only reverts files in the index. To also revert the working files, use git checkout with a commit and one or more files.

Here are some examples of reverting files:

  1. To unstage a file without changing the working file or the file in the local repo, use one of these commands:

    git reset COMMIT -- FILE
    git reset -- FILE

    This reverts the file in the index to match the specified COMMIT. If you omit COMMIT, it defaults to HEAD, the latest commit in the current branch, which unstages the file from the index (i.e., reverts the file in the index to match the tip of the current branch).

  2. To revert a file in both the index and the working files to match the version in a commit, use this command:

    git checkout COMMIT -- FILE

    Here, COMMIT can be HEAD, which means the tip of the current branch. If you omit COMMIT, only the working file is reverted, and it is made to match the index (see section Reverting Changes to Working Files). Beware that git reset --hard does not accept file names (Git reports "Cannot do hard reset with paths"), and that switch --staged belongs to git restore, not git reset.

    This command is very dangerous, because it destroys all changes to the specified file both in your working files and in the index.

In both of the above examples, you can specify multiple files in a single command, but it also works to revert them one at a time.

Git Configuration

This section describes how to view and change Git's configuration variables, which control a wide variety of Git's behaviors.

Configuration Files

Git stores configuration variables in plain text configuration files. There are three main configuration files:

  • The system configuration file is /etc/gitconfig on UNIX/Linux systems. On Windows and macOS, this may be in the Git install directory or somewhere else. This file is usually only writable by someone with administrative privileges, but depending on how you installed Git, you may be able to modify this file without elevated privileges. This file may not exist if no one has configured any system-wide configuration variables.

  • The global configuration file is either ~/.gitconfig or ~/.config/git/config. The global configuration file is located under your home directory, so it is writable by you. It contains configuration variables specific to you. This file may not exist if you have not configured any global configuration variables.

  • The local configuration file is in .git/config in each of your local repos. This contains configuration variables specific to each of your repos. This file is not shared with your collaborators, so it is not committed to the repo.

Of these three files, the local configuration file is the only one that always exists. It's created when you use git init or git clone ... to create your local repo. For instance, the local configuration file stores the URL and short name of the remote repo, among other information.

Variables in the global configuration file take precedence over the ones in the system configuration file. Variables in the local configuration file take precedence over both the global and system configuration files.

Configuration Variables

Configuration variables are stored in the above files as strings of the form name = value. Each name identifies a different configuration variable.

Related configuration variables are collected into named groups. As of Git version 2.38.1, there are 832 configuration variables in 94 groups. For instance, the group named core contains core Git configuration variables, and the group named user contains variables related to you, the user.

Each variable's name is qualified with a prefix that is the group name followed by a dot (.). For instance, the variable named user.email contains your email address, and user.name contains your full name. This information is used by Git when you create a new commit.

The entries for variables user.name and user.email look like this in a configuration file:

[user]
    name = Sue Smith
    email = [email protected]

Most configuration variables do not appear in any configuration file, which means they have their default values or no value at all, such as when user.email has not yet been set.

  • To see a full list of configuration variables along with a description of each, see the output of command git help config.

  • To see a list of the names of every available configuration variable (but not their values), use command git help --config. This is helpful if you can't remember a variable's name or group.

Chapter Customizing Git - Git Configuration in the Pro Git book describes some of the most common configuration variables.

The config Command

Git's configuration files are plain text, so you can view and edit them directly, but it's simpler and more common to use command git config ... to do this.

  • To view the configuration variables that have been set, use git config --list.

  • To view just the system, global, or local configuration variables, add one of the switches --system, --global, or --local to command git config --list.

  • To view the value of a single configuration variable, use git config VARIABLENAME. For instance, git config user.email shows your configured email address. If the variable has not been set in any configuration file, no output is shown and the command exits with a non-zero status, even if Git uses a built-in default for that variable.

To set a configuration variable, use command git config VARIABLENAME VALUE. If the VALUE contains whitespace or shell metacharacters, you should quote it.

By default, variables that you set are written to the local configuration file in the current repo. To write them to one of the other configuration files, use git config --global VARIABLENAME VALUE or git config --system VARIABLENAME VALUE.

For example, before using Git for the first time, you would typically execute these two commands, substituting your name and email address for the ones shown here:

$ git config --global user.name 'Sue Smith'
$ git config --global user.email [email protected]

Switch --global defines these variables in the global configuration file, which causes them to apply to all of your repos. If you were to define them in a local configuration file, they would only apply to that one repo.
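
The precedence of the local file over the global file can be observed directly. The following sketch assumes a POSIX shell. It points HOME at a temporary directory so that your real global configuration file is not modified, and the email addresses are placeholders.

```shell
#!/bin/sh
# Sketch: define the same variable globally and locally, then see which wins.
set -e
export HOME="$(mktemp -d)"                  # Throwaway home directory, so your real
unset XDG_CONFIG_HOME                       # global configuration is not touched.
cd "$(mktemp -d)" && git init -q .

git config --global user.email global@example.invalid
global_value=$(git config user.email)       # Only the global file defines it so far.

git config --local user.email local@example.invalid
effective_value=$(git config user.email)    # The local file now takes precedence.

echo "global only:      $global_value"
echo "local and global: $effective_value"
```

Inside this repo, Git now reports the local value. Any other repo under the same home directory would still see the global value.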

Any configuration variable can be defined in any configuration file, but clearly it only makes sense to define some variables in certain files.

See the output of git help config for the full command syntax and switches, as well as the documentation for every configuration variable.

Deleting a Configuration Variable

To delete a configuration variable use git config --unset VARIABLENAME. If the variable is in the global or system configuration file, you need to add switch --global or --system, just as when you defined the variable.
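
As an illustration, this sketch sets, reads, and then deletes a variable in the local configuration file of a throwaway repo. It assumes a POSIX shell. The variable name demo.color is made up and has no meaning to Git.

```shell
#!/bin/sh
# Sketch: set a variable, delete it with --unset, and read it back both times.
set -e
cd "$(mktemp -d)" && git init -q .

git config demo.color blue                  # Written to this repo's .git/config.
before=$(git config demo.color)

git config --unset demo.color
status=0
after=$(git config demo.color) || status=$? # Prints nothing once the variable is gone.

echo "before unset: '$before'"
echo "after unset:  '$after' (exit status $status)"
```

Reading a variable that is not set prints nothing and returns a non-zero exit status, which scripts can use to detect an unset variable.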

Git Aliases

Some Git commands have a complex syntax that can be hard to remember. Or perhaps you wish that some Git commands were shorter to save typing. Git aliases can help.

For example, if you prefer to enter git co instead of git commit, define this alias:

$ git config --global alias.co commit

Anything you type after git co is appended to the expansion of the alias. For example, the command git co -m "Fixed a bug" expands to git commit -m "Fixed a bug".

As you can see, an alias is just a Git configuration variable in the group named alias. The name of the configuration variable is the alias name. The value of the variable is the expansion of the alias.

Typically, you want access to your aliases in all of your repos, so you should use switch --global to define the alias in your global configuration file. But you can have per-repo aliases by omitting --global (or equivalently by specifying switch --local). To create a per-repo alias, you must enter the command when your current working directory is within that repo.

As another example, if you enter command git log -1 HEAD a lot to show the latest commit in the current branch, you can create this alias:

$ git config --global alias.last 'log -1 HEAD'

The quotes are needed because the alias expansion contains whitespace. This alias lets you type git last instead of git log -1 HEAD.

To see a list of your aliases, use command git config --get-regexp alias, or create an alias to list your aliases, like this:

$ git config --global alias.myaliases 'config --get-regexp alias'

To see a specific alias, use git config alias.NAME, where NAME is the name of the alias.
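
Here is a minimal sketch of defining and using an alias. It assumes a POSIX shell and uses --local in a throwaway repo so that your global aliases are left alone. The commit messages are hypothetical.

```shell
#!/bin/sh
# Sketch: define a per-repo alias, inspect its expansion, and run it.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q .
git commit -q --allow-empty -m "First commit"
git commit -q --allow-empty -m "Second commit"

git config --local alias.last 'log -1 HEAD' # Per-repo alias; note there is no leading "git".
expansion=$(git config alias.last)
subject=$(git last --format=%s)             # Expands to: git log -1 HEAD --format=%s

echo "alias.last expands to: $expansion"
echo "git last shows:        $subject"
git config --get-regexp '^alias\.'          # List every alias visible from this repo.
```

The alias value contains only the text that follows git on the command line. An expansion that itself starts with git would be treated as a request to run a Git sub-command named git.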

Stashing

Stashing lets you temporarily hide the changes to your working files and your staged changes in the index. Stashing saves your modified working files and your staged changes on a stack of unfinished changes that you can re-apply later. Each entry on this stack is a stash.

After you create a stash, the modified working files and staged changes are reverted to the last committed versions, hiding your changes.

Use stashing in these cases:

  • You need to switch branches but aren't ready to stage or commit your current work.

  • You want to do different work in the current branch without all of your current changes in place. When you're done, you re-apply the stashed changes to pick up where you left off.

  • You can even apply a stash to a different branch, which is a convenient way to move all your un-committed changes from one branch to another.

Re-applying a Stash

To re-apply a stash, you don't need to have a clean working directory and index, but it is advised. If you have modified working files when you re-apply a stash, Git will merge the stash into your working files, which might create merge conflicts that you have to resolve (see section Merge Conflicts for details).

  • When merge conflicts happen, the stash is not removed from the stack of stashes. After you resolve the merge conflicts, you must manually delete the stash using git stash drop ... (see section Stash Commands for details).

  • Although your working files and index don't need to be clean to re-apply a stash, they must be the same, otherwise the stash is not re-applied.

Caution

You should not have staged changes in your index when you re-apply a stash, because this can cause loss of those staged changes, requiring you to recover them manually. For best results when re-applying a stash, your working files and index should be clean. See chapter Git Tools - Stashing and Cleaning in the Pro Git book for more information.

Stash Commands

You create a stash from your modified working files and staged files using git stash push -m 'MESSAGE', where MESSAGE is a description of the stash to help you identify it later if you have many stashes.

  • The push sub-command is the default, so it can be omitted: git stash -m 'MESSAGE'.

  • If you omit -m MESSAGE, Git generates a description of the form WIP on BRANCH: HASH LAST-COMMIT-MESSAGE, where BRANCH is the current branch, HASH is the abbreviated hash of the most recent commit on the branch, and LAST-COMMIT-MESSAGE is the subject line of that commit's message.

  • Switch -u also stashes your untracked files, as follows: git stash -u -m 'MESSAGE'. This is helpful if you have untracked compiled binaries that you want to stash along with your source code changes. If you don't stash your untracked binaries, you will need to re-build them after you re-apply the stash.

To see your stashes, use command git stash list. This shows each stash on a separate line, including the stash identifier, the branch it was created from, and the message describing the stash, as follows:

$ git stash list
stash@{0}: On master: Bug fix for issue #42.
stash@{1}: On master: Partial implementation of new backup feature.

The stash identifiers on the left can be used to select which stash to re-apply, as follows: git stash pop stash@{0}. If the stash identifier is omitted (e.g., git stash pop), the most recent stash is re-applied.

Here are some other commonly used stash commands. See the output of git help stash for details.

# Apply a stash (but keep it in the stash list):
$ git stash apply stash@{n}

# Show a summary (diffstat) of the changes in a stash relative to the commit it was created from:
$ git stash show stash@{n}

# Show a diff of a stash in patch format:
$ git stash show -p stash@{n}

# Delete a specific stash without applying it (dangerous!):
$ git stash drop stash@{1}

# Delete all stashes without applying them (more dangerous!):
$ git stash clear

Note

Command git stash push is a replacement for the deprecated command git stash save. Command git stash save isn't going away (yet), but you should use push instead of save going forward.
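
The full cycle of stashing and re-applying can be sketched as follows, assuming a POSIX shell. The file name notes.txt and the stash message are hypothetical.

```shell
#!/bin/sh
# Sketch: stash an uncommitted edit, confirm it is hidden, then pop it back.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q .
git symbolic-ref HEAD refs/heads/master     # Call the first branch "master".
echo "committed" > notes.txt
git add notes.txt && git commit -q -m "Add notes"

echo "work in progress" > notes.txt
git stash push -q -m "Half-finished edit"
during=$(cat notes.txt)                     # The working file is back to the committed text.
listing=$(git stash list)

git stash pop -q                            # Re-apply the stash and remove it from the stack.
after=$(cat notes.txt)
remaining=$(git stash list)

echo "while stashed: $during"
echo "stash list:    $listing"
echo "after pop:     $after"
```

While the stash exists, notes.txt shows the committed text. After git stash pop, the edit is back and the stash list is empty.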

Rebasing

  • UNDER CONSTRUCTION

Multiple Remotes

Git allows you to have more than one remote repo associated with a single local repo. You can add a new remote to your local repo with this command:

$ git remote add SHORTNAME URL

where SHORTNAME is the short name you want to use for the new remote and URL identifies the remote to be added. If the remote is located in a directory on a file server, URL should be the absolute pathname of the remote repo's directory, which for a bare repo conventionally has a name ending with .git.

You can list the short names and URLs of all remotes associated with your local repo using command git remote -v. Omit switch -v to see just the short names.

Even though your local repo has multiple remotes configured, you can have only one branch checked out at a time, but the branch can be from any of the remotes.

Unlike when you clone a remote repo, adding a new remote does not create the local branches corresponding to each remote branch in the new remote. You must do that manually for each branch, as follows:

$ git fetch REMOTE                            # Fetch branch info from the new remote.
$ git branch --track BRANCH1 REMOTE/BRANCH1   # Create local branch BRANCH1.
$ git branch --track BRANCH2 REMOTE/BRANCH2   # Create local branch BRANCH2.
$ ...

Above, command git fetch REMOTE fetches the branch information from the remote, where REMOTE is the short name of the remote. This must be done first so that Git knows the names of the branches.

Then, command git branch --track BRANCH1 REMOTE/BRANCH1 creates the new local branch BRANCH1 and sets up the tracking relationship with the corresponding remote branch. After doing this, you can switch to any of the new local branches and push/pull commits to/from the remote.
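
The steps above can be tried without a server by using two repos on the local file system as stand-in remotes. This is a sketch under that assumption, for a POSIX shell. The directory names first, second, and local, the remote short name other, and the branch name feature are all hypothetical.

```shell
#!/bin/sh
# Sketch: clone one repo, add a second repo as an extra remote, and create a
# local branch that tracks a branch from the second remote.
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Two stand-in "remote" repos on the local file system.
git init -q first  && git -C first  commit -q --allow-empty -m "First repo"
git init -q second && git -C second commit -q --allow-empty -m "Second repo"
git -C second branch feature

git clone -q first local && cd local        # The clone's remote is named "origin".
git remote add other "$work/second"         # Add the second remote as "other".
git fetch -q other                          # Learn about its branches.
git branch -q --track feature other/feature # Local branch tracking other/feature.

remotes=$(git remote | sort | xargs echo)
upstream=$(git rev-parse --abbrev-ref 'feature@{upstream}')
echo "remotes:             $remotes"
echo "feature's upstream:  $upstream"
```

The local repo ends up with two remotes, origin and other, and the local branch feature is set up to push to and pull from other/feature.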

Naming Issues with Multiple Remotes

One issue with using multiple remotes with a single local repo is that your local branch names must be unique, and also the short names of the remotes must be unique.

  • If you add a remote that has a name matching an existing remote's short name, you must use a different short name in command git remote add SHORTNAME URL.

  • If a new remote has a branch name that matches an existing local branch name, you must create the new local branch with a different name using command git branch --track UNIQUENAME REMOTE/BRANCHNAME, where UNIQUENAME is a unique local branch name.

Removing a Remote

If you no longer need one of the remotes, you can remove it with git remote remove REMOTE, where REMOTE is the short name of the remote to remove.

This also deletes the remote-tracking branches and the configuration settings associated with the remote that was removed, but it does not delete any local branches you created to track them. After you remove the remote, you should delete those local branches as follows (note the capital -D switch, which tells Git to delete a branch regardless of its merge status):

$ git branch -D BRANCH1             # Delete the un-needed local branches.
$ git branch -D BRANCH2
...

To confirm that no remote-tracking branches from the removed remote are left behind, list the remaining ones with git branch -r.

Use Cases for Multiple Remotes

Given that you can only have one branch checked out regardless of how many remotes are configured for a local repo, it may seem that it's simpler to use multiple local repos, one for each remote repo. However, there are a few use cases where it's valuable to have multiple remotes configured.

Use Case: Creating a Pull Request

  • UNDER CONSTRUCTION