Introduction to Git

Contents

Git is currently the most popular implementation of a distributed version control system.

Git originates from the Linux kernel development and was founded in 2005 by Linus Torvalds. Nowadays it is used by many popular open source projects, e.g., the Android or the Eclipse developer teams, as well as many commercial organizations.

The core of Git was originally written in the programming language C, but Git has also been re-implemented in other languages, e.g., Java, Ruby and Python.

2.2. Git repositories

A Git repository contains the history of a collection of files starting from a certain directory. The process of copying an existing Git repository via the Git tooling is called cloning. After cloning a repository the user has the complete repository with its history on his local machine. Of course, Git also supports the creation of new repositories.

If you want to delete a Git repository, you can simply delete the folder which contains the repository.

If you clone a Git repository, by default, Git assumes that you want to work in this repository as a user. Git also supports the creation of repositories targeting the usage on a server.

bare repositories are supposed to be used on a server for sharing changes coming from different developers. Such repositories do not allow the user to modify locally files and to create new versions for the repository based on these modifications.
non-bare repositories target the user. They allow you to create new changes through modification of files and to create new versions in the repository. This is the default type which is created if you do not specify any parameter during the clone operation.

A local non-bare Git repository is typically called local repository.

2.3. Working tree

A local repository provides at least one collection of files which originate from a certain version of the repository. This collection of files is called the working tree. It corresponds to a checkout of one version of the repository with potential changes done by the user.

The user can change the files in the working tree by modifying existing files and by creating and removing files.

A file in the working tree of a Git repository can have different states. These states are the following:

untracked: the file is not tracked by the Git repository. This means that the file never staged nor committed.
tracked: committed and not staged
staged: staged to be included in the next commit
dirty / modified: the file has changed but the change is not staged

After doing changes in the working tree, the user can add these changes to the Git repository or revert these changes.

2.4. Adding to a Git repository via staging and committing

After modifying your working tree you need to perform the following two steps to persist these changes in your local repository:

add the selected changes to the staging area (also known as index) via the git add command
commit the staged changes into the Git repository via the git commit command

This process is depicted in the following graphic.

Git commit process

The git add command stores a snapshot of the specified files in the staging area. It allows you to incrementally modify files, stage them, modify and stage them again until you are satisfied with your changes.

Some tools and Git user prefer the usage of the index instead of staging area. Both terms mean the same thing.

After adding the selected files to the staging area, you can commit these files to add them permanently to the Git repository. _ Committing_ creates a new persistent snapshot (called commit or commit object) of the staging area in the Git repository. A commit object, like all objects in Git, is immutable.

The staging area keeps track of the snapshots of the files until the staged changes are committed.

For committing the staged changes you use the git commit command.

If you commit changes to your Git repository, you create a new commit object in the Git repository. See Commit object (commit) for information about the commit object.

2.5. Synchronizing with other Git repositories (remote repositories)

Git allows the user to synchronize the local repository with other (remote) repositories.

Users with sufficient authorization can send new version in their local repository to to remote repositories via the push operation. They can also integrate changes from other repositories into their local repository via the fetch and pull operation.

2.6. The concept of branches

Git supports branching which means that you can work on different versions of your collection of files. A branch allows the user to switch between these versions so that he can work on different changes independently from each other.

For example, if you want to develop a new feature, you can create a branch and make the changes in this branch. This does not affect the state of your files in other branches. For example, you can work independently on a branch called production for bugfixes and on another branch called feature_123for implementing a new feature.

Branches in Git are local to the repository. A branch created in a local repository does not need to have a counterpart in a remote repository. Local branches can be compared with other local branches and with _remote-tracking branches. A remote-tracking branch proxies the state of a branch in another remote repository.

Git supports the combination of changes from different branches. The developer can use Git commands to combine the changes at a later point in time.

2.7. Summary of the core Git terminology

The following table provides a summary of important Git terminology discussed in this section.

Term	Definition
Branch	A branch is a named pointer to a commit. Selecting a branch in Git terminology is called to checkout a branch. If you are working in a certain branch, the creation of a new commit advances this pointer to the newly created commit.Each commit knows their parents (predecessors). Successors are retrieved by traversing the commit graph starting from branches or other refs, symbolic references (for example: HEAD) or explicit commit objects. This way a branch defines its own line of descendants in the overall version graph formed by all commits in the repository.You can create a new branch from an existing one and change the code independently from other branches. One of the branches is the default (typically named _master ). The default branch is the one for which a local branch is automatically created when cloning the repository.
Commit	When you commit your changes into a repository this creates a new commit object in the Git repository. This commit object uniquely identifies a new revision of the content of the repository.This revision can be retrieved later, for example, if you want to see the source code of an older version. Each commit object contains the author and the committer. This makes it possible to identify who did the change. The author and committer might be different people. The author did the change and the committer applied the change to the Git repository. This is common for contributions to open source projects.
HEAD	HEAD is a symbolic reference most often pointing to the currently checked out branch.Sometimes the HEAD points directly to a commit object, this is called detached HEAD mode. In that state creation of a commit will not move any branch.If you switch branches, the HEAD pointer points to the branch pointer which in turn points to a commit. If you checkout a specific commit, theHEAD points to this commit directly.
Index	Index is an alternative term for the staging area.
Repository	A repository contains the history, the different versions over time and all different branches and tags. In Git each copy of the repository is a complete repository. If the repository is not a bare repository, it allows you to checkout revisions into your working tree and to capture changes by creating new commits. Bare repositories are only changed by transporting changes from other repositories.This description uses the term repository to talk about a non-bare repository. If it talks about a bare repository, this is explicitly mentioned.
Revision	Represents a version of the source code. Git implements revisions as commit objects (or short commits ). These are identified by an SHA-1 hash.
Staging area	The staging area is the place to store changes in the working tree before the commit. The staging area contains a snapshot of the changes in the working tree (changed or new files) relevant to create the next commit and stores their mode (file type, executable bit).
Tag	A tag points to a commit which uniquely identifies a version of the Git repository. With a tag, you can have a named point to which you can always revert to. You can revert to any point in a Git repository, but tags make it easier. The benefit of tags is to mark the repository for a specific reason, e.g., with a release.Branches and tags are named pointers, the difference is that branches move when a new commit is created while tags always point to the same commit. Tags can have a timestamp and a message associated with them.
URL	A URL in Git determines the location of the repository. Git distinguishes between fetchurl for getting new data from other repositories and pushurlfor pushing data to another repository.
Working tree	The working tree contains the set of working files for the repository. You can modify the content and commit the changes as new commits to the repository.

Post Views: 1,343

2. Introduction into Git

2.1. What is Git?