Git#
We're going to learn the basics of the most common version control system around: Git.
Note
Git was created by Linus Torvalds, the creator of Linux. Cool, huh?
Git is an industry standard at this point in time. Let's install it on our Ubuntu 20.04 system:
sudo apt install git
Once that's done, run: git --version
Note
You can see mine says Apple Git-128. I'll be copying and pasting the commands, and their results, from my macOS system, the system I'm writing this content on. This doesn't really change anything, and everything I'll teach you will be applicable whether you're running Git on Windows, macOS, Linux, Solaris, FreeBSD, or any other system Git runs on.
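If you ever need to check for Git from a script rather than by hand, here is a minimal Python sketch. The function name git_version is made up for this example; it only relies on the standard library and on the git --version command shown above.

```python
import shutil
import subprocess


def git_version():
    """Return the text printed by `git --version`, or None if git is unavailable."""
    # shutil.which looks the command up on PATH, like a shell would.
    if shutil.which("git") is None:
        return None
    result = subprocess.run(["git", "--version"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip()


version = git_version()
print(version)
```

The exact text depends on your system and on the version of Git you have installed.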
Before we start exploring Git via its many (many) commands, here are a few things you need to know about Git first:
- What a repository is
- What the staging area is
- What a commit is
- What a commit hash is
- What a branch is
- And what a remote is
Understanding these concepts is enough to begin working with Git on a professional level.
What is a repository?#
A repository is actually just a directory filled with files and other directories. Those files can be code, graphics, binaries, sound files, videos, anything you like.
Directories that are empty are, essentially, ignored.
A valid Git repository is a directory with a .git folder in it. Inside this special .git directory is everything the git tool needs to know about the repository. It uses the information in here to track everything you're doing.
I wouldn't dwell too much on the contents of the directory at this point in time, but as time passes more advanced uses of git might have you exploring what's in there.
You don't manually create this directory. Instead you use git init, which we cover below, to make the directory you're currently in a repository, or to have git create a new directory for you and make that new directory a repository.
A repository contains all of your files and Git tracks those files for you.
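To make the ".git folder" idea concrete, here is a small Python sketch that checks whether a directory looks like a repository. The helper name is_git_repository is hypothetical, and the check is deliberately simplified: it only looks for a .git entry, whereas the real git tool also expects specific contents inside it. The empty .git folder created below is just a stand-in for what git init would produce.

```python
import tempfile
from pathlib import Path


def is_git_repository(directory):
    """Return True if `directory` contains a `.git` entry (simplified check)."""
    return (Path(directory) / ".git").exists()


with tempfile.TemporaryDirectory() as scratch:
    before = is_git_repository(scratch)
    # Stand-in for `git init`; a real .git directory has files inside it.
    (Path(scratch) / ".git").mkdir()
    after = is_git_repository(scratch)

print(before, after)  # False True
```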
What is the staging area?#
When you're managing a git repository, the staging area is where you can get changes ready to be committed into the repository. The staging area is like a table that you place things on, then you organise them, sort them, label them, etc., and finally you move (commit) them to their final location on another, final, immutable shelf.
In Git, we use the staging area to prepare specific changes we want to commit, once and for all, into the repository's commit log. We'll explore commits next.
Here are the advantages that a staging area gives us:
- If you have a lot of changes, you can group them into separate commits
- Before committing your changes, you can review them and ensure they're what you expect
- With a staging area, you can keep local files present without them being managed by Git
- A staging area lets you work on something small and quick whilst you're in the middle of other changes.
Overall, the staging area is a great tool and feature of Git.
We'll use the staging area later on.
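The table-and-shelf picture can be sketched as a toy Python model. This is only an illustration of the idea, not how Git stores anything internally; the ToyRepository class and its method names are invented for this example, and they only loosely mirror git add and git commit.

```python
class ToyRepository:
    """A toy model of the working files, the staging area and the commit log."""

    def __init__(self):
        self.working_tree = {}  # filename -> content you are currently editing
        self.staging_area = {}  # filename -> content prepared for the next commit
        self.commit_log = []    # list of (message, snapshot) tuples

    def write(self, name, content):
        self.working_tree[name] = content

    def add(self, name):
        # Loosely mirrors `git add <name>`: place the file on the "table".
        self.staging_area[name] = self.working_tree[name]

    def commit(self, message):
        # Loosely mirrors `git commit`: move a copy of the table to the "shelf".
        self.commit_log.append((message, dict(self.staging_area)))


repo = ToyRepository()
repo.write("main.py", 'print("Hello, world!")')
repo.write("notes.txt", "scratch notes, not ready yet")
repo.add("main.py")
repo.commit("Add main.py")

last_message, last_snapshot = repo.commit_log[-1]
print(last_message, sorted(last_snapshot))  # Add main.py ['main.py']
```

Notice that notes.txt is still sitting in the working files, but it never reached the commit because it was never staged.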
What is a commit?#
This is an interesting topic.
Git can be boiled down to a filesystem tracker, or put another way: Git tracks the changes made to the contents and structure of a repository. It stores all the information about those changes inside of a commit.
A commit is a cryptographically secure entry in the repository log of changes. It's "cryptographically secure" because the git tool generates a cryptographic hash from the contents of the commit. This hash then goes into the Git log. In practice it's a unique hash: the chance of two different commits in the same repository (or even globally) sharing one is vanishingly small, so it uniquely identifies changes in the repository.
This might be confusing right now, but it'll become clearer over time.
For now, think of a commit as a snapshot of the changes you're committing to the repository and the hash as an identifier attached to those changes so they can be tracked and reviewed later on.
We'll use the git commit command below to explore this idea a bit more.
What is a commit hash?#
When you produce commits in the Git repository using git commit, Git computes a hash that represents the changes being made and committed. This hash is produced by a mathematical function (a cryptographic hash function, SHA-1 by default) that gives a practically unique value.
This value, called a commit hash, identifies a commit as it moves from the staging area into the commit log. It's unique (possibly globally!) and it's how you "navigate" inside of the repository, or put another way: go back in time.
You'll see mention of commit hashes here and there and even more so as you move through your career.
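To demystify the hash a little, here is a short Python sketch of how Git derives an object's identifier in a repository using the default SHA-1 format: it hashes a small header (the object type, a space, the size in bytes and a zero byte) followed by the content. The function name git_object_hash is made up for this example. It hashes a file's content (a "blob") because that is the easiest case to reproduce; a commit is hashed the same way, with the type "commit" and a body listing the tree, parents, author, committer and message.

```python
import hashlib


def git_object_hash(kind, body):
    """Return the SHA-1 hex digest Git uses to name an object of type `kind`."""
    # Header format: "<kind> <size in bytes>" followed by a single zero byte.
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


blob_hash = git_object_hash("blob", b"hello world\n")
print(blob_hash)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Change a single byte of the content and the hash changes completely, which is why it works so well as an identifier.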
What is a branch?#
One of the key benefits Git provides over previous (and current) version control systems is the ability to create a branch of the current repository and make changes independently and cheaply - branches are cheap in that they're extremely minimal and easy for Git to provide, track and work with.
Some previous systems didn't offer this functionality as well as Git, and as such, Git has surpassed them (being developed by the creator of Linux probably helped, too).
Imagine we have a file called main.py and it has this Python code inside it:
print("Hello, world!")
Now imagine someone wants to make changes to it without affecting my code. In the exact same repository, they can create a new branch, make a change:
print("Goodnight, everyone!")
And git commit and git push that new branch, with its changes, to the remote Git repository storage. I can then download the repository and I'll have two copies of the code that are completely independent of each other: my original file that prints "Hello, world!" and another branch that prints "Goodnight, everyone!".
We can merge those two branches later on after we decide whose code is better.
Branching can seem complicated, but it's actually really simple. It becomes clearer after you've played with it a bit more. We'll do that below.
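One way to see why branches are so cheap: a branch is essentially a name that points at a commit. The toy Python sketch below models that idea with two dictionaries. The commit ids c1 and c2 are made up for the demo (real ones are commit hashes), and real Git stores this information in files under the .git directory rather than in dictionaries.

```python
commits = {}   # commit id -> (parent commit id, message)
branches = {}  # branch name -> commit id the branch currently points at


def commit(branch, commit_id, message):
    """Record a commit whose parent is the branch's tip, then move the tip."""
    commits[commit_id] = (branches.get(branch), message)
    branches[branch] = commit_id


commit("main", "c1", 'Print "Hello, world!"')
# Creating a branch only copies one pointer; no files are duplicated.
branches["goodnight"] = branches["main"]
commit("goodnight", "c2", 'Print "Goodnight, everyone!"')

print(branches)  # {'main': 'c1', 'goodnight': 'c2'}
```

The main branch still points at c1, untouched, while the new branch has moved on to c2, whose parent is c1.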
What is a remote?#
When it comes to Git, a "remote" is a remote location that is used to store your Git repository. You use git push to literally "push" your code to a remote, allowing you to back your work up on a separate system and share the code with other people (or not).
Git is a decentralised version control system. Its branching capabilities, combined with the ability to push to remote locations (multiple, in fact!), mean it's decentralised: multiple people can work on a Git repository at the same time and merge their work later on.
We'll cover remotes when we explore GitHub(.com).
Sub-commands#
The general available options and usage syntax of the git command can be found by simply running the command without flags or sub-commands:
git
We can also see what sub-commands git has available for us to play with further down in this output:
These are relevant for the version of git I have installed at the time of writing.
Now you might be thinking that's a lot of sub-commands, and you're not wrong. But we're only going to look at a few of them for now, and a bit more in the next section when we look at GitHub.