Part 9: Git & Version Control Foundations

Introduction — Why Version Control is Non-Negotiable

Imagine writing a 500-page novel without the ability to undo. No drafts, no revisions, no way to compare today's version with last week's. Now imagine writing that novel with 20 co-authors simultaneously, each working on different chapters that reference each other. Without version control, this is impossible. With it, millions of developers collaborate on codebases containing millions of lines of code every single day.

Version control is not optional. It is the single most important tool in a software engineer's toolkit. Every other practice in this series — CI/CD, code review, deployment, rollback — depends on it.

A Brief History of Version Control

Year	System	Model	Key Innovation
1982	RCS	Local	File-level versioning on a single machine
1990	CVS	Centralized	Multi-file projects, network access
2000	SVN	Centralized	Atomic commits, directory versioning
2005	Git	Distributed	Full local repository, SHA-1 integrity, branching as core concept

Git was created by Linus Torvalds in April 2005 for Linux kernel development after the free-of-charge license for the proprietary BitKeeper tool was withdrawn. Torvalds designed Git with three goals: speed, data integrity, and support for distributed, non-linear workflows. Within a decade, Git became the de facto standard; in recent developer surveys, well over 90% of professional developers report using it.

                            
Key Insight: Git's design philosophy is fundamentally different from its predecessors. Where CVS and SVN track changes to files (deltas), Git records a snapshot of the entire project at each commit. This seemingly small difference underpins Git's speed and its cheap branching model, while its distributed design is what lets you work entirely offline.
                        

Centralized vs Distributed VCS

Understanding the architectural difference between centralized and distributed version control is crucial for understanding why Git works the way it does.

The Centralized Model (SVN)

In a centralized VCS, there is one authoritative server. All history lives on that server. Developers "check out" a working copy, make changes, and "commit" back to the server. If the server is down, no one can commit. If the server's disk fails without backup, all history is lost.

The Distributed Model (Git)

In a distributed VCS, every developer has a complete copy of the entire repository — all history, all branches, everything. There is no single point of failure. Developers can commit, branch, merge, and view history entirely offline. "Pushing" and "pulling" are merely syncing between equal peers.
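
The claim that every clone carries the full history can be checked locally. The following is a minimal sketch, assuming git is installed; the directory names upstream and clone, the file notes.txt, and the demo identity are hypothetical choices for illustration.

```shell
set -eu
work=$(mktemp -d)            # scratch directory for the demo
cd "$work"

# Build a small "upstream" repository with two commits
git init -q upstream
cd upstream
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "docs: add notes"
echo "v2" >> notes.txt
git commit -q -am "docs: extend notes"
cd ..

# Clone it; the clone receives every commit, not just the latest files
git clone -q upstream clone
rm -rf upstream              # simulate the original copy disappearing

# History is still fully available in the clone
commit_count=$(git -C clone rev-list --count HEAD)
git -C clone log --oneline
echo "Commits available in the clone: $commit_count"
```

Deleting the original directory stands in for a failed server: the clone still answers log queries because it holds its own copy of every object.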

Centralized vs Distributed Version Control

flowchart TB
    subgraph Centralized["Centralized VCS (SVN)"]
        S["Central Server<br/>Full History"] --- WC1["Working Copy 1<br/>No History"]
        S --- WC2["Working Copy 2<br/>No History"]
        S --- WC3["Working Copy 3<br/>No History"]
    end
    subgraph Distributed["Distributed VCS (Git)"]
        R1["Repo 1<br/>Full History"] --- R2["Repo 2<br/>Full History"]
        R2 --- R3["Repo 3<br/>Full History"]
        R1 --- R3
    end

Why distributed won:

Speed: Most everyday operations (log, diff, blame, branch) are local, with no network round-trips
Resilience: No single point of failure. Every full clone carries the complete history
Offline work: Commit, branch, and merge on a plane without internet
Branching freedom: Branches are cheap (in a SHA-1 repository, just a 41-byte file), so developers create them liberally
Flexible workflows: Teams choose their own collaboration model (centralized, forking, etc.)

Git Fundamentals — The Three-Tree Architecture

Git manages your code through three "trees" (areas). Understanding these three areas is the key to understanding every Git command:

Git's Three-Tree Architecture

flowchart LR
    WD["Working Directory<br/>Your actual files"] -->|git add| SA["Staging Area<br/>Index / Next commit snapshot"]
    SA -->|git commit| R["Repository<br/>.git/objects/"]
    R -->|git checkout| WD

Area	What It Contains	Purpose
Working Directory	Your actual files on disk	Where you edit code. Files can be tracked or untracked
Staging Area (Index)	A snapshot of what will go into the next commit	Lets you craft commits precisely — include some changes, exclude others
Repository (.git/)	Complete history of all committed snapshots	Permanent record. Immutable objects identified by SHA-1 hash
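
To see the three areas hold different versions of the same file at once, a small sketch like the following may help. It assumes git is installed; the file name notes.txt and the demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "docs: add notes"        # repository: 1 line

echo "line 2" >> notes.txt                # change A
git add notes.txt                         # staging area: 2 lines
echo "line 3" >> notes.txt                # working directory: 3 lines

# First status column: staging area vs last commit
# Second status column: working directory vs staging area
status=$(git status --short)
echo "$status"

staged=$(git diff --staged --numstat | cut -f1)   # lines added in the index
unstaged=$(git diff --numstat | cut -f1)          # lines added only on disk
echo "staged additions: $staged, unstaged additions: $unstaged"
```

The two-letter status code reports the file as modified in both comparisons, which is exactly the situation the staging area exists to make visible.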

Snapshots, Not Diffs

Many older version control systems store data as a list of file-based changes (deltas). Git's model is different: each commit records a complete snapshot of the project. If a file hasn't changed, the new snapshot simply points to the identical file that is already stored (deduplication via content addressing). This is part of why switching branches is fast: Git compares two snapshots and rewrites only the files that differ, rather than reconstructing files from chains of deltas.
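
Content addressing can be observed directly: identical file contents map to the same stored object, and an unchanged file keeps its object across commits. This is a minimal sketch assuming git is installed; the file names a.txt and b.txt and the demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

printf 'same content\n' > a.txt
printf 'same content\n' > b.txt
git add a.txt b.txt
git commit -q -m "feat: add two identical files"

# Both paths resolve to the same blob object ID
blob_a=$(git rev-parse HEAD:a.txt)
blob_b=$(git rev-parse HEAD:b.txt)
echo "a.txt -> $blob_a"
echo "b.txt -> $blob_b"

# A second commit that leaves a.txt untouched reuses the existing blob
printf 'changed\n' > b.txt
git commit -q -am "feat: change b only"
blob_a_after=$(git rev-parse HEAD:a.txt)
echo "a.txt after second commit -> $blob_a_after"
```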

Essential Commands

Initialising & Cloning

# Create a brand new Git repository in the current directory
mkdir my-project
cd my-project
git init

# Verify the repository was created
ls -la .git/
echo "Repository initialised successfully"

# Clone an existing repository from a remote URL
# (the Linux kernel history is several gigabytes; any smaller repository URL works for practice)
git clone https://github.com/torvalds/linux.git

# Clone with a custom local directory name
git clone https://github.com/torvalds/linux.git my-linux-copy

echo "Repository cloned successfully"

Staging & Committing

# Create a new file
echo "Hello, Git!" > README.md

# Check the status — file is untracked
git status

# Stage the file (add to index)
git add README.md

# Check status again — file is now staged
git status

# Commit the staged changes with a message
git commit -m "feat: add initial README"

# View the commit log
git log --oneline
echo "First commit created"

# Stage all new, modified, and deleted files under the current directory
git add .

# Stage specific files
git add src/main.js src/utils.js

# Stage parts of a file interactively (choose hunks)
git add -p src/main.js

# Commit with a multi-line message
git commit -m "fix: resolve null pointer in user service

The getUserById method was not handling the case where
the user ID doesn't exist in the database. Added a null
check and returns a 404 response instead of crashing."

echo "Commit with detailed message created"

# View commit history with various formats
# One-line summary
git log --oneline

# Graphical branch view
git log --oneline --graph --all

# Show last 5 commits with file changes
git log -5 --stat

# Show commits by a specific author
git log --author="Wasil" --oneline

echo "Log commands demonstrated"

Inspecting Changes with git diff

# See unstaged changes (working directory vs staging area)
git diff

# See staged changes (staging area vs last commit)
git diff --staged

# See all changes since last commit (working + staged)
git diff HEAD

# Compare two branches
git diff main..feature-branch

# Show only file names that changed
git diff --name-only main..feature-branch

echo "Diff commands demonstrated"

.gitignore Patterns

# Create a comprehensive .gitignore file
cat > .gitignore << 'EOF'
# Dependencies
node_modules/
vendor/
.venv/

# Build outputs
dist/
build/
*.o
*.class

# IDE files
.idea/
.vscode/
*.swp

# OS files
.DS_Store
Thumbs.db

# Environment files (NEVER commit secrets)
.env
.env.local
*.key
*.pem

# Logs
*.log
logs/
EOF

git add .gitignore
git commit -m "chore: add .gitignore"
echo ".gitignore configured"
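
Before relying on a .gitignore, it can be worth confirming that the patterns match what you expect. The sketch below uses git check-ignore for that, assuming git is installed; the three patterns are a shortened, hypothetical subset of the file above.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# A shortened ignore file for the demo
printf '%s\n' '.env' '*.log' 'node_modules/' > .gitignore

# check-ignore exits 0 when a path is ignored and 1 when it is not
if git check-ignore -q .env; then env_ignored=yes; else env_ignored=no; fi
if git check-ignore -q debug.log; then log_ignored=yes; else log_ignored=no; fi
if git check-ignore -q README.md; then readme_ignored=yes; else readme_ignored=no; fi
echo ".env: $env_ignored, debug.log: $log_ignored, README.md: $readme_ignored"

# -v prints the file, line number, and pattern that matched each path
git check-ignore -v .env debug.log
```

The -v form is useful when a rule behaves unexpectedly, because it names the exact pattern responsible.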

Branching — Git's Killer Feature

Branching in Git is incredibly cheap. A branch is nothing more than a 41-byte file: the 40-character SHA-1 hash of the commit it points to, plus a newline. Creating a branch doesn't copy any files; it just creates a new pointer. This is why Git developers branch constantly and freely.

What Branches Really Are

A branch is a movable pointer to a commit. When you make a new commit on a branch, the pointer automatically advances to the new commit. The special pointer HEAD tells Git which branch you're currently on.
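
The pointer nature of branches can be inspected on disk. This sketch assumes git is installed and that the repository uses SHA-1 hashes with the default loose-file ref storage (other hash or ref backends would store this differently); the branch name feature/demo and the demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "docs: add README"

git branch feature/demo

# The new branch is a small text file holding a commit hash
ref_file=.git/refs/heads/feature/demo
ref_size=$(wc -c < "$ref_file" | tr -d ' ')
echo "ref file size: $ref_size bytes"

# It points at the same commit as the branch it was created from
head_commit=$(git rev-parse HEAD)
branch_commit=$(cat "$ref_file")
echo "HEAD commit:   $head_commit"
echo "branch commit: $branch_commit"

# HEAD itself is a symbolic reference naming the current branch
cat .git/HEAD
```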

                            
Key Insight: In SVN, a branch is a copy of a directory inside the central repository, so creating, switching, and merging branches all involve the server. In Git, creating a branch writes 41 bytes to local disk. This difference is why Git makes workflows practical (feature branches, release branches, hotfix branches) that were cumbersome in older systems.
                        

Branch Commands

# List all local branches (* marks current branch)
git branch

# Create a new branch (doesn't switch to it)
git branch feature/user-auth

# Create and switch to a new branch in one command
git checkout -b feature/payment-flow

# Modern alternative (Git 2.23+): git switch
git switch -c feature/notifications

# List all branches including remote
git branch -a

# Delete a merged branch
git branch -d feature/user-auth

# Force-delete an unmerged branch (careful!)
git branch -D experiment/failed-approach

echo "Branch operations demonstrated"

# Switch between existing branches
git checkout main
git checkout feature/payment-flow

# Modern alternative: git switch
git switch main
git switch feature/payment-flow

# See which branch you're on
git branch --show-current

# See the commit each branch points to
git branch -v

echo "Branch switching demonstrated"

Merging

Merging integrates changes from one branch into another. Depending on the history, Git either fast-forwards or performs a three-way merge, and understanding when each occurs is crucial.

Fast-Forward Merge

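A fast-forward merge happens when the branch you are merging into has gained no commits of its own since the other branch split off, so its tip is a direct ancestor of the incoming branch. Git simply moves the branch pointer forward, and no merge commit is created.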

# Example: Fast-forward merge
# Create and switch to feature branch
git checkout -b feature/add-logging

# Make some commits on the feature branch
echo "import logging" > logger.py
git add logger.py
git commit -m "feat: add logging module"

echo "logging.info('App started')" >> logger.py
git add logger.py
git commit -m "feat: add startup log message"

# Switch back to main (no new commits since branch)
git checkout main

# Merge — this will fast-forward
git merge feature/add-logging

# Result: main now points to the same commit as the feature branch
git log --oneline --graph
echo "Fast-forward merge complete"

Three-Way Merge

A three-way merge happens when both branches have diverged — each has commits the other doesn't. Git finds the common ancestor, compares both branches against it, and creates a new merge commit with two parents.

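# Example: Three-way merge
# Add a commit on the feature branch that main does not have yet
git checkout feature/add-logging
echo "logging.info('App stopped')" >> logger.py
git add logger.py
git commit -m "feat: add shutdown log message"

# Switch back to main and commit there too, so the branches diverge
git checkout main
echo "# Main README" > README.md
git add README.md
git commit -m "docs: update README on main"

# Each branch now has a commit the other lacks,
# so merging creates a merge commit
git merge feature/add-logging -m "Merge feature/add-logging into main"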

# View the graph — you'll see the merge commit with two parents
git log --oneline --graph --all
echo "Three-way merge complete"
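
The common ancestor and the two parents of a merge commit can both be queried after the fact. The following self-contained sketch assumes git is installed; the branch name feature/demo, the file names, and the demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "feat: base"
main_branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
base=$(git rev-parse HEAD)

git checkout -q -b feature/demo
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feat: add feature file"

git checkout -q "$main_branch"
echo "main" > main.txt
git add main.txt
git commit -q -m "feat: add main file"

# The common ancestor Git uses as the third input of the three-way merge
ancestor=$(git merge-base "$main_branch" feature/demo)

git merge -q --no-edit feature/demo

# rev-list --parents prints the commit hash followed by its parent hashes
parents=$(git rev-list --parents -n 1 HEAD)
parent_count=$(( $(echo "$parents" | wc -w) - 1 ))
echo "merge commit has $parent_count parents"
```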

Fast-Forward vs Three-Way Merge

gitGraph
    commit id: "A"
    commit id: "B"
    branch feature
    commit id: "C"
    commit id: "D"
    checkout main
    commit id: "E"
    merge feature id: "Merge"

Rebasing

Rebasing is an alternative to merging. Instead of creating a merge commit, it replays your commits on top of another branch, creating a linear history. The result contains the same changes, but the commit history looks as if you'd started your work from the latest point on the target branch.

What Rebase Does

# Scenario: You're on feature/search, main has moved ahead
git checkout feature/search

# Rebase onto main — replay your commits on top of main's latest
git rebase main

# Your commits now appear AFTER main's latest commits
# The history is linear — no merge commit
git log --oneline --graph

echo "Rebase complete — linear history achieved"
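
The scenario above presumes an existing repository. A self-contained sketch such as the following shows two consequences of a rebase: the replayed commit receives a new hash, and the resulting history contains no merge commit. It assumes git is installed; the file names and the demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "feat: base"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b feature/search
echo "search" > search.txt
git add search.txt
git commit -q -m "feat: add search"
before=$(git rev-parse HEAD)              # hash before the rebase

git checkout -q "$main_branch"
echo "main" > main.txt
git add main.txt
git commit -q -m "feat: main moves ahead"

git checkout -q feature/search
git rebase -q "$main_branch"
after=$(git rev-parse HEAD)               # hash after the rebase

merges=$(git rev-list --merges --count HEAD)
parent=$(git rev-parse HEAD~1)
main_tip=$(git rev-parse "$main_branch")
echo "before: $before"
echo "after:  $after"
echo "merge commits in history: $merges"
```

The changed hash is the reason rebasing already-shared commits causes trouble: collaborators still hold the old commit, which Git treats as unrelated to the new one.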

Interactive Rebase

Interactive rebase (git rebase -i) lets you rewrite history: squash commits, reorder them, edit messages, or drop commits entirely. This is essential for cleaning up work-in-progress commits before merging to main.

# Interactive rebase: clean up the last 4 commits
git rebase -i HEAD~4

# This opens an editor with your commits:
# pick abc1234 WIP: start search feature
# pick def5678 fix typo
# pick ghi9012 WIP: search almost working
# pick jkl3456 feat: complete search feature
#
# You can change 'pick' to:
#   squash (s) — combine with previous commit
#   reword (r) — change commit message
#   edit (e)   — pause to amend the commit
#   drop (d)   — remove the commit entirely
#
# Example result after editing:
# pick abc1234 WIP: start search feature
# squash def5678 fix typo
# squash ghi9012 WIP: search almost working
# reword jkl3456 feat: complete search feature

echo "Interactive rebase instructions shown"
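
Interactive rebase normally opens an editor, but the same todo list can be edited by a script through the GIT_SEQUENCE_EDITOR environment variable. The sketch below is one way to try a squash without manual editing, assuming git and sed are installed. It uses fixup, a standard todo command not listed above that behaves like squash but discards the folded commit's message; the file name and demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

# Four small work-in-progress commits
for n in 1 2 3 4; do
  echo "step $n" >> search.txt
  git add search.txt
  git commit -q -m "WIP: step $n"
done

# Rebase the last three commits. The sed command keeps the first todo
# line as 'pick' and rewrites the remaining 'pick' lines to 'fixup'.
GIT_EDITOR=true \
GIT_SEQUENCE_EDITOR="sed -i.bak '2,\$s/^pick/fixup/'" \
  git rebase -q -i HEAD~3

commit_count=$(git rev-list --count HEAD)
line_count=$(wc -l < search.txt | tr -d ' ')
git log --oneline
echo "commits: $commit_count, lines kept: $line_count"
```

The three newest commits collapse into one, while the file content is unchanged, so only the shape of the history differs.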

The Golden Rule of Rebasing

                            
                            Golden Rule: Never rebase commits that have been pushed to a shared branch. Rebasing rewrites commit hashes. If others have based work on those commits, their history will diverge from yours, creating a mess that's painful to resolve. Only rebase local commits that haven't been shared yet.
                        

When to rebase vs merge:

Rebase — Cleaning up local feature branch commits before merging to main. Keeping a linear history
Merge — Integrating completed features into main. Preserving the full context of how work happened in parallel

Conflict Resolution

Conflicts occur when Git cannot automatically determine how to combine changes. This most often happens when two branches change the same or adjacent lines of a file, or when one branch deletes a file that the other modifies. Git pauses the merge/rebase and asks you to resolve the conflict manually.

Why Conflicts Happen

Two developers edit the same function in different branches
One branch deletes a file that another branch modifies
Both branches add content at the same location in a file
A rebase tries to apply a patch that no longer applies cleanly

Reading Conflict Markers

# When a conflict occurs, Git marks the file like this:
cat conflicted-file.js
# Output:
# function calculateTotal(items) {
# <<<<<<< HEAD
#     return items.reduce((sum, item) => sum + item.price, 0);
# =======
#     return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
# >>>>>>> feature/quantity-support
# }

# <<<<<<< HEAD        = Your current branch's version
# =======             = Separator
# >>>>>>> branch-name = The incoming branch's version

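# To resolve: edit the file to contain the correct final version
# and remove the conflict markers. Then:
git add conflicted-file.js
git commit -m "fix: resolve merge conflict in calculateTotal"
# (if the conflict occurred during a rebase, run `git rebase --continue` instead of committing)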

echo "Conflict resolution demonstrated"
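
To practise the full cycle, a conflict can be produced on purpose. The sketch below, assuming git is installed, makes two branches change the same line, detects the conflict from the exit status of git merge, and resolves it. The file total.txt and the demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "price" > total.txt
git add total.txt
git commit -q -m "feat: base total"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b feature/quantity-support
echo "price * quantity" > total.txt
git commit -q -am "feat: support quantity"

git checkout -q "$main_branch"
echo "price + tax" > total.txt
git commit -q -am "feat: add tax"

# git merge exits non-zero when it stops on a conflict
if git merge --no-edit feature/quantity-support >/dev/null 2>&1; then
  merge_result=clean
else
  merge_result=conflict
fi

# Files still needing resolution are listed by the "U" (unmerged) filter
conflicted=$(git diff --name-only --diff-filter=U)
echo "merge result: $merge_result, conflicted: $conflicted"

# Resolve by writing the intended final content, then stage and commit
echo "(price + tax) * quantity" > total.txt
git add total.txt
git commit -q -m "fix: resolve merge conflict in total"
remaining=$(git diff --name-only --diff-filter=U)
```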

Tooling

Merge Tools for Conflict Resolution

While you can resolve conflicts in any text editor, dedicated merge tools provide a three-pane view showing (1) your version, (2) their version, and (3) the common ancestor, which makes complex conflicts much easier to understand. Popular options: VS Code (built-in merge editor), Meld (Linux, with Windows and macOS builds), KDiff3 (cross-platform), and Beyond Compare (commercial). Configure with: git config --global merge.tool vscode (depending on your Git version, you may also need to define mergetool.vscode.cmd).


Remote Repositories

Remotes are bookmarks to other copies of the repository. The most common remote is called origin — this is typically the repository you cloned from (on GitHub, GitLab, etc.).

# List configured remotes
git remote -v

# Add a new remote
git remote add upstream https://github.com/original/repo.git

# Fetch updates from a remote (download, don't merge)
git fetch origin

# Pull = fetch + merge (updates your current branch)
git pull origin main

# Push your local commits to the remote
git push origin feature/my-work

# Push and set upstream tracking
git push -u origin feature/my-work

echo "Remote operations demonstrated"

Tracking Branches

# See which local branches track which remote branches
git branch -vv

# Set up tracking for an existing branch
git branch --set-upstream-to=origin/main main

# After tracking is set, simple 'git pull' and 'git push' work
# without specifying the remote and branch name
git pull   # pulls from tracked remote branch
git push   # pushes to tracked remote branch

echo "Tracking branches configured"
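
Remotes and tracking can be tried without a hosting service by letting a local bare repository play the role of the server. This is a minimal sketch assuming git is installed; the directory names server.git and dev and the demo identity are hypothetical.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for GitHub/GitLab in this demo
git init -q --bare server.git
git clone -q server.git dev 2>/dev/null   # hides the "empty repository" warning
cd dev
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "docs: add README"
branch=$(git symbolic-ref --short HEAD)

# Push and record origin/<branch> as the upstream of the local branch
git push -q -u origin "$branch"

# Ask Git which remote branch the current branch tracks
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name '@{upstream}')
echo "tracking: $upstream"
git branch -vv

local_tip=$(git rev-parse HEAD)
remote_tip=$(git -C ../server.git rev-parse "$branch")
```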

Undoing Things

One of Git's greatest strengths is the ability to undo almost anything. But different situations call for different undo mechanisms.

# Unstage a file (remove from staging area, keep changes in working dir)
git restore --staged accidental-file.js

# Discard changes in working directory (DESTRUCTIVE — changes are lost!)
git restore accidental-file.js

# Amend the last commit (fix message or add forgotten files)
git add forgotten-file.js
git commit --amend -m "feat: add search with forgotten file included"

echo "Basic undo operations demonstrated"

# git reset — move the branch pointer backward
# Soft reset: uncommit, keep changes staged
git reset --soft HEAD~1

# Mixed reset (default): uncommit and unstage, keep changes in working directory
git reset HEAD~1

# Hard reset (DESTRUCTIVE): uncommit, discard all changes
git reset --hard HEAD~1

echo "Reset modes demonstrated (be careful with --hard!)"

# git revert — create a NEW commit that undoes an old commit
# Safe for shared branches because it doesn't rewrite history
git revert abc1234

# Revert a merge commit (must specify parent)
git revert -m 1 merge-commit-hash

# Revert the last three commits (creates one revert commit per original commit)
git revert HEAD~3..HEAD

echo "Revert operations demonstrated (safe for shared branches)"
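
The difference between revert and reset shows up in the commit count: revert adds a commit, reset removes one from the branch. A small sketch, assuming git is installed and using a hypothetical file.txt and demo identity:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "feat: one"
echo "two" >> file.txt
git commit -q -am "feat: two"

# revert: history grows by one commit that undoes "feat: two"
git revert --no-edit HEAD >/dev/null
count_after_revert=$(git rev-list --count HEAD)
lines_after_revert=$(wc -l < file.txt | tr -d ' ')
echo "after revert: $count_after_revert commits, $lines_after_revert line(s) in file"

# reset --soft: the newest commit (here, the revert) leaves the branch,
# and its changes remain staged
git reset -q --soft HEAD~1
count_after_reset=$(git rev-list --count HEAD)
staged_after_reset=$(git diff --staged --name-only)
echo "after reset: $count_after_reset commits, staged: $staged_after_reset"
```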

Danger Zones — When Undo Is Destructive

Command	Rewrites History?	Can Lose Data?	Safe on Shared Branches?
`git restore file`	No	Yes (discards uncommitted changes)	N/A (local only)
`git reset --soft`	Yes	No (changes stay staged)	No
`git reset --hard`	Yes	Yes (uncommitted changes are lost; removed commits stay in the reflog for a limited time)	No
`git revert`	No	No (creates a new commit)	Yes
`git rebase`	Yes	Rarely (the original commits stay in the reflog for a limited time)	No
`git push --force`	Yes (remote)	Yes (others' work can be lost)	Never
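
A commit dropped by a hard reset is usually not gone immediately: the local reflog records where HEAD pointed before. The sketch below, assuming git is installed and default reflog settings, recovers such a commit. Uncommitted changes discarded by the reset cannot be recovered this way. The file name and demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "feat: one"
echo "two" >> file.txt
git commit -q -am "feat: two"
tip=$(git rev-parse HEAD)

# Drop the newest commit from the branch
git reset -q --hard HEAD~1
after_reset=$(git rev-parse HEAD)

# The reflog lists previous positions of HEAD, newest first
git reflog -n 2

# HEAD@{1} is where HEAD pointed just before the reset
git reset -q --hard 'HEAD@{1}'
recovered=$(git rev-parse HEAD)
echo "original tip:  $tip"
echo "recovered tip: $recovered"
```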

Best Practices

Atomic Commits

Each commit should represent one logical change. If you can describe a commit in one sentence without using "and", it's probably atomic. Benefits: easier bisecting, cleaner reverts, simpler code review.

Meaningful Commit Messages

The Conventional Commits standard provides a structured format that enables automated changelogs and semantic versioning:

# Conventional Commits format:
# type(scope): subject
#
# Types: feat, fix, docs, style, refactor, test, chore, perf, ci
#
# Examples:
git commit -m "feat(auth): add OAuth2 login with Google"
git commit -m "fix(cart): prevent negative quantity on item update"
git commit -m "docs(api): add OpenAPI spec for /users endpoint"
git commit -m "refactor(db): extract connection pooling to shared module"
git commit -m "test(payment): add integration tests for Stripe webhook"
git commit -m "chore(deps): upgrade Express from 4.18 to 4.19"

echo "Conventional Commits examples shown"
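
A format like this is easy to check mechanically, for example from a commit-msg hook. The function below is a hypothetical helper, not part of Git or of the Conventional Commits tooling; its regular expression covers only the type list shown above and a simple lowercase scope.

```shell
# Returns 0 when the first line of a message matches "type(scope): subject"
# (is_conventional is a hypothetical helper written for this example)
is_conventional() {
  printf '%s\n' "$1" | head -n 1 |
    grep -Eq '^(feat|fix|docs|style|refactor|test|chore|perf|ci)(\([a-z0-9._-]+\))?: .+'
}

for msg in "feat(auth): add OAuth2 login with Google" "fixed stuff"; do
  if is_conventional "$msg"; then
    echo "ok:     $msg"
  else
    echo "reject: $msg"
  fi
done
```

The scope is optional in this sketch, so "docs: fix typo" would also pass; a team could tighten or loosen the pattern to match its own rules.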

Never Commit Secrets

                            
Critical Rule: Never commit passwords, API keys, tokens, or private keys to a Git repository, even a private one. Git history is durable: even if you "delete" the file in a later commit, the secret stays in every earlier commit and in every clone made since. Use .env files (listed in .gitignore), environment variables, or secret managers. If you accidentally commit a secret, consider it compromised: rotate the credential immediately, then use git filter-repo to rewrite it out of history.
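
The reason a later deletion does not help can be shown in a few commands. This sketch assumes git is installed; the file secrets.env, its placeholder value, and the demo identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "API_KEY=not-a-real-key" > secrets.env   # placeholder value for the demo
git add secrets.env
git commit -q -m "chore: add config"

git rm -q secrets.env
git commit -q -m "chore: remove secrets file"

# The file is gone from the working tree
if [ -e secrets.env ]; then in_worktree=yes; else in_worktree=no; fi

# It is still retrievable from the earlier commit
leaked=$(git show HEAD~1:secrets.env)
echo "in working tree: $in_worktree"
echo "recovered from history: $leaked"
```

Anyone with a clone can run the same git show command, which is why rotating the credential comes first and history rewriting second.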
                        

Exercises

Exercise 1

Repository from Scratch

Create a new Git repository. Add a README.md, a .gitignore for your favourite language, and at least 3 commits with meaningful Conventional Commits messages. Use git log --oneline --graph to verify your history looks clean.


Exercise 2

Conflict Resolution Practice

Create two branches from main. On both branches, edit the same line of the same file differently. Merge one branch into main, then try to merge the second. Resolve the conflict manually. Verify the result with git log --oneline --graph --all.


Exercise 3

Rebase & Squash

Create a feature branch with 5 small "WIP" commits. Use interactive rebase (git rebase -i HEAD~5) to squash them into 2 clean commits with proper messages. Verify the result shows only 2 new commits in the log.


Exercise 4

Undo Everything

Practice all undo mechanisms: (1) Unstage a file with git restore --staged. (2) Discard working directory changes with git restore. (3) Undo the last commit with git reset --soft HEAD~1. (4) Revert a commit with git revert. Document what each command does to the three trees (working directory, staging area, repository).


Conclusion & Next Steps

Git is deceptively deep. The commands covered in this article (init, add, commit, branch, merge, rebase, and the various undo mechanisms) make up most developers' daily workflow. Mastering these fundamentals gives you confidence in any Git situation because you understand the underlying model: three trees, snapshots, pointers, and a directed acyclic graph.

Key takeaways:

Git stores snapshots, not diffs — This enables fast branching and offline operation
The staging area is your friend — It lets you craft precise, atomic commits
Branches are cheap — Create them liberally for every feature, bug fix, and experiment
Rebase for clean history, merge for truth — Use each where appropriate
Know your undo options — revert is safe for shared branches; reset is for local cleanup
Never rebase shared branches — The golden rule prevents team chaos
Never commit secrets — Use .gitignore, .env files, and secret managers

Next in the Series

In Part 10: Git Internals — The Object Model & DAG, we'll go beneath the surface to understand how Git actually works — blobs, trees, commits, SHA-1 hashing, packfiles, and the directed acyclic graph. Understanding the plumbing makes you a Git power user who never gets stuck.
