Back to Technology

Part 2: Digital Logic & CPU Building Blocks

January 31, 2026 Wasil Zafar 22 min read

Understand the fundamental building blocks of CPUs—from logic gates to ALU, registers, datapath, and microarchitecture design.

Table of Contents

  1. Introduction
  2. Logic Gates
  3. Combinational Circuits
  4. Sequential Circuits
  5. Datapath Design
  6. Microarchitecture
  7. Conclusion & Next Steps

Introduction

Digital logic forms the foundation of all computer hardware. Every operation a CPU performs—from simple addition to complex floating-point calculations—ultimately relies on combinations of basic logic gates.

Series Context: This is Part 2 of 24 in the Computer Architecture & Operating Systems Mastery series. Building on Part 1's system overview, we now dive into the hardware building blocks.

Computer Architecture & OS Mastery

Your 24-step learning path • Currently on Step 2
1
Part 1: Foundations of Computer Systems
System overview, architectures, OS role
2
Digital Logic & CPU Building Blocks
Gates, registers, datapath, microarchitecture
You Are Here
3
Instruction Set Architecture (ISA)
RISC vs CISC, instruction formats, addressing
4
Assembly Language & Machine Code
Registers, stack, calling conventions
5
Assemblers, Linkers & Loaders
Object files, ELF, dynamic linking
6
Compilers & Program Translation
Lexing, parsing, code generation
7
CPU Execution & Pipelining
Fetch-decode-execute, hazards, prediction
8
OS Architecture & Kernel Design
Monolithic, microkernel, system calls
9
Processes & Program Execution
Process lifecycle, PCB, fork/exec
10
Threads & Concurrency
Threading models, pthreads, race conditions
11
CPU Scheduling Algorithms
FCFS, RR, CFS, real-time scheduling
12
Synchronization & Coordination
Locks, semaphores, classic problems
13
Deadlocks & Prevention
Coffman conditions, Banker's algorithm
14
Memory Hierarchy & Cache
L1/L2/L3, cache coherence, NUMA
15
Memory Management Fundamentals
Address spaces, fragmentation, allocation
16
Virtual Memory & Paging
Page tables, TLB, demand paging
17
File Systems & Storage
Inodes, journaling, ext4, NTFS
18
I/O Systems & Device Drivers
Interrupts, DMA, disk scheduling
19
Multiprocessor Systems
SMP, NUMA, cache coherence
20
OS Security & Protection
Privilege levels, ASLR, sandboxing
21
Virtualization & Containers
Hypervisors, namespaces, cgroups
22
Advanced Kernel Internals
Linux subsystems, kernel debugging
23
Case Studies
Linux vs Windows vs macOS
24
Capstone Projects
Shell, thread pool, paging simulator

In this guide, we'll build up from the simplest components—logic gates—to complete CPU subsystems. By the end, you'll understand how billions of transistors combine to form the brain of your computer.

The Building Analogy

Real-World Analogy

Think of building a CPU like building with LEGOs:

  • Logic Gates — Individual LEGO bricks (the smallest pieces)
  • Combinational Circuits — Small assemblies (a wheel, a window)
  • Sequential Circuits — Assemblies with moving parts (a door that opens)
  • Datapath — Major sections (the frame of a car)
  • CPU — The complete model (a finished LEGO car)

Each level uses the pieces from the level below!

Logic Gates: The Fundamental Building Blocks

A logic gate is an electronic circuit that performs a basic logical operation. It takes one or more binary inputs (0 or 1) and produces a single binary output based on a specific rule.

Key Insight: Every computation your computer performs—from watching videos to running AI—ultimately reduces to billions of logic gates flipping between 0 and 1, billions of times per second!

The Three Basic Gates: AND, OR, NOT

All digital logic can be built from combinations of these three fundamental gates:

NOT Gate (Inverter)

The simplest gate—it flips the input. If input is 1, output is 0, and vice versa.

Symbol:        Truth Table:
           ─┬─┐          ┌───┬───┐
    A ──────│ ○────Q     │ A │ Q │
           ─┴─┘          ├───┼───┤
                         │ 0 │ 1 │
    Q = NOT A            │ 1 │ 0 │
    Q = Ā (A bar)        └───┴───┘
    Q = !A (programming)

AND Gate

Outputs 1 only if both inputs are 1. Think of it as: "Both must be true."

Symbol:        Truth Table:
           ─┬───┐        ┌───┬───┬───┐
    A ──────│   │        │ A │ B │ Q │
           ─┤   ├───Q    ├───┼───┼───┤
    B ──────│   │        │ 0 │ 0 │ 0 │
           ─┴───┘        │ 0 │ 1 │ 0 │
                         │ 1 │ 0 │ 0 │
    Q = A AND B          │ 1 │ 1 │ 1 │
    Q = A · B            └───┴───┴───┘
    Q = A && B (programming)

Real-World AND Gate

A car engine that starts only when:

  • Key is turned (Input A = 1) AND
  • Brake pedal is pressed (Input B = 1)

Both conditions must be true for the engine to start (Output = 1).

OR Gate

Outputs 1 if at least one input is 1. Think of it as: "Either (or both) can be true."

Symbol:        Truth Table:
           ─┬───╲        ┌───┬───┬───┐
    A ──────│    ╲       │ A │ B │ Q │
           ─┤    )──Q    ├───┼───┼───┤
    B ──────│    ╱       │ 0 │ 0 │ 0 │
           ─┴───╱        │ 0 │ 1 │ 1 │
                         │ 1 │ 0 │ 1 │
    Q = A OR B           │ 1 │ 1 │ 1 │
    Q = A + B            └───┴───┴───┘
    Q = A || B (programming)

Universal Gates: NAND and NOR

NAND (NOT-AND) and NOR (NOT-OR) are called universal gates because any logic function can be built using only NAND gates or only NOR gates!

NAND Gate

NAND = NOT(AND)

Truth Table:           Building other gates from NAND:
┌───┬───┬───┐
│ A │ B │ Q │          NOT from NAND:   A ──┬──[NAND]──Q
├───┼───┼───┤                              └───┘
│ 0 │ 0 │ 1 │          (Connect same input to both)
│ 0 │ 1 │ 1 │
│ 1 │ 0 │ 1 │          AND from NAND:   A ──┬──[NAND]──┬──[NAND]──Q
│ 1 │ 1 │ 0 │          B ──┘              └───┘
└───┴───┴───┘
Why NAND is King: In real chip manufacturing, NAND gates are the cheapest and fastest to produce. Most modern CPUs are built primarily from NAND gates, with other gates synthesized from them!

NOR Gate

NOR = NOT(OR)

Truth Table:
┌───┬───┬───┐
│ A │ B │ Q │
├───┼───┼───┤
│ 0 │ 0 │ 1 │   ← Only outputs 1 when BOTH inputs are 0
│ 0 │ 1 │ 0 │
│ 1 │ 0 │ 0 │
│ 1 │ 1 │ 0 │
└───┴───┴───┘

XOR and XNOR Gates

XOR (Exclusive OR)

Outputs 1 when inputs are different. Essential for arithmetic and comparison!

Truth Table:           Applications:
┌───┬───┬───┐
│ A │ B │ Q │          • Addition (carry detection)
├───┼───┼───┤          • Parity checking
│ 0 │ 0 │ 0 │          • Comparison (are inputs different?)
│ 0 │ 1 │ 1 │          • Encryption (XOR with key)
│ 1 │ 0 │ 1 │          • Toggle bits
│ 1 │ 1 │ 0 │
└───┴───┴───┘

Q = A ⊕ B    (XOR symbol)
Q = A ^ B    (programming)

XOR Does Binary Addition!

Notice how XOR matches single-bit addition (ignoring carry):

  0 + 0 = 0    (0 XOR 0 = 0)
  0 + 1 = 1    (0 XOR 1 = 1)
  1 + 0 = 1    (1 XOR 0 = 1)
  1 + 1 = 10   (1 XOR 1 = 0, with carry!)

This is why XOR is fundamental to building adders!

XNOR (Equivalence)

Outputs 1 when inputs are the same. The opposite of XOR.

Truth Table:
┌───┬───┬───┐
│ A │ B │ Q │   Used for:
├───┼───┼───┤   • Equality comparison
│ 0 │ 0 │ 1 │   • Pattern matching
│ 0 │ 1 │ 0 │   • Lock-step verification
│ 1 │ 0 │ 0 │
│ 1 │ 1 │ 1 │
└───┴───┴───┘

Combinational Circuits

Combinational circuits are circuits where the output depends only on the current inputs—they have no memory. The same inputs always produce the same outputs.

Multiplexers (MUX) & Demultiplexers

A multiplexer is like a data switch—it selects one of many inputs to pass through to a single output.

2:1 Multiplexer (MUX)

      I₀ ─────┐
              ├─────► Q
      I₁ ─────┘
              │
      S ──────┘ (Select line)

If S = 0, Q = I₀
If S = 1, Q = I₁

Boolean: Q = (I₀ · S̄) + (I₁ · S)

MUX Analogy: TV Input Selector

Your TV's input selector is a real-world multiplexer:

  • Inputs: HDMI 1, HDMI 2, USB, Antenna
  • Select: The input button on your remote
  • Output: What appears on screen

Only one input passes through at a time based on your selection!

4:1 Multiplexer

4:1 MUX (4 inputs, 2 select lines)

      I₀ ─────┐
      I₁ ─────┤
              ├─────► Q
      I₂ ─────┤
      I₃ ─────┘
              │
      S₁S₀ ───┘

Select  │ Output
────────┼───────
S₁S₀=00 │ Q = I₀
S₁S₀=01 │ Q = I₁
S₁S₀=10 │ Q = I₂
S₁S₀=11 │ Q = I₃
CPUs Use MUXes Everywhere! Inside a CPU, multiplexers route data between registers, select ALU operations, and control what goes on the data bus. A modern CPU has millions of MUXes!

Adders: The Heart of Arithmetic

Half Adder

Adds two single bits, producing a sum and carry:

Half Adder Circuit:
                              Truth Table:
    A ────┬────[XOR]─── Sum   ┌───┬───┬─────┬───────┐
          │                   │ A │ B │ Sum │ Carry │
          └───┬               ├───┼───┼─────┼───────┤
    B ────────┼──[AND]─ Carry │ 0 │ 0 │  0  │   0   │
          ────┘               │ 0 │ 1 │  1  │   0   │
                              │ 1 │ 0 │  1  │   0   │
    Sum = A ⊕ B               │ 1 │ 1 │  0  │   1   │
    Carry = A · B             └───┴───┴─────┴───────┘

Full Adder

Adds two bits plus a carry-in. Essential for multi-bit addition!

Full Adder (3 inputs: A, B, Cᵢₙ)

Truth Table:
┌───┬───┬─────┬─────┬───────┐
│ A │ B │ Cᵢₙ │ Sum │ Cₒᵤₜ  │
├───┼───┼─────┼─────┼───────┤
│ 0 │ 0 │  0  │  0  │   0   │
│ 0 │ 0 │  1  │  1  │   0   │
│ 0 │ 1 │  0  │  1  │   0   │
│ 0 │ 1 │  1  │  0  │   1   │
│ 1 │ 0 │  0  │  1  │   0   │
│ 1 │ 0 │  1  │  0  │   1   │
│ 1 │ 1 │  0  │  0  │   1   │
│ 1 │ 1 │  1  │  1  │   1   │
└───┴───┴─────┴─────┴───────┘

Sum  = A ⊕ B ⊕ Cᵢₙ
Cₒᵤₜ = (A · B) + (Cᵢₙ · (A ⊕ B))

Ripple-Carry Adder (4-bit)

Chain full adders to add multi-bit numbers:

    A₃B₃    A₂B₂    A₁B₁    A₀B₀
     │ │     │ │     │ │     │ │
     ▼ ▼     ▼ ▼     ▼ ▼     ▼ ▼
   ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐
Cₒᵤₜ│ FA │←│ FA │←│ FA │←│ FA │← 0 (initial carry)
   └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘
      │       │       │       │
      S₃      S₂      S₁      S₀

Example: 0101 + 0011 = 1000 (5 + 3 = 8)
The Ripple Problem: Carries must "ripple" through each adder—the last sum bit can't be computed until all previous carries are known. For a 64-bit adder, this creates significant delay. Modern CPUs use carry-lookahead adders to compute carries in parallel!

Arithmetic Logic Unit (ALU)

The ALU is the computational heart of the CPU. It performs arithmetic (add, subtract, multiply) and logic (AND, OR, XOR, compare) operations.

Simple 1-bit ALU:

           ┌─────────────────────────────────────┐
    A ─────┤                                     │
           │    ┌─────────┐                      │
    B ─────┤────│  AND    ├───┐                  │
           │    └─────────┘   │                  │
           │    ┌─────────┐   │    ┌──────┐      │
           ├────│   OR    ├───┼───►│ MUX  │──────┼──► Result
           │    └─────────┘   │    │ 4:1  │      │
           │    ┌─────────┐   │    └──┬───┘      │
           ├────│  Adder  ├───┘       │          │
           │    └─────────┘           │          │
           │    ┌─────────┐           │          │
           └────│   SLT   ├───────────┘          │
                └─────────┘                      │
                                                 │
    Operation ─────────────────────────────────►─┘
    (2-bit select)

    Operation │ Output
    ──────────┼────────
       00     │ A AND B
       01     │ A OR B
       10     │ A + B
       11     │ SLT (set less than)

Real ALU: MIPS 32-bit

Hardware Example

A MIPS CPU's ALU supports these operations:

ALU ControlOperation
0000AND
0001OR
0010ADD
0110SUBTRACT
0111Set Less Than
1100NOR

Sequential Circuits: Adding Memory

Sequential circuits have outputs that depend on both current inputs and past inputs (state). They can "remember" information—this is how computers store data!

Flip-Flops & Latches

SR Latch (Set-Reset)

The simplest memory element—it can store 1 bit:

SR Latch (using NOR gates):

    S ────[NOR]─┬─── Q
           │    │
           └────┼───┐
                │   │
    R ────[NOR]─┴───┴── Q̄

    S │ R │ Q (next)
    ──┼───┼──────────
    0 │ 0 │ No change (hold)
    0 │ 1 │ 0 (reset)
    1 │ 0 │ 1 (set)
    1 │ 1 │ Invalid! (forbidden)

D Flip-Flop (Data Flip-Flop)

The most common memory element in CPUs—stores whatever is on D when the clock rises:

D Flip-Flop:

         ┌────────┐
    D ───┤D     Q ├─── Q
         │       │
   CLK ─►│>      │
         │     Q̄ ├─── Q̄
         └────────┘

On rising edge of CLK:
    Q takes the value of D
    (Q = D)

Between clock edges:
    Q holds its value
The Clock is the CPU's Heartbeat: The clock signal synchronizes all flip-flops in the CPU. At each clock tick, data moves from one stage to the next. A 4 GHz CPU has 4 billion clock ticks per second!

Registers: Fast Storage Inside the CPU

A register is a group of flip-flops that store multiple bits:

4-bit Register (using D flip-flops):

            D₃    D₂    D₁    D₀  (4-bit input)
            │     │     │     │
            ▼     ▼     ▼     ▼
          ┌───┐ ┌───┐ ┌───┐ ┌───┐
          │ D │ │ D │ │ D │ │ D │
    CLK ─►│ F │►│ F │►│ F │►│ F │
          │ F │ │ F │ │ F │ │ F │
          └─┬─┘ └─┬─┘ └─┬─┘ └─┬─┘
            │     │     │     │
            Q₃    Q₂    Q₁    Q₀  (4-bit output)

On each clock edge: Q[3:0] = D[3:0]

CPU Registers in Practice

x86-64 Architecture

Modern x86-64 CPUs have these general-purpose registers:

64-bit registers: RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, R8-R15

Each 64-bit register can be accessed in parts:
┌─────────────────────────────────────────────────────────────┐
│                          RAX (64-bit)                        │
├─────────────────────────────┬───────────────────────────────┤
│         (high 32 bits)      │           EAX (32-bit)        │
│                             ├───────────────┬───────────────┤
│                             │               │   AX (16-bit) │
│                             │               ├───────┬───────┤
│                             │               │AH(8b) │AL(8b) │
└─────────────────────────────┴───────────────┴───────┴───────┘

A register file contains all these registers and allows reading 2 and writing 1 simultaneously!

Counters

A counter is a register that increments its value on each clock cycle:

4-bit Binary Counter:

Clock  │ Q₃Q₂Q₁Q₀ │ Decimal
───────┼──────────┼─────────
  0    │  0000    │    0
  1    │  0001    │    1
  2    │  0010    │    2
  3    │  0011    │    3
  4    │  0100    │    4
  ...  │   ...    │   ...
  15   │  1111    │   15
  16   │  0000    │    0  (wraps around)

Used in: Program Counter (PC), timer circuits, state machines

Datapath: The Data Highway

The datapath is the collection of functional units (ALUs, registers, memory) and the buses that connect them. It's the "data highway" inside the CPU.

Simplified MIPS Datapath:

┌──────────────────────────────────────────────────────────────────────┐
│                                                                       │
│    ┌─────┐    ┌──────────┐    ┌─────┐    ┌─────┐    ┌──────────┐    │
│    │     │    │ Register │    │     │    │     │    │   Data   │    │
│ ──►│ PC  │───►│   File   │───►│ ALU │───►│MUX  │───►│  Memory  │    │
│    │     │    │          │    │     │    │     │    │          │    │
│    └──┬──┘    └────┬─────┘    └──┬──┘    └─────┘    └────┬─────┘    │
│       │            │             │                        │          │
│       │            │             │                        │          │
│       ▼            │             │                        │          │
│  ┌─────────┐       │             │                        │          │
│  │ Instruc │       │             │                        │          │
│  │ Memory  │───────┴─────────────┴────────────────────────┘          │
│  │         │                                                          │
│  └─────────┘                                                          │
│                                                                       │
│   PC = Program Counter                                                │
│   ALU = Arithmetic Logic Unit                                         │
│   MUX = Multiplexer (data selector)                                   │
└──────────────────────────────────────────────────────────────────────┘

Datapath for R-type Instructions

For instructions like add $t0, $t1, $t2:

1. Fetch instruction from memory at address PC
2. Read registers $t1 and $t2 from register file
3. ALU adds the two values
4. Result written back to register $t0
5. PC incremented to next instruction

Clock Cycle:
─────┬─────────┬─────────┬─────────┬─────────┬─────
     │ FETCH   │ DECODE  │ EXECUTE │ WRITE   │
     │         │ READ    │  (ALU)  │ BACK    │
─────┴─────────┴─────────┴─────────┴─────────┴─────

Microarchitecture: Putting It All Together

Microarchitecture is the specific implementation of an ISA (Instruction Set Architecture). Different microarchitectures can execute the same instructions in different ways.

ISA vs Microarchitecture:
  • ISA = WHAT instructions the CPU understands (x86-64, ARM, RISC-V)
  • Microarchitecture = HOW the CPU executes those instructions

Intel Skylake and AMD Zen both run x86-64 (same ISA) but have different microarchitectures!

Single-Cycle vs Multi-Cycle vs Pipelined

Design Clock Cycles/Instruction Clock Speed Throughput
Single-Cycle 1 Slow (must fit longest instruction) Low
Multi-Cycle 3-5 (varies) Faster clock Medium
Pipelined 1 (after pipeline fills) Fast clock High
Pipelining: Overlapping Instruction Execution

Without pipelining (single-cycle):
Instr 1: ████████████████████
Instr 2:                      ████████████████████
Instr 3:                                            ████████████████████

With 5-stage pipelining:
Instr 1: IF─┬─ID─┬─EX─┬─MEM┬─WB
Instr 2:    IF─┬─ID─┬─EX─┬─MEM┬─WB
Instr 3:       IF─┬─ID─┬─EX─┬─MEM┬─WB
Instr 4:          IF─┬─ID─┬─EX─┬─MEM┬─WB
Instr 5:             IF─┬─ID─┬─EX─┬─MEM┬─WB

IF = Instruction Fetch, ID = Decode, EX = Execute, MEM = Memory, WB = Write Back

5 instructions complete in 9 cycles instead of 25!

Exercises

Practice Exercises

Hands-On
  1. Gate Practice: Build a 2:1 MUX using only NAND gates
  2. Truth Table: Create the truth table for: Q = (A AND B) XOR C
  3. Adder Design: Trace through a 4-bit ripple-carry adder computing 0110 + 0101
  4. ALU Operation: What ALU control signals would you use for A - B?
  5. Pipeline Analysis: If a 5-stage pipeline has a 1ns clock period, how long to execute 100 instructions?
# Simulate a simple ALU in Python
def simple_alu(a, b, operation):
    """
    Simple 4-operation ALU
    operation: 0=AND, 1=OR, 2=ADD, 3=SUB
    """
    if operation == 0:
        return a & b
    elif operation == 1:
        return a | b
    elif operation == 2:
        return a + b
    elif operation == 3:
        return a - b
    
# Test it
print(f"5 AND 3 = {simple_alu(5, 3, 0)}")  # 1
print(f"5 OR 3 = {simple_alu(5, 3, 1)}")   # 7
print(f"5 + 3 = {simple_alu(5, 3, 2)}")    # 8
print(f"5 - 3 = {simple_alu(5, 3, 3)}")    # 2

Conclusion & Key Takeaways

You now understand how transistors combine to form gates, gates combine to form circuits, and circuits combine to form a working CPU.

What You've Learned:
  • Logic Gates — AND, OR, NOT, NAND, NOR, XOR, XNOR and their truth tables
  • Universal Gates — NAND/NOR can build any other gate
  • Combinational Circuits — MUXes, half/full adders, ripple-carry adders, ALU
  • Sequential Circuits — Latches, flip-flops, registers, counters
  • Datapath — How components connect to form the CPU's data highway
  • Microarchitecture — Single-cycle vs pipelined execution

Next in the Series

Continue to Part 3: Instruction Set Architecture (ISA) to learn about RISC vs CISC, instruction formats, and addressing modes!