Introduction
The Instruction Set Architecture (ISA) defines the interface between software and hardware—it specifies what instructions a processor can execute and how those instructions are encoded.
Series Context: This is Part 3 of 24 in the Computer Architecture & Operating Systems Mastery series. Building on digital logic foundations, we now explore how processors understand and execute instructions.
An ISA is like a contract between hardware and software. It specifies what the processor can do, but not how it's implemented. This abstraction allows software to run on different microarchitectures that share the same ISA.
The ISA as a Contract
Key Concept
Think of the ISA like a restaurant menu:
- ISA (Menu) — Lists what dishes (instructions) are available
- Microarchitecture (Kitchen) — How dishes are actually prepared
- Software (Customer) — Orders from the menu without knowing kitchen details
Intel Core i9 and AMD Ryzen both implement x86-64 (same menu), but have very different "kitchens" inside!
What Does an ISA Define?
| Component | What It Specifies | Example |
| Instructions | Operations the CPU can perform | ADD, SUB, MOV, JMP, CALL |
| Registers | Number, size, and purpose of registers | x86-64: 16 general-purpose 64-bit registers |
| Data Types | Supported data formats | 8/16/32/64-bit integers, floats |
| Addressing Modes | How operands are accessed | Immediate, register, direct, indexed |
| Memory Model | How memory is organized and accessed | Byte-addressable, little-endian |
| I/O Model | How devices are accessed | Memory-mapped I/O vs port I/O |
RISC vs CISC
The two major ISA design philosophies are RISC (Reduced Instruction Set Computer) and CISC (Complex Instruction Set Computer). They represent fundamentally different approaches to instruction design.
RISC Philosophy
RISC follows the principle: "Keep instructions simple, let the compiler optimize."
RISC Core Principles:
- Simple, uniform instructions designed to execute in about one cycle each
- Fixed instruction length (typically 32 bits)
- Load-store architecture (only load/store access memory)
- Large register file (32+ general-purpose registers)
- Hardwired control unit (faster but less flexible)
RISC Example: ARM Assembly
Adding two numbers from memory:
; Load first number from memory into R1
LDR R1, [R0] ; R0 points to first number
; Load second number into R2
LDR R2, [R0, #4] ; Next memory location
; Add them
ADD R3, R1, R2 ; R3 = R1 + R2
; Store result back to memory
STR R3, [R0, #8] ; Store at third location
Four simple instructions, each does ONE thing!
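To make the load-store idea concrete, the listing above can be traced with a toy model in Python. This is only a sketch: the dictionary-based memory, the starting address 0x100, and the values 5 and 7 are assumptions made up for illustration, and register widths and overflow are ignored.

```python
# Toy load-store machine: only LDR/STR touch memory, ADD works on registers.
# The addresses and values used here are made up for illustration.

def run_load_store_example(memory, base):
    """Trace the LDR/LDR/ADD/STR sequence on a dict that maps address -> value."""
    regs = {"R0": base, "R1": 0, "R2": 0, "R3": 0}

    regs["R1"] = memory[regs["R0"]]         # LDR R1, [R0]
    regs["R2"] = memory[regs["R0"] + 4]     # LDR R2, [R0, #4]
    regs["R3"] = regs["R1"] + regs["R2"]    # ADD R3, R1, R2
    memory[regs["R0"] + 8] = regs["R3"]     # STR R3, [R0, #8]
    return regs


memory = {0x100: 5, 0x104: 7, 0x108: 0}
regs = run_load_store_example(memory, base=0x100)
print(regs["R3"], memory[0x108])  # prints "12 12"
```

Note that the ADD step never touches `memory`; in a load-store design that separation is the whole point.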
RISC Examples in the Real World
- ARM — Smartphones (iPhone, Android), Apple M1/M2/M3 chips, Raspberry Pi
- RISC-V — Open-source ISA, gaining traction in IoT and academia
- MIPS — Network routers, PlayStation 1/2, Nintendo 64
- PowerPC — PlayStation 3, Xbox 360, older Macs
CISC Philosophy
CISC follows the principle: "Build powerful instructions, reduce code size."
CISC Core Principles:
- Complex instructions that do multiple operations
- Variable instruction length (1-15 bytes for x86)
- Instructions can directly access memory
- Fewer registers (x86 originally had just 8)
- Microprogrammed control unit (flexible but potentially slower)
CISC Example: x86 Assembly
Same operation—adding two numbers:
; Single instruction does load + add + store!
ADD DWORD PTR [result], [number1] ; Not actually valid x86, but shows the idea
; Real x86 code:
MOV EAX, [number1] ; Load first number
ADD EAX, [number2] ; Add second directly from memory!
MOV [result], EAX ; Store result
Three instructions, but ADD reads from memory directly!
Modern Convergence
Today, the RISC/CISC distinction has blurred significantly:
Modern CPU Reality:
┌────────────────────────────────────────────────────────────────┐
│ Modern x86 CPU (Intel/AMD) │
│ │
│ ┌───────────────┐ ┌──────────────────────────────────┐ │
│ │ x86 ISA │ │ RISC-like Core │ │
│ │ (CISC) │ ──► │ • Fixed-length micro-ops │ │
│ │ Frontend │ │ • Out-of-order execution │ │
│ └───────────────┘ │ • Large register file │ │
│ │ • Superscalar pipeline │ │
│ Complex x86 └──────────────────────────────────┘ │
│ instructions get │
│ decoded into simple │
│ RISC-like micro-ops │
└────────────────────────────────────────────────────────────────┘
Result: x86 ISA compatibility + RISC-like internal performance!
Key Insight: Modern Intel and AMD processors are essentially RISC internally—they translate x86 CISC instructions into simple micro-operations (μops) that execute on a RISC-like core. This gives the best of both worlds: x86 compatibility plus high performance!
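The translation step can be pictured with a small Python sketch. It is purely illustrative: real decoders are far more involved, and the micro-op names (LOAD, STORE, MOVE) and the temporaries tmp0/tmp1 are invented here, not taken from any Intel or AMD documentation.

```python
# Hypothetical micro-op cracking. The micro-op names (LOAD, STORE, MOVE) and
# the temporaries tmp0/tmp1 are invented for this sketch.

def is_memory(operand):
    """Treat any bracketed operand such as "[number2]" as a memory reference."""
    return operand.startswith("[") and operand.endswith("]")


def crack(mnemonic, dest, src):
    """Split one two-operand instruction into simple (op, dest, src) micro-ops."""
    if mnemonic == "MOV":
        if is_memory(src):
            return [("LOAD", dest, src)]
        if is_memory(dest):
            return [("STORE", dest, src)]
        return [("MOVE", dest, src)]

    uops = []
    if is_memory(src):                       # memory source: load it first
        uops.append(("LOAD", "tmp0", src))
        src = "tmp0"
    if is_memory(dest):                      # memory destination: read-modify-write
        uops.append(("LOAD", "tmp1", dest))
        uops.append((mnemonic, "tmp1", src))
        uops.append(("STORE", dest, "tmp1"))
    else:
        uops.append((mnemonic, dest, src))
    return uops


for uop in crack("ADD", "EAX", "[number2]"):
    print(uop)
```

In this model `ADD EAX, [number2]` becomes a load followed by a register-only add, which is the same shape as the ARM sequence shown earlier.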
RISC vs CISC Comparison
| Aspect | RISC | CISC |
| Instruction Count | ~50-150 instructions | ~1000+ instructions |
| Instruction Length | Fixed (32 bits typical) | Variable (1-15 bytes) |
| Memory Access | Only via LOAD/STORE | Most instructions can access memory |
| Registers | 32+ general-purpose | 8-16 general-purpose |
| Cycles/Instruction | ~1 (design goal) | Variable (1-many) |
| Code Density | Lower (more instructions) | Higher (fewer, powerful instructions) |
| Compiler Complexity | More complex (must optimize) | Less complex |
| Power Efficiency | Generally better | Generally worse |
| Examples | ARM, RISC-V, MIPS | x86, x86-64 |
Instruction Formats
Instructions must be encoded into binary so the CPU can decode and execute them. The instruction format defines how bits are arranged to represent opcode, operands, and addressing information.
Fixed-Length Instructions (RISC)
RISC architectures typically use fixed-length instructions, making decoding simpler and faster.
MIPS 32-bit Instruction Formats:
R-Type (Register): Used for arithmetic/logic between registers
┌────────┬───────┬───────┬───────┬───────┬────────┐
│ opcode │ rs │ rt │ rd │ shamt │ funct │
│ 6 bits │ 5 bits│ 5 bits│ 5 bits│ 5 bits│ 6 bits │
└────────┴───────┴───────┴───────┴───────┴────────┘
0 src1 src2 dest shift function
Example: ADD $t0, $t1, $t2
opcode=0, rs=$t1, rt=$t2, rd=$t0, shamt=0, funct=32
I-Type (Immediate): Used for immediate values and loads/stores
┌────────┬───────┬───────┬──────────────────────┐
│ opcode │ rs │ rt │ immediate │
│ 6 bits │ 5 bits│ 5 bits│ 16 bits │
└────────┴───────┴───────┴──────────────────────┘
Example: ADDI $t0, $t1, 100
opcode=8, rs=$t1, rt=$t0, imm=100
J-Type (Jump): Used for jumps
┌────────┬─────────────────────────────────────┐
│ opcode │ target address │
│ 6 bits │ 26 bits │
└────────┴─────────────────────────────────────┘
Example: J 1000
opcode=2, target=1000
Why Fixed Length Matters: With fixed-length instructions, the CPU knows exactly where each instruction begins. It can fetch and decode multiple instructions in parallel without first figuring out their lengths!
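The three MIPS formats above can be packed with plain shifts and ORs. The sketch below reuses the field values from the examples; the helper names and the register-number constants are assumptions for illustration. The J-type helper treats `target` as the raw 26-bit field, as the example above does (a real assembler would store the byte address divided by 4).

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6) into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i_type(opcode, rs, rt, imm):
    """Pack opcode(6) rs(5) rt(5) immediate(16); the immediate is masked to 16 bits."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j_type(opcode, target):
    """Pack opcode(6) target(26); target is the raw field value."""
    return (opcode << 26) | (target & 0x3FFFFFF)


T0, T1, T2 = 8, 9, 10  # register numbers of $t0, $t1, $t2

print(f"0x{encode_r_type(T1, T2, T0, 0, 32):08X}")  # ADD $t0, $t1, $t2 -> 0x012A4020
print(f"0x{encode_i_type(8, T1, T0, 100):08X}")     # ADDI $t0, $t1, 100 -> 0x21280064
print(f"0x{encode_j_type(2, 1000):08X}")            # J with target field 1000 -> 0x080003E8
```

Every word is exactly 32 bits, so instruction N always starts at byte offset 4 × N.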
Variable-Length Instructions (CISC)
x86 uses variable-length instructions from 1 to 15 bytes, prioritizing code density over decode simplicity.
x86 Instruction Format (Simplified):
┌──────────┬────────┬─────────┬─────────┬──────────────┬─────────────┐
│ Prefixes │ Opcode │ ModR/M │ SIB │ Displacement │ Immediate │
│ 0-4 bytes│1-3 byte│ 0-1 byte│0-1 byte │ 0-4 bytes │ 0-4 bytes │
└──────────┴────────┴─────────┴─────────┴──────────────┴─────────────┘
Optional Required Optional Optional Optional Optional
Examples of x86 instruction lengths:
NOP ; 1 byte (0x90)
PUSH EAX ; 1 byte (0x50)
MOV EAX, EBX ; 2 bytes (0x89 0xD8)
ADD EAX, 1 ; 3 bytes (0x83 0xC0 0x01)
MOV EAX, [EBX+ECX*4+8] ; 4 bytes
MOV EAX, 0x12345678 ; 5 bytes (0xB8 + 4-byte immediate)
ADD QWORD PTR [RBX+RCX*8+0x12345678], 0x07654321 ; 12 bytes (REX.W + opcode + ModR/M + SIB + disp32 + imm32)
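The decoding cost of variable length shows up even in a toy model. The Python sketch below understands only the five short encodings from the examples above (NOP, PUSH r32, MOV r32, r32, ADD r32, imm8, MOV r32, imm32) and rejects everything else. It is an assumption-laden illustration, not a real x86 decoder: prefixes, memory operands, and SIB bytes are ignored.

```python
def instruction_length(code, offset):
    """Return the length of the instruction starting at `offset` (toy subset only)."""
    opcode = code[offset]
    if opcode == 0x90 or 0x50 <= opcode <= 0x57:    # NOP, PUSH r32
        return 1
    if 0xB8 <= opcode <= 0xBF:                       # MOV r32, imm32
        return 5
    if opcode in (0x89, 0x83):                       # MOV r/m32, r32 / ADD r/m32, imm8
        modrm = code[offset + 1]
        if modrm >> 6 != 0b11:
            raise ValueError("toy decoder handles register operands only")
        return 2 if opcode == 0x89 else 3            # 0x83 carries an extra imm8
    raise ValueError(f"unsupported opcode 0x{opcode:02X}")


def split_instructions(code):
    """Walk the byte stream; each start offset depends on the previous length."""
    offset, instructions = 0, []
    while offset < len(code):
        length = instruction_length(code, offset)
        instructions.append(code[offset:offset + length])
        offset += length
    return instructions


code = bytes.fromhex("90 50 89D8 83C001 B878563412")
for instruction in split_instructions(code):
    print(len(instruction), instruction.hex())
```

The loop cannot find where the third instruction starts until it has measured the first two. A fixed-length ISA avoids that dependency.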
The ModR/M Byte
x86 Deep Dive
The ModR/M byte specifies operand locations:
ModR/M byte: ┌───┬─────┬─────┐
│Mod│ Reg │ R/M │
│2b │ 3b │ 3b │
└───┴─────┴─────┘
Mod = 00: Memory operand, no displacement
Mod = 01: Memory + 8-bit displacement
Mod = 10: Memory + 32-bit displacement
Mod = 11: Register operand
Reg = Which register is used
R/M = Which register or memory formula
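The three fields can be pulled out of a ModR/M byte with two shifts and three masks. The sketch below decodes 0xD8, the second byte of `MOV EAX, EBX` (0x89 0xD8) from the length examples. The function name and the register table are assumptions for illustration, and the table covers only the eight 32-bit registers without REX extensions.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11    # top 2 bits: addressing form
    reg = (byte >> 3) & 0b111   # middle 3 bits: register operand
    rm = byte & 0b111           # low 3 bits: register or memory formula
    return mod, reg, rm


mod, reg, rm = decode_modrm(0xD8)
print(mod, REG32[reg], REG32[rm])  # prints "3 EBX EAX"
```

For opcode 0x89 the Reg field names the source and R/M names the destination, so Mod = 11 with Reg = EBX and R/M = EAX is exactly `MOV EAX, EBX`.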
Encoding Schemes
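ARM A64 (AArch64) 32-bit Fixed Format:
Data Processing (Register), add/subtract with shifted register:
┌───────┬───────┬───────┬────────┬────────┬───────┬────────┬────────┬────────┬────────┐
│ sf    │ op    │ S     │ 01011  │ shift  │ 0     │ Rm     │ imm6   │ Rn     │ Rd     │
│ 1 bit │ 1 bit │ 1 bit │ 5 bits │ 2 bits │ 1 bit │ 5 bits │ 6 bits │ 5 bits │ 5 bits │
└───────┴───────┴───────┴────────┴────────┴───────┴────────┴────────┴────────┴────────┘
Load/Store (register, unsigned immediate offset):
┌────────┬────────┬───────┬────────┬────────┬─────────┬────────┬────────┐
│ size   │ 111    │ V     │ 01     │ opc    │ imm12   │ Rn     │ Rt     │
│ 2 bits │ 3 bits │ 1 bit │ 2 bits │ 2 bits │ 12 bits │ 5 bits │ 5 bits │
└────────┴────────┴───────┴────────┴────────┴─────────┴────────┴────────┘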
Branch:
┌─────────┬─────────────────────────────────────┐
│ opcode │ 26-bit offset │
│ 6 bits │ │
└─────────┴─────────────────────────────────────┘
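As a small worked example of the branch format, the Python sketch below splits a 32-bit word into the 6-bit opcode and the 26-bit offset field. The function name is an assumption for illustration. The offset field counts 4-byte instructions and is signed, so it is sign-extended and multiplied by 4 to get a byte offset.

```python
def decode_a64_branch(word):
    """Return (opcode, byte_offset) for an A64 immediate-branch encoding."""
    opcode = (word >> 26) & 0x3F    # top 6 bits
    imm26 = word & 0x3FFFFFF        # low 26 bits
    if imm26 & (1 << 25):           # sign bit set: negative offset
        imm26 -= 1 << 26
    return opcode, imm26 * 4        # instructions are 4 bytes each


print(decode_a64_branch(0x14000002))  # prints "(5, 8)": B, two instructions forward
print(decode_a64_branch(0x17FFFFFF))  # prints "(5, -4)": B, one instruction back
```

Opcode 5 (binary 000101) is the unconditional branch B. Because every instruction is 4 bytes, the 26-bit field reaches ±128 MiB.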
Addressing Modes
Addressing modes specify how the CPU finds the operands for an instruction. Different modes provide flexibility for accessing data in registers, memory, or as constants.
Immediate Addressing
The operand value is encoded directly in the instruction itself.
Immediate Addressing:
MOV EAX, 42 ; Put the value 42 into EAX
; The 42 is stored IN the instruction itself
Instruction encoding: B8 2A 00 00 00
│ └──────────┘
│ └── 42 in little-endian (0x0000002A)
└── Opcode for MOV EAX, imm32
Advantages:
✓ Fast—no memory access needed for the operand
✓ Simple to decode
Disadvantages:
✗ Limited by instruction size (can't store large values)
✗ Value is fixed at compile time
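The encoding shown above can be reproduced in a few lines of Python. This is a minimal sketch; the helper name is an assumption for illustration, and it covers only the `MOV EAX, imm32` form (opcode 0xB8).

```python
import struct


def encode_mov_eax_imm32(value):
    """Build the 5-byte encoding: opcode 0xB8 followed by a little-endian imm32."""
    return bytes([0xB8]) + struct.pack("<I", value & 0xFFFFFFFF)


print(encode_mov_eax_imm32(42).hex(" "))  # prints "b8 2a 00 00 00"
```

The `"<I"` format string asks for an unsigned 32-bit integer in little-endian order, which is why 42 (0x0000002A) appears as `2a 00 00 00`.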
Register Addressing
The operand is in a CPU register.
Register Addressing:
ADD EAX, EBX ; EAX = EAX + EBX
; Both operands are in registers
CPU Operation:
1. Read value from EAX
2. Read value from EBX
3. Add them
4. Write result to EAX
Advantages:
✓ Fastest access (registers are on-chip)
✓ No memory access needed
Disadvantages:
✗ Limited registers available
✗ Must load data from memory first
Memory Addressing
x86 provides rich memory addressing modes:
Memory Addressing Modes in x86:
1. Direct (Absolute):
MOV EAX, [0x1000] ; Load from address 0x1000
2. Register Indirect:
MOV EAX, [EBX] ; Load from address in EBX
3. Base + Displacement:
MOV EAX, [EBX + 8] ; Load from (EBX + 8)
; Great for struct fields!
4. Base + Index:
MOV EAX, [EBX + ECX] ; Load from (EBX + ECX)
5. Base + Index + Displacement:
MOV EAX, [EBX + ECX + 8] ; Load from (EBX + ECX + 8)
6. Scaled Index:
MOV EAX, [EBX + ECX*4] ; Load from (EBX + ECX×4)
; Perfect for array access!
; EBX = array base, ECX = index
; ×4 because each int is 4 bytes
7. Full Form (SIB addressing):
MOV EAX, [EBX + ECX*4 + 100] ; base + index×scale + disp
Formula: Address = Base + (Index × Scale) + Displacement
Scale can be: 1, 2, 4, or 8
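The address formula translates directly into code. The sketch below assumes 32-bit addressing, so the result wraps at 2^32; the function and parameter names are chosen for illustration.

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    """Compute base + index * scale + disp, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF


# [EBX + ECX*4] with EBX = 0x1000 and ECX = 3
print(hex(effective_address(base=0x1000, index=3, scale=4)))            # prints "0x100c"
# [EBX + ECX*4 + 100] with the same register values
print(hex(effective_address(base=0x1000, index=3, scale=4, disp=100)))  # prints "0x1070"
```

Each of the seven modes listed above is this function with some arguments left at their defaults.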
Array Access in Action
Practical Example
Accessing array[i] where array is at 0x1000:
C code: int value = array[i];
// If array base is in EBX, index i is in ECX
// Each int is 4 bytes
MOV EAX, [EBX + ECX*4] ; EAX = array[i]
; Address = EBX + ECX × 4
Example with array at 0x1000, i = 3:
Address = 0x1000 + 3 × 4 = 0x100C
Memory: 0x1000 0x1004 0x1008 0x100C 0x1010
│ │ │ │ │
arr[0] arr[1] arr[2] arr[3] arr[4]
└── We read this!
x86 vs ARM
x86 and ARM are the two dominant ISAs today. Let's compare them in detail.
x86/x86-64 Architecture
x86-64 General Purpose Registers:
64-bit 32-bit 16-bit 8-bit high/low
┌─────────────────────────────────────────────┐
│ RAX │ EAX │ AX │ AH │ AL │ Accumulator
│ RBX │ EBX │ BX │ BH │ BL │ Base
│ RCX │ ECX │ CX │ CH │ CL │ Counter
│ RDX │ EDX │ DX │ DH │ DL │ Data
│ RSI │ ESI │ SI │ - │ SIL │ Source Index
│ RDI │ EDI │ DI │ - │ DIL │ Destination Index
│ RBP │ EBP │ BP │ - │ BPL │ Base Pointer
│ RSP │ ESP │ SP │ - │ SPL │ Stack Pointer
│ R8-R15 │ R8D-.. │ R8W-.. │ - │R8B-.. │ Extended (64-bit only)
└─────────────────────────────────────────────┘
Special registers:
RIP - Instruction Pointer
RFLAGS - Status flags (zero, carry, overflow, etc.)
Segment registers (legacy): CS, DS, ES, FS, GS, SS
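The names in each row of the register table are views onto the same 64 bits, not separate storage. A short Python sketch with an arbitrary example value makes the overlap visible; the function name is an assumption for illustration.

```python
def register_views(rax):
    """Return the sub-register views of a 64-bit RAX value."""
    rax &= 0xFFFFFFFFFFFFFFFF
    return {
        "RAX": rax,
        "EAX": rax & 0xFFFFFFFF,    # low 32 bits
        "AX": rax & 0xFFFF,         # low 16 bits
        "AH": (rax >> 8) & 0xFF,    # bits 8-15
        "AL": rax & 0xFF,           # low 8 bits
    }


for name, value in register_views(0x1122334455667788).items():
    print(f"{name:>3} = 0x{value:X}")
```

For this value the views are EAX = 0x55667788, AX = 0x7788, AH = 0x77, and AL = 0x88.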
x86 Legacy: x86-64 maintains backward compatibility reaching back to 1978! The same processor can run 16-bit code written for the original 8086, 32-bit code from the 386, and modern 64-bit code. This compatibility comes at a cost: complex instruction decoding and silicon spent on rarely used legacy features.
ARM Architecture
ARM64 (AArch64) Registers:
┌─────────────────────────────────────────────────┐
│ General Purpose: X0-X30 (64-bit) │
│ W0-W30 (32-bit lower half) │
│ │
│ X0-X7 : Arguments and return values │
│ X8 : Indirect result location │
│ X9-X15 : Temporary (caller-saved) │
│ X16-X17 : Intra-procedure-call scratch │
│ X18 : Platform register │
│ X19-X28 : Callee-saved registers │
│ X29 : Frame pointer (FP) │
│ X30 : Link register (LR) - return address │
│ │
│ Special: │
│ SP : Stack Pointer (separate register) │
│ PC : Program Counter (not directly usable)│
│ XZR/WZR : Zero register (reads as 0) │
└─────────────────────────────────────────────────┘
SIMD/FP Registers:
V0-V31: 128-bit vector registers
Can be accessed as: Bn (8-bit), Hn (16-bit), Sn (32-bit float),
Dn (64-bit double), Qn (128-bit quad)
Head-to-Head Comparison
| Feature | x86-64 | ARM64 |
| Design | CISC (with RISC-like internals) | RISC |
| Instruction Length | Variable (1-15 bytes) | Fixed (32 bits) |
| Registers | 16 GP + 16 SIMD | 31 GP + 32 SIMD |
| Endianness | Little-endian | Bi-endian (usually LE) |
| Memory Access | Many instructions can access memory | Only LOAD/STORE |
| Conditional Exec | Jumps, plus CMOVcc/SETcc | Branches, plus conditional select (CSEL, CSINC); no general predication |
| Power Efficiency | Lower | Higher |
| Market | Desktops, servers, laptops | Mobile, embedded, Apple Silicon |
Same Algorithm, Different ISAs:
// C code: int sum = a + b + c;
x86-64:
mov eax, [a]
add eax, [b] ; Can add from memory directly!
add eax, [c]
mov [sum], eax
ARM64:
ldr w0, [x1] ; Load a
ldr w1, [x2] ; Load b
ldr w2, [x3] ; Load c
add w0, w0, w1 ; a + b
add w0, w0, w2 ; + c
str w0, [x4] ; Store sum
x86: 4 instructions, with two of the three loads folded into ADD
ARM: 6 instructions, 3 explicit loads + 1 store
Apple Silicon: ARM Conquers Laptops
Case Study
In 2020, Apple proved ARM can compete with x86 for high-performance computing:
- M1/M2/M3 — ARM-based chips matching or exceeding Intel/AMD
- Rosetta 2 — Translates x86 apps to ARM with ~80% native performance
- Power Efficiency — MacBook Air with 18+ hour battery life
- Unified Memory — CPU and GPU share high-bandwidth memory
This shifted the narrative: ARM is no longer "just for phones"!
Exercises
Practice Exercises
Hands-On
- RISC vs CISC: Why might a RISC processor have better power efficiency?
- Instruction Decode: Given MIPS instruction 0x012A4020, decode it (hint: it's R-type, funct=32 is ADD)
- Addressing Mode: Write the x86-64 instruction that loads the value at address RBX + RCX*8 + 4, where RBX holds the array base and RCX holds i
- Code Density: Why does x86 generally have higher code density than ARM?
- Research: What is RISC-V and why is it gaining attention?
# Decode a MIPS R-type instruction
def decode_mips_r_type(instruction):
    """
    Decode a 32-bit MIPS R-type instruction
    Format: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)
    """
    # Shift each field down to bit 0, then mask it to its width
    opcode = (instruction >> 26) & 0x3F
    rs = (instruction >> 21) & 0x1F
    rt = (instruction >> 16) & 0x1F
    rd = (instruction >> 11) & 0x1F
    shamt = (instruction >> 6) & 0x1F
    funct = instruction & 0x3F

    # Function codes for common operations
    funct_names = {32: 'ADD', 34: 'SUB', 36: 'AND', 37: 'OR', 42: 'SLT'}

    print(f"Instruction: 0x{instruction:08X}")
    print(f"opcode={opcode}, rs=${rs}, rt=${rt}, rd=${rd}, shamt={shamt}, funct={funct}")
    if opcode == 0 and funct in funct_names:
        print(f"Decoded: {funct_names[funct]} ${rd}, ${rs}, ${rt}")


# Test with ADD $t0, $t1, $t2 (0x012A4020)
decode_mips_r_type(0x012A4020)
Conclusion & Key Takeaways
The ISA is the crucial interface between hardware and software. Understanding ISA design helps you write better low-level code and appreciate the trade-offs in processor design.
What You've Learned:
- ISA Role — The contract between software and hardware
- RISC vs CISC — Simple vs complex instruction philosophies
- Modern Reality — x86 is CISC outside, RISC inside
- Instruction Formats — Fixed (RISC) vs variable (CISC) length
- Addressing Modes — Immediate, register, and memory addressing
- x86 vs ARM — Trade-offs between compatibility and efficiency