Introduction: The Memory Problem
Phase 6 Goals: By the end of this phase, your kernel will have virtual memory via paging, a physical frame allocator, and a working heap with malloc() and free(). Your kernel can now dynamically allocate memory at runtime.
Phase 0: Orientation & Big Picture
OS fundamentals, kernel architectures, learning path
Phase 1: How a Computer Starts
BIOS/UEFI, boot sequence, dev environment
Phase 2: Real Mode - First Steps
Real mode, bootloader, BIOS interrupts
Phase 3: Entering Protected Mode
GDT, 32-bit mode, C code execution
Phase 4: Display, Input & Output
VGA text mode, keyboard handling
Phase 5: Interrupts & CPU Control
IDT, ISRs, PIC programming
Phase 6: Memory Management
Paging, virtual memory, heap allocator
You Are Here
Phase 7: Disk Access & Filesystems
Block devices, FAT, VFS layer
Phase 8: Processes & User Mode
Task switching, system calls, user space
Phase 9: ELF Loading & Executables
ELF format, program loading
Phase 10: Standard Library & Shell
C library, command-line shell
Phase 11: 64-Bit Long Mode
x86-64, 64-bit paging, modern architecture
Phase 12: Modern Booting with UEFI
UEFI boot services, memory maps
Phase 13: Graphics & GUI Systems
Framebuffer, windowing, drawing
Phase 14: Advanced Input & Timing
Mouse, high-precision timers
Phase 15: Hardware Discovery & Drivers
PCI, device drivers, NVMe
Phase 16: Performance & Optimization
Caching, scheduler tuning
Phase 17: Stability, Security & Finishing
Debugging, hardening, completion
Until now, your kernel has been using memory in a simple way: hardcoded addresses, no protection, and no dynamic allocation. This approach doesn't scale—what happens when you want to run multiple programs, each thinking they own all of memory?
THE MEMORY PROBLEM
═══════════════════════════════════════════════════════════════
Without Virtual Memory:
┌───────────────────────────────────────────────────────────────┐
│ 0x00000000 ┌─────────────────────┐ │
│ │ Your Kernel │ │
│ 0x00100000 ├─────────────────────┤ │
│ │ Program A │ ← Hardcoded to 0x100000 │
│ 0x00200000 ├─────────────────────┤ │
│ │ Program B │ ← Must avoid A's memory │
│ └─────────────────────┘ │
│ │
│ ❌ Programs must know physical addresses │
│ ❌ Programs can read/write each other's memory │
│ ❌ What if A wants more memory? Must relocate B! │
│ ❌ What if memory is fragmented? │
└───────────────────────────────────────────────────────────────┘
With Virtual Memory:
┌───────────────────────────────────────────────────────────────┐
│ Program A's View: Program B's View: │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ 0x00000000 │ │ 0x00000000 │ │
│ │ My Code │ │ My Code │ │
│ │ 0x00400000 │ │ 0x00400000 │ │
│ │ My Data │ │ My Data │ │
│ └─────────────────┘ └─────────────────┘ │
│ Both think they start Both are ISOLATED │
│ at address 0! from each other! │
│ │
│ ✅ Programs use virtual addresses (they don't know physical) │
│ ✅ Each program has its own isolated address space │
│ ✅ Memory can be allocated dynamically │
│ ✅ Physical memory can be scattered (no fragmentation issue) │
└───────────────────────────────────────────────────────────────┘
Key Insight: Virtual memory creates the illusion that each process has its own private memory space, even though physical RAM is shared. This is the foundation of memory protection and modern multitasking.
Why Virtual Memory?
BENEFITS OF VIRTUAL MEMORY
═══════════════════════════════════════════════════════════════
1. ISOLATION & PROTECTION
┌────────────────────────────────────────────────────────────┐
│ Process A │ Kernel │ Process B │
│ (User Mode) │ (Ring 0) │ (User Mode) │
│ ┌───┐ │ ┌───┐ │ ┌───┐ │
│ │ P │ │ │ K │ │ │ P │ │
│ │ A │ │ │ E │ │ │ B │ │
│ │ G │ │ │ R │ │ │ G │ │
│ │ E │ │ │ N │ │ │ E │ │
│ │ S │ │ │ E │ │ │ S │ │
│ └───┘ │ │ L │ │ └───┘ │
│ × │ └───┘ │ × │
│ Can't │ ✓ │ Can't │
│ access B │ Can access │ access A │
│ or Kernel │ everything│ or Kernel │
└────────────────────────────────────────────────────────────┘
2. MEMORY OVERCOMMIT
Physical RAM: 4 GB
But you can run:
- Chrome (2 GB virtual)
- VS Code (1 GB virtual)
- Your OS (1 GB virtual)
- Several other apps...
How? Most memory isn't actively used at once!
Demand paging loads only what's needed.
3. SIMPLER PROGRAM LOADING
Every program links to run at address 0x00400000
Virtual memory maps it to whatever physical RAM is free!
4. MEMORY-MAPPED FILES
File contents appear in memory without explicit read() calls
Changes automatically write back to disk
Memory Architecture Overview
Here's the complete memory management stack we'll build:
MEMORY MANAGEMENT LAYERS
═══════════════════════════════════════════════════════════════
Layer 4: User Applications
└── malloc(), free(), new, delete
↓ system calls
Layer 3: Kernel Heap (kmalloc/kfree)
└── Variable-sized allocations for kernel use
↓ uses
Layer 2: Virtual Memory Manager
└── Map virtual pages to physical frames
└── Handle page faults
└── Manage address spaces per process
↓ uses
Layer 1: Paging Unit (MMU Hardware)
└── Page Directory → Page Tables → Physical Frames
└── Address translation (virtual → physical)
└── Protection (R/W, User/Supervisor)
↓ allocates from
Layer 0: Physical Frame Allocator
└── Track which 4KB frames are free/used
└── Bitmap or linked list of free frames
↓ queries
Hardware: Physical RAM (detected via E820 or UEFI memory map)
═══════════════════════════════════════════════════════════════
In this phase, we build Layers 0, 1, 2, and 3!
Physical Memory Management
Before we can implement virtual memory, we need to track physical memory: which 4KB chunks (frames) are free and which are used.
Memory Map
The BIOS provides a memory map via INT 0x15, EAX=0xE820 (the "E820" map). This tells us which regions of physical memory are usable:
TYPICAL PC MEMORY MAP (E820)
═══════════════════════════════════════════════════════════════
Address Size Type Description
───────────────────────────────────────────────────────────────
0x00000000-0x0009FBFF ~640 KB Usable Conventional memory
0x0009FC00-0x0009FFFF ~1 KB Reserved EBDA (Extended BIOS Data)
0x000A0000-0x000BFFFF 128 KB Reserved VGA memory (0xB8000 text)
0x000C0000-0x000FFFFF 256 KB Reserved ROM area (BIOS, option ROMs)
0x00100000-0x???????? Varies Usable Extended memory (RAM!)
... ... ... ...
0xFEC00000-0xFECFFFFF 64 KB Reserved IOAPIC
0xFEE00000-0xFEEFFFFF 64 KB Reserved Local APIC
0xFFFF0000-0xFFFFFFFF 64 KB Reserved ROM (mapped)
E820 Type Codes:
1 = Usable RAM (we can use this!)
2 = Reserved (don't touch)
3 = ACPI reclaimable (can use after ACPI init)
4 = ACPI NVS (never touch)
5 = Bad memory
/* e820.h - BIOS memory map structures */
#include <stdint.h>
/* E820 memory region types */
#define E820_USABLE 1
#define E820_RESERVED 2
#define E820_ACPI_RECL 3
#define E820_ACPI_NVS 4
#define E820_BAD 5
/* E820 memory region descriptor */
typedef struct {
uint64_t base; /* Start address */
uint64_t length; /* Size in bytes */
uint32_t type; /* Region type */
uint32_t acpi_ext; /* ACPI 3.0 extended attributes */
} __attribute__((packed)) e820_entry_t;
/* Stored by bootloader at known location */
#define E820_MAP_ADDR 0x8000 /* Where bootloader stores the map */
#define E820_COUNT_ADDR 0x7FFE /* Number of entries */
/* Get memory map entries */
e820_entry_t* e820_get_map(void);
uint32_t e820_get_count(void);
uint64_t e820_total_memory(void);
Frame Allocator
A frame is a 4KB chunk of physical memory. The frame allocator tracks which frames are free. There are a few common approaches:
FRAME ALLOCATOR DESIGNS
═══════════════════════════════════════════════════════════════
BITMAP ALLOCATOR:
└── One bit per frame: 0 = free, 1 = used
└── Memory overhead: 4GB RAM / 4KB / 8 = 128 KB bitmap
└── Allocation: O(n) scan for free bit
└── Free: O(1) clear bit
Example: 128 MB RAM = 32,768 frames = 4,096 bytes bitmap
Bitmap Index: 0 1 2 3 4 5 6 7 ...
Frame Status: [ 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 ]
^ ^ ^ ^
used used used used
frame_first_free() returns frame 3 (0x3000)
STACK ALLOCATOR:
└── Push free frame addresses onto a stack
└── Pop to allocate, push to free
└── Allocation: O(1)
└── Free: O(1)
└── Memory overhead: 4 bytes per free frame
BUDDY ALLOCATOR:
└── Split/merge power-of-2 blocks
└── Better for large contiguous allocations
└── More complex, used by Linux
We'll use a bitmap allocator for simplicity. It's easy to understand and sufficient for a hobby OS. The bitmap approach clearly shows which regions are used.
/* pmm.c - Physical Memory Manager using bitmap */
#include <stdint.h>
#include <string.h>
#define PAGE_SIZE 4096
#define PAGES_PER_BYTE 8
/* Bitmap - each bit represents a 4KB frame */
static uint8_t* frame_bitmap;
static uint32_t total_frames;
static uint32_t used_frames;
static uint32_t bitmap_size;
/* Bitmap manipulation functions */
static inline void bitmap_set(uint32_t frame) {
frame_bitmap[frame / 8] |= (1 << (frame % 8));
}
static inline void bitmap_clear(uint32_t frame) {
frame_bitmap[frame / 8] &= ~(1 << (frame % 8));
}
static inline int bitmap_test(uint32_t frame) {
return frame_bitmap[frame / 8] & (1 << (frame % 8));
}
/**
 * Initialize physical memory manager
 * @param mem_size    Total physical memory in bytes
 * @param bitmap_addr Location to place bitmap (physical address)
 */
void pmm_init(uint32_t mem_size, uint32_t bitmap_addr) {
    total_frames = mem_size / PAGE_SIZE;
    bitmap_size = (total_frames + PAGES_PER_BYTE - 1) / PAGES_PER_BYTE; /* round up */
    frame_bitmap = (uint8_t*)bitmap_addr;
    /* Mark all frames as used initially */
    memset(frame_bitmap, 0xFF, bitmap_size);
    used_frames = total_frames;
}
/**
* Mark a region as free (from E820 map)
*/
void pmm_init_region(uint32_t base, uint32_t size) {
    uint32_t start_frame = base / PAGE_SIZE;
    uint32_t num_frames = size / PAGE_SIZE;
    for (uint32_t i = 0; i < num_frames; i++) {
        if (bitmap_test(start_frame + i)) { /* skip frames already free */
            bitmap_clear(start_frame + i);
            used_frames--;
        }
    }
    /* Always keep frame 0 reserved (null pointer detection) */
    if (!bitmap_test(0)) {
        bitmap_set(0);
        used_frames++;
    }
}
/**
* Mark a region as used
*/
void pmm_deinit_region(uint32_t base, uint32_t size) {
    uint32_t start_frame = base / PAGE_SIZE;
    uint32_t num_frames = size / PAGE_SIZE;
    for (uint32_t i = 0; i < num_frames; i++) {
        if (!bitmap_test(start_frame + i)) { /* skip frames already used */
            bitmap_set(start_frame + i);
            used_frames++;
        }
    }
}
/**
* Allocate a single physical frame
* @return Physical address of frame, or 0 if out of memory
*/
uint32_t pmm_alloc_frame(void) {
if (used_frames >= total_frames) {
return 0; /* Out of memory! */
}
/* Find first free frame */
for (uint32_t i = 0; i < bitmap_size; i++) {
if (frame_bitmap[i] != 0xFF) { /* At least one free bit */
for (uint32_t j = 0; j < 8; j++) {
if (!(frame_bitmap[i] & (1 << j))) {
uint32_t frame = i * 8 + j;
bitmap_set(frame);
used_frames++;
return frame * PAGE_SIZE;
}
}
}
}
return 0; /* Should never reach here if used_frames is accurate */
}
/**
* Free a physical frame
*/
void pmm_free_frame(uint32_t phys_addr) {
uint32_t frame = phys_addr / PAGE_SIZE;
if (!bitmap_test(frame)) {
return; /* Double free - frame already free */
}
bitmap_clear(frame);
used_frames--;
}
/**
* Get memory statistics
*/
uint32_t pmm_get_free_frames(void) {
return total_frames - used_frames;
}
uint32_t pmm_get_total_frames(void) {
return total_frames;
}
Initializing From E820 Map
Here's how to initialize the PMM from the BIOS memory map:
/**
* Initialize PMM from E820 memory map
*/
void pmm_init_from_e820(void) {
e820_entry_t* mmap = e820_get_map();
uint32_t count = e820_get_count();
    /* Find the highest usable physical address (clamped to 4 GB on 32-bit) */
    uint32_t total_mem = 0;
    for (uint32_t i = 0; i < count; i++) {
        if (mmap[i].type == E820_USABLE) {
            uint64_t end = mmap[i].base + mmap[i].length;
            if (end > 0xFFFFF000ULL) {
                end = 0xFFFFF000ULL; /* a 32-bit PMM can't track more */
            }
            if ((uint32_t)end > total_mem) {
                total_mem = (uint32_t)end;
            }
        }
    }
/* Place bitmap right after kernel */
extern uint32_t _kernel_end; /* From linker script */
uint32_t bitmap_addr = (uint32_t)&_kernel_end;
bitmap_addr = (bitmap_addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1); /* Align */
/* Initialize PMM (marks all as used) */
pmm_init(total_mem, bitmap_addr);
    /* Mark usable regions as free (ignore anything above 4 GB) */
    for (uint32_t i = 0; i < count; i++) {
        if (mmap[i].type == E820_USABLE && mmap[i].base < 0x100000000ULL) {
            uint64_t len = mmap[i].length;
            if (mmap[i].base + len > 0x100000000ULL) {
                len = 0x100000000ULL - mmap[i].base;
            }
            pmm_init_region((uint32_t)mmap[i].base, (uint32_t)len);
        }
    }
/* Mark kernel + bitmap as used */
extern uint32_t _kernel_start;
uint32_t kernel_size = bitmap_addr + bitmap_size - (uint32_t)&_kernel_start;
pmm_deinit_region((uint32_t)&_kernel_start, kernel_size);
kprintf("PMM: %d MB total, %d MB free\n",
total_mem / (1024*1024),
pmm_get_free_frames() * PAGE_SIZE / (1024*1024));
}
Real-World Note: Linux's Physical Memory Management
Linux uses a sophisticated buddy allocator for physical memory, which efficiently handles allocations of various sizes (not just single pages). For single-page allocations, it maintains per-CPU page pools for performance. The concepts here—tracking free frames with a data structure—are the same.
Linux Kernel
Paging
Paging is the hardware mechanism that translates virtual addresses to physical addresses. On x86, the CPU's Memory Management Unit (MMU) does this automatically using data structures called page tables.
Understanding Page Tables
x86 32-BIT PAGING (TWO-LEVEL)
═══════════════════════════════════════════════════════════════
Virtual Address (32 bits):
┌──────────┬──────────┬────────────┐
│ Dir Idx │ Table Idx│ Offset │
│ 10 bits │ 10 bits │ 12 bits │
└────┬─────┴────┬─────┴─────┬──────┘
│ │ │
▼ │ │
┌─────────────┐ │ │
│ Page │ │ │
│ Directory │ │ │ CR3 Register
│ (1024 PTEs)│◄────────────────────────────┐
│ │ │ │ │
├─────────────┤ │ │ │
│ Entry │─┼──────┐ │ │
│ [Dir Idx] │ │ │ │ │
└─────────────┘ │ ▼ │ │
│ ┌─────────────┐ │
│ │ Page │ │
│ │ Table │ │
│ │ (1024 PTEs) │ │
│ ├─────────────┤ │
└>│ Entry │ │
│ [Table Idx] │───┐ │
└─────────────┘ │ │
▼ │
┌─────────┐ │
│ 4KB │ │
│ Frame │ │
│(physical)│ │
└────┬────┘ │
│ │
+ Offset │
│ │
▼ │
Physical                 CPU loads the Page Directory's
Address!                 physical address from CR3
/* paging.h - Page table structures for x86 */
#include <stdint.h>
#define PAGE_SIZE 4096
#define PAGES_PER_TABLE 1024
#define TABLES_PER_DIR 1024
/* Page table entry flags */
#define PTE_PRESENT 0x001 /* Page is present in memory */
#define PTE_WRITABLE 0x002 /* Page is writable */
#define PTE_USER 0x004 /* Page accessible from user mode */
#define PTE_WRITETHROUGH 0x008 /* Write-through caching */
#define PTE_CACHEDISABLE 0x010 /* Cache disabled */
#define PTE_ACCESSED 0x020 /* Page has been accessed */
#define PTE_DIRTY 0x040 /* Page has been written to */
#define PTE_PAT 0x080 /* Page Attribute Table */
#define PTE_GLOBAL 0x100 /* Don't flush from TLB on CR3 change */
#define PTE_FRAME 0xFFFFF000 /* Frame address mask (bits 12-31) */
/* Page table entry - 4 bytes */
typedef uint32_t page_entry_t;
/* Page table - 4KB, holds 1024 entries */
typedef struct {
page_entry_t entries[PAGES_PER_TABLE];
} __attribute__((aligned(4096))) page_table_t;
/* Page directory - 4KB, holds 1024 table pointers */
typedef struct {
page_entry_t entries[TABLES_PER_DIR]; /* Physical addresses of tables */
page_table_t* tables[TABLES_PER_DIR]; /* Virtual addresses (for kernel use) */
uint32_t physical_addr; /* Physical address of this directory */
} __attribute__((aligned(4096))) page_directory_t;
/* Current page directory */
extern page_directory_t* current_directory;
/* Function prototypes */
void paging_init(void);
void paging_switch_directory(page_directory_t* dir);
page_directory_t* paging_get_directory(void);
void paging_map_page(uint32_t virtual, uint32_t physical, uint32_t flags);
void paging_unmap_page(uint32_t virtual);
uint32_t paging_get_physical(uint32_t virtual);
Page Directory Entry Format
PAGE DIRECTORY/TABLE ENTRY FORMAT
═══════════════════════════════════════════════════════════════
31 12 11 9 8 7 6 5 4 3 2 1 0
┌──────────────┬──────┬─┬─┬─┬─┬───┬───┬───┬───┬───┐
│ Frame/PT │ Avail│G│S│0│A│PCD│PWT│U/S│R/W│ P │
│ Address │ │ │ │ │ │ │ │ │ │ │
└──────────────┴──────┴─┴─┴─┴─┴───┴───┴───┴───┴───┘
Bit 0 (P): Present - entry is valid
Bit 1 (R/W): Read/Write - 0=read-only, 1=read-write
Bit 2 (U/S): User/Supervisor - 0=kernel only, 1=user accessible
Bit 3 (PWT): Page Write-Through
Bit 4 (PCD): Page Cache Disable
Bit 5 (A): Accessed - set by CPU when page is read
Bit 6 (D): Dirty - set by CPU when page is written (PTE only)
Bit 7 (S): Page Size - 1=4MB page (PDE only), 0=4KB
Bit 8 (G): Global - don't invalidate TLB entry on CR3 switch
Bits 9-11: Available for OS use
Bits 12-31: Frame address (4KB aligned, so bits 0-11 are zero)
Address Translation
When the CPU accesses a virtual address, here's exactly what happens:
ADDRESS TRANSLATION WALKTHROUGH
═══════════════════════════════════════════════════════════════
Example: Virtual address 0x00401234
Step 1: Split address into components
0x00401234 = 0000 0000 0100 0000 0001 0010 0011 0100
Directory Index: 0000 0000 01 = 1
Table Index: 00 0000 0001 = 1
Offset: 0010 0011 0100 = 0x234
Step 2: Read Page Directory entry [1]
CR3 → Page Directory physical address
PDE = PageDir[1]
If PDE.Present == 0 → PAGE FAULT!
Step 3: Read Page Table entry [1]
PTE = PageTable[1] (PageTable address from PDE)
If PTE.Present == 0 → PAGE FAULT!
Step 4: Calculate physical address
Physical = (PTE.Frame << 12) | Offset
Physical = (PTE.Frame << 12) | 0x234
Example with real values:
CR3 = 0x00100000 (Page Directory at 1MB)
PDE[1] = 0x00101003 (PT at 0x00101000, Present=1, RW=1)
PTE[1] = 0x00500003 (Frame at 0x00500000, Present=1, RW=1)
Physical = 0x00500000 | 0x234 = 0x00500234
TLB - Translation Lookaside Buffer: The CPU caches recent translations in the TLB. Without caching, every memory access would require 2 additional memory reads (directory + table). When you change page tables, you must invalidate the TLB with invlpg instruction or reload CR3.
Implementing Paging
/* paging.c - Paging implementation */
#include "paging.h"
#include "pmm.h"
#include <string.h>
/* Current active page directory */
page_directory_t* current_directory = NULL;
page_directory_t* kernel_directory = NULL;
/**
* Map a virtual address to a physical address
*/
void paging_map_page(uint32_t virtual, uint32_t physical, uint32_t flags) {
/* Calculate indices */
uint32_t dir_idx = virtual >> 22; /* Top 10 bits */
uint32_t table_idx = (virtual >> 12) & 0x3FF; /* Middle 10 bits */
/* Get or create page table */
if (!(current_directory->entries[dir_idx] & PTE_PRESENT)) {
/* Allocate a new page table */
uint32_t pt_phys = pmm_alloc_frame();
if (!pt_phys) {
/* Out of memory! */
return;
}
/* Map to kernel space for access */
page_table_t* pt = (page_table_t*)pt_phys; /* Identity mapped initially */
memset(pt, 0, sizeof(page_table_t));
/* Install in directory */
current_directory->entries[dir_idx] = pt_phys | PTE_PRESENT | PTE_WRITABLE | flags;
current_directory->tables[dir_idx] = pt;
}
/* Get page table */
page_table_t* table = current_directory->tables[dir_idx];
/* Set the page table entry */
table->entries[table_idx] = (physical & PTE_FRAME) | PTE_PRESENT | flags;
/* Invalidate TLB for this page */
__asm__ volatile("invlpg (%0)" : : "r"(virtual) : "memory");
}
/**
* Unmap a virtual address
*/
void paging_unmap_page(uint32_t virtual) {
uint32_t dir_idx = virtual >> 22;
uint32_t table_idx = (virtual >> 12) & 0x3FF;
if (!(current_directory->entries[dir_idx] & PTE_PRESENT)) {
return; /* Not mapped */
}
page_table_t* table = current_directory->tables[dir_idx];
table->entries[table_idx] = 0;
/* Invalidate TLB */
__asm__ volatile("invlpg (%0)" : : "r"(virtual) : "memory");
}
/**
* Enable paging with the given page directory
*/
void paging_enable(page_directory_t* dir) {
current_directory = dir;
/* Load page directory address into CR3 */
__asm__ volatile("mov %0, %%cr3" : : "r"(dir->physical_addr));
/* Enable paging (set PG bit in CR0) */
uint32_t cr0;
__asm__ volatile("mov %%cr0, %0" : "=r"(cr0));
cr0 |= 0x80000000;
__asm__ volatile("mov %0, %%cr0" : : "r"(cr0));
}
/**
* Initialize paging for the kernel
*/
void paging_init(void) {
    /* NOTE: page_directory_t is bigger than one 4KB frame (the hardware
     * entries[] array plus the kernel's tables[] mirror), so a single
     * pmm_alloc_frame() is not enough.  On a fresh bitmap allocator,
     * consecutive calls return ascending, contiguous frames, which
     * this early-boot code relies on. */
    uint32_t pd_frames = (sizeof(page_directory_t) + PAGE_SIZE - 1) / PAGE_SIZE;
    uint32_t pd_phys = pmm_alloc_frame();
    for (uint32_t i = 1; i < pd_frames; i++) {
        pmm_alloc_frame();
    }
    kernel_directory = (page_directory_t*)pd_phys;
    memset(kernel_directory, 0, sizeof(page_directory_t));
    /* CR3 only needs the entries[] array, which sits first in the struct */
    kernel_directory->physical_addr = pd_phys;
    current_directory = kernel_directory;
    /* Identity map first 4MB (kernel + essential hardware) */
    for (uint32_t addr = 0; addr < 0x400000; addr += PAGE_SIZE) {
        paging_map_page(addr, addr, PTE_WRITABLE);
    }
    /* Enable paging */
    paging_enable(kernel_directory);
    kprintf("Paging enabled!\n");
}
Critical: Identity Mapping
When you first enable paging, execution simply continues at the next instruction: EIP still holds the address the kernel has been running at, which until now was interpreted as a physical address. If your page tables don't map that address to itself (identity mapping), the CPU immediately page faults trying to fetch the next instruction!
Always identity-map the kernel and the enable-paging code before setting the PG bit.
Common Bug
Virtual Memory
With paging enabled, we can now create virtual address spaces for different purposes and eventually for different processes.
Address Space Layout
TYPICAL 32-BIT ADDRESS SPACE LAYOUT
═══════════════════════════════════════════════════════════════
0xFFFFFFFF ┌─────────────────────────────────┐
│ │
│ Kernel Space │
│ (Shared by all │
│ processes) │
│ │
0xC0000000 ├─────────────────────────────────┤ ← 3GB boundary
│ │
│ User Stack │
│ (grows down) │
│ ↓ │
│ [unmapped] │
│ ↑ │
│ User Heap │
│ (grows up) │
│ │
│ User Data (.data,.bss) │
│ User Code (.text) │
│ │
0x08048000 ├─────────────────────────────────┤ ← Typical .text start
│ [unmapped - catches │
│ NULL pointer derefs] │
0x00000000 └─────────────────────────────────┘
Linux "3/1 split": 3 GB user, 1 GB kernel
Some OSes use 2/2 split (Windows 32-bit default)
/* vmm.h - Virtual Memory Manager */
#define KERNEL_VIRTUAL_BASE 0xC0000000 /* 3GB mark */
#define USER_STACK_TOP 0xBFFFF000 /* Just below kernel */
#define USER_HEAP_START 0x20000000 /* 512MB mark */
/**
* Convert physical address to kernel virtual address
* (Only works for identity-mapped or higher-half kernel)
*/
#define PHYS_TO_VIRT(addr) ((uint32_t)(addr) + KERNEL_VIRTUAL_BASE)
#define VIRT_TO_PHYS(addr) ((uint32_t)(addr) - KERNEL_VIRTUAL_BASE)
/**
* Create a new address space (for a new process)
*/
page_directory_t* vmm_create_address_space(void);
/**
* Destroy an address space (on process exit)
*/
void vmm_destroy_address_space(page_directory_t* dir);
/**
 * Map a range of virtual memory for user space
 */
int vmm_map_user_pages(page_directory_t* dir, uint32_t vaddr,
                       uint32_t size, uint32_t flags);
/**
 * Check whether an address lies in a valid region of the current
 * address space (used by the page fault handler for demand paging)
 */
int vmm_is_valid_address(uint32_t vaddr);
Handling Page Faults
A page fault occurs when the CPU tries to access a page that isn't properly mapped. Our page fault handler (ISR 14) must decide what to do:
/* page_fault.c - Page fault handler */
#include "paging.h"
#include "pmm.h"
#include "registers.h"
/**
* Page fault handler (ISR 14)
*
* Error code bits:
* Bit 0 (P): 0 = not present, 1 = protection violation
* Bit 1 (W): 0 = read, 1 = write
* Bit 2 (U): 0 = supervisor, 1 = user mode
* Bit 3 (R): 1 = reserved bit violation
* Bit 4 (I): 1 = instruction fetch
*/
void page_fault_handler(registers_t* regs) {
/* Get faulting address from CR2 */
uint32_t faulting_address;
__asm__ volatile("mov %%cr2, %0" : "=r"(faulting_address));
/* Decode error code */
int present = regs->err_code & 0x1; /* Page was present */
int write = regs->err_code & 0x2; /* Write operation */
int user = regs->err_code & 0x4; /* User mode */
int reserved = regs->err_code & 0x8; /* Reserved bit set in PTE */
/* Demand paging: If page not present, allocate it */
if (!present) {
/* Check if this is a valid address for this process */
if (vmm_is_valid_address(faulting_address)) {
/* Allocate a frame and map it */
uint32_t frame = pmm_alloc_frame();
if (frame) {
uint32_t flags = PTE_PRESENT | PTE_WRITABLE;
if (user) flags |= PTE_USER;
paging_map_page(faulting_address & ~0xFFF, frame, flags);
return; /* Resume execution */
}
}
}
/* Unrecoverable page fault */
kprintf("\n=== PAGE FAULT ===\n");
kprintf("Address: 0x%08X\n", faulting_address);
kprintf("Error: %s %s %s%s\n",
present ? "protection" : "not-present",
write ? "write" : "read",
user ? "user" : "kernel",
reserved ? " reserved-bit" : "");
kprintf("EIP: 0x%08X\n", regs->eip);
/* Kernel page fault is fatal */
if (!user) {
kprintf("KERNEL PANIC: Page fault in kernel mode!\n");
for (;;) __asm__ volatile("hlt");
}
/* User page fault: terminate process (or send SIGSEGV) */
kprintf("Terminating process due to page fault.\n");
/* TODO: process_exit(-1); */
}
Demand Paging
Demand paging means we don't actually allocate physical memory until it's accessed. This saves memory—a process can have a large virtual address space but only use memory for pages it touches.
DEMAND PAGING FLOW
═══════════════════════════════════════════════════════════════
1. Process allocated 1 MB heap (virtual)
┌──────────────────────────┐
│ Page tables created │
│ but PTE.Present = 0 │
│ (no physical frames │
│ allocated yet!) │
└──────────────────────────┘
2. Process writes to address in heap
┌──────────────────────────┐
│ CPU checks PTE │
│ Present = 0 │
│ → PAGE FAULT! │
└──────────────────────────┘
3. Page fault handler allocates frame
┌──────────────────────────┐
│ pmm_alloc_frame() │
│ Map virtual → physical │
│ Set PTE.Present = 1 │
│ Return to instruction │
└──────────────────────────┘
4. Instruction retries and succeeds!
┌──────────────────────────┐
│ Write completes │
│ Process continues │
└──────────────────────────┘
This is how processes can have huge virtual
address spaces without using all RAM!
Copy-on-Write (COW): When fork() creates a new process, both parent and child initially share the same physical pages (marked read-only). When either writes, a page fault triggers copying that page. This makes fork() fast and memory-efficient.
Heap Allocator (kmalloc/kfree)
With paging working, we can finally implement dynamic memory allocation: the kernel equivalent of malloc() and free().
Heap Design
HEAP ALLOCATOR DESIGN
═══════════════════════════════════════════════════════════════
The heap is a region of virtual memory that grows as needed.
We'll use a LINKED LIST of blocks (simple but slow for large heaps).
BLOCK HEADER STRUCTURE:
┌─────────────────────────────────────────────────────────────┐
│ Header │ User Data │
├───────────┼─────────────────────────────────────────────────┤
│ size: 4B │ │
│ free: 1B │ size bytes of usable memory │
│ next: 4B │ │
│ (9 bytes) │ │
└───────────┴─────────────────────────────────────────────────┘
HEAP LAYOUT:
heap_start heap_end
↓ ↓
┌────────┬──────────┬────────┬──────────┬────────┬─────────┐
│ Header │ Used │ Header │ Free │ Header │ Used │
│ (9B) │ (128B) │ (9B) │ (256B) │ (9B) │ (64B) │
└────────┴──────────┴────────┴──────────┴────────┴─────────┘
└─────────────────→└─────────────────→└─────────→ NULL
ALLOCATION (First-Fit):
1. Walk the linked list
2. Find first FREE block with size >= requested
3. If much larger, split into two blocks
4. Mark as used, return pointer after header
DEALLOCATION:
1. Get block header (pointer - sizeof(header))
2. Mark as free
3. Coalesce with adjacent free blocks
kmalloc Implementation
/* heap.c - Kernel heap allocator */
#include <stdint.h>
#include <string.h>
#include "paging.h"
#include "pmm.h"
/* Heap block header.
 * Packed to 9 bytes to match the diagram above; note this leaves user
 * pointers (header + 9) unaligned.  x86 tolerates unaligned access, but
 * dropping the packed attribute (12-byte header) would keep every
 * allocation 4-byte aligned. */
typedef struct heap_block {
    uint32_t size;           /* Size of user data (not including header) */
    uint8_t is_free;         /* 1 if free, 0 if in use */
    struct heap_block* next; /* Next block in list */
} __attribute__((packed)) heap_block_t;
#define HEAP_HEADER_SIZE sizeof(heap_block_t)
#define HEAP_MIN_SPLIT 64 /* Don't split if remainder < this */
/* Heap boundaries */
static heap_block_t* heap_start = NULL;
static heap_block_t* heap_end = NULL;
static uint32_t heap_current = 0;
static uint32_t heap_max = 0;
/**
* Initialize kernel heap
*/
void heap_init(uint32_t start, uint32_t initial_size, uint32_t max_size) {
    heap_current = start + initial_size; /* end of the mapped region */
    heap_max = start + max_size;
    /* Map initial pages */
    for (uint32_t addr = start; addr < start + initial_size; addr += PAGE_SIZE) {
        uint32_t frame = pmm_alloc_frame();
        if (!frame) {
            kprintf("heap_init: out of physical memory!\n");
            return;
        }
        paging_map_page(addr, frame, PTE_WRITABLE);
    }
    /* Create initial free block */
    heap_start = (heap_block_t*)start;
    heap_start->size = initial_size - HEAP_HEADER_SIZE;
    heap_start->is_free = 1;
    heap_start->next = NULL;
    heap_end = heap_start;
    kprintf("Heap initialized: %d KB at 0x%08X\n", initial_size/1024, start);
}
/**
* Expand heap by mapping more pages
*/
static int heap_expand(uint32_t size) {
    /* Round up to whole pages so heap_current stays page-aligned */
    size = (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    uint32_t new_end = heap_current + size;
    if (new_end > heap_max) {
        return 0; /* Can't expand past max */
    }
    /* Map new pages */
    for (uint32_t addr = heap_current; addr < new_end; addr += PAGE_SIZE) {
        uint32_t frame = pmm_alloc_frame();
        if (!frame) return 0; /* Out of physical memory */
        paging_map_page(addr, frame, PTE_WRITABLE);
    }
    heap_current = new_end;
    return 1;
}
/**
* Allocate memory from kernel heap
*/
void* kmalloc(uint32_t size) {
if (size == 0) return NULL;
/* Align to 4 bytes */
size = (size + 3) & ~3;
/* First-fit: find a free block */
heap_block_t* current = heap_start;
while (current) {
if (current->is_free && current->size >= size) {
/* Found a suitable block */
/* Split if significantly larger */
if (current->size > size + HEAP_HEADER_SIZE + HEAP_MIN_SPLIT) {
heap_block_t* new_block = (heap_block_t*)
((uint8_t*)current + HEAP_HEADER_SIZE + size);
new_block->size = current->size - size - HEAP_HEADER_SIZE;
new_block->is_free = 1;
new_block->next = current->next;
current->size = size;
current->next = new_block;
if (current == heap_end) {
heap_end = new_block;
}
}
current->is_free = 0;
return (void*)((uint8_t*)current + HEAP_HEADER_SIZE);
}
current = current->next;
}
    /* No suitable block found - expand heap by whole pages */
    uint32_t expand_size = (size + HEAP_HEADER_SIZE + PAGE_SIZE - 1)
                           & ~(PAGE_SIZE - 1);
if (!heap_expand(expand_size)) {
return NULL; /* Out of memory */
}
/* Add new block at end of heap */
heap_block_t* new_block;
if (heap_end->is_free) {
/* Extend existing free block */
heap_end->size += expand_size;
new_block = heap_end;
} else {
/* Create new block */
new_block = (heap_block_t*)
((uint8_t*)heap_end + HEAP_HEADER_SIZE + heap_end->size);
new_block->size = expand_size - HEAP_HEADER_SIZE;
new_block->is_free = 1;
new_block->next = NULL;
heap_end->next = new_block;
heap_end = new_block;
}
/* Recursive call to allocate from the new block */
return kmalloc(size);
}
kfree Implementation
/**
* Coalesce adjacent free blocks
*/
static void heap_coalesce(void) {
heap_block_t* current = heap_start;
while (current && current->next) {
if (current->is_free && current->next->is_free) {
/* Merge with next block */
current->size += HEAP_HEADER_SIZE + current->next->size;
current->next = current->next->next;
if (!current->next) {
heap_end = current;
}
/* Don't advance - check for more merges */
} else {
current = current->next;
}
}
}
/**
* Free memory back to kernel heap
*/
void kfree(void* ptr) {
if (!ptr) return;
/* Get block header */
heap_block_t* block = (heap_block_t*)
((uint8_t*)ptr - HEAP_HEADER_SIZE);
    /* Validate pointer is within heap */
    if ((uint32_t)block < (uint32_t)heap_start ||
        (uint32_t)block > heap_current) {
        kprintf("WARNING: kfree() on invalid pointer 0x%08X\n", (uint32_t)ptr);
        return;
    }
    /* Check for double-free */
    if (block->is_free) {
        kprintf("WARNING: Double free at 0x%08X\n", (uint32_t)ptr);
        return;
    }
/* Mark as free */
block->is_free = 1;
/* Coalesce adjacent free blocks */
heap_coalesce();
}
/**
* Allocate and zero memory
*/
void* kcalloc(uint32_t count, uint32_t size) {
    uint32_t total = count * size;
    /* Guard against multiplication overflow */
    if (size != 0 && total / size != count) {
        return NULL;
    }
    void* ptr = kmalloc(total);
    if (ptr) {
        memset(ptr, 0, total);
    }
    return ptr;
}
/**
* Reallocate memory
*/
void* krealloc(void* ptr, uint32_t new_size) {
if (!ptr) return kmalloc(new_size);
if (new_size == 0) {
kfree(ptr);
return NULL;
}
heap_block_t* block = (heap_block_t*)
((uint8_t*)ptr - HEAP_HEADER_SIZE);
if (block->size >= new_size) {
return ptr; /* Current block is big enough */
}
/* Allocate new, copy, free old */
void* new_ptr = kmalloc(new_size);
if (new_ptr) {
memcpy(new_ptr, ptr, block->size);
kfree(ptr);
}
return new_ptr;
}
Fragmentation
FRAGMENTATION PROBLEM
═══════════════════════════════════════════════════════════════
INTERNAL FRAGMENTATION:
Allocated more than needed due to alignment/minimum size
Request: 50 bytes
Actual: 64 bytes (aligned to 16)
Wasted: 14 bytes (internal fragmentation)
EXTERNAL FRAGMENTATION:
Free memory exists but is non-contiguous
┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┐
│ USED │ free │ USED │ free │ USED │ free │ USED │ free │
│ 64B │ 32B │ 64B │ 32B │ 64B │ 32B │ 64B │ 32B │
└──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┘
Total free: 128 bytes
But can't allocate 128-byte block! (only 32-byte holes)
SOLUTIONS:
1. Coalescing: Merge adjacent free blocks (we do this)
2. Compaction: Move blocks to create contiguous space (expensive)
3. Best-fit: Choose smallest adequate block (reduces waste)
4. Slab allocator: Pre-sized pools for common sizes (Linux)
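Our kmalloc() uses first-fit; switching to best-fit is just a different search loop. Here is a minimal, self-contained sketch of that search (the `heap_block_t` layout mirrors the header used earlier in this phase, and `find_best_fit` is an illustrative helper, not part of the code above):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the heap block header used in this phase (layout assumed) */
typedef struct heap_block {
    uint32_t size;
    uint8_t  is_free;
    struct heap_block* next;
} heap_block_t;

/* Best-fit: scan the whole list and take the smallest free block
 * that still satisfies the request; NULL if nothing fits. */
static heap_block_t* find_best_fit(heap_block_t* head, uint32_t size) {
    heap_block_t* best = NULL;
    for (heap_block_t* cur = head; cur; cur = cur->next) {
        if (cur->is_free && cur->size >= size &&
            (best == NULL || cur->size < best->size)) {
            best = cur;
        }
    }
    return best;
}
```

The trade-off: best-fit wastes less space per allocation than first-fit, but always walks the entire list, so allocation is slower on large heaps.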
Advanced: Slab Allocator
Linux uses a slab allocator for kernel objects: pre-allocated pools of fixed-size objects (inodes, task_structs, etc.) eliminate fragmentation and make allocation fast. Consider implementing one for frequently allocated kernel structures.
Advanced Topic
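To make the idea concrete, here is a toy slab in plain C: one fixed arena carved into equal-size slots, threaded onto a free list so alloc and free are both O(1) pointer pops. All names (`slab_t`, `slab_alloc`, etc.) are illustrative sketches, not Linux's real API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLAB_OBJ_SIZE  64
#define SLAB_OBJ_COUNT 16

typedef struct slab_obj {
    struct slab_obj* next;   /* valid only while the slot is free */
} slab_obj_t;

typedef struct {
    uint8_t arena[SLAB_OBJ_SIZE * SLAB_OBJ_COUNT];
    slab_obj_t* free_list;
} slab_t;

static void slab_init(slab_t* s) {
    s->free_list = NULL;
    /* Push slots in reverse so the lowest address ends up on top */
    for (int i = SLAB_OBJ_COUNT - 1; i >= 0; i--) {
        slab_obj_t* obj = (slab_obj_t*)&s->arena[i * SLAB_OBJ_SIZE];
        obj->next = s->free_list;
        s->free_list = obj;
    }
}

/* O(1): pop the head of the free list (NULL when the slab is full) */
static void* slab_alloc(slab_t* s) {
    slab_obj_t* obj = s->free_list;
    if (obj) s->free_list = obj->next;
    return obj;
}

/* O(1): push the slot back onto the free list */
static void slab_free(slab_t* s, void* p) {
    slab_obj_t* obj = (slab_obj_t*)p;
    obj->next = s->free_list;
    s->free_list = obj;
}
```

Because every slot is the same size, there is no external fragmentation inside a slab and no header per allocation; a real implementation keeps one slab cache per object type and grows by whole pages.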
What You Can Build
Phase 6 Project: A kernel with working malloc() and free()! Your OS can now allocate memory dynamically at runtime, manage physical frames, and use virtual memory for process isolation.
Project: Memory Statistics Display
Build a memory dashboard that displays real-time memory usage:
/* memstat.c - Memory statistics display */
#include "pmm.h"
#include "heap.h"

void display_memory_stats(void) {
    /* Physical memory stats */
    uint32_t total_frames = pmm_get_total_frames();
    uint32_t free_frames = pmm_get_free_frames();
    uint32_t used_frames = total_frames - free_frames;
    uint32_t total_mb = (total_frames * PAGE_SIZE) / (1024 * 1024);
    uint32_t used_mb = (used_frames * PAGE_SIZE) / (1024 * 1024);
    uint32_t free_mb = (free_frames * PAGE_SIZE) / (1024 * 1024);
    uint32_t percent_used = (used_frames * 100) / total_frames;

    kprintf("\n╔══════════════════════════════════════════╗\n");
    kprintf("║ MEMORY STATISTICS ║\n");
    kprintf("╠══════════════════════════════════════════╣\n");
    kprintf("║ PHYSICAL MEMORY ║\n");
    kprintf("║ Total: %4d MB (%6d frames) ║\n", total_mb, total_frames);
    kprintf("║ Used: %4d MB (%6d frames) ║\n", used_mb, used_frames);
    kprintf("║ Free: %4d MB (%6d frames) ║\n", free_mb, free_frames);
    kprintf("║ Usage: [");

    /* ASCII progress bar */
    int bar_width = 20;
    int filled = (percent_used * bar_width) / 100;
    for (int i = 0; i < bar_width; i++) {
        kprintf(i < filled ? "█" : "░");
    }
    kprintf("] %3d%% ║\n", percent_used);

    kprintf("╠══════════════════════════════════════════╣\n");
    kprintf("║ KERNEL HEAP ║\n");
    kprintf("║ Allocated: %6d bytes ║\n", heap_get_used());
    kprintf("║ Fragments: %6d blocks ║\n", heap_get_block_count());
    kprintf("╚══════════════════════════════════════════╝\n");
}
/* Stress test the allocator */
void test_allocator(void) {
    kprintf("\nAllocator stress test...\n");
    void* ptrs[100];

    /* Allocate many blocks */
    for (int i = 0; i < 100; i++) {
        ptrs[i] = kmalloc((i + 1) * 16);
        if (!ptrs[i]) {
            kprintf(" Allocation %d failed!\n", i);
            break;
        }
    }
    kprintf(" Allocated 100 blocks\n");
    display_memory_stats();

    /* Free every other block (creates fragmentation) */
    for (int i = 0; i < 100; i += 2) {
        kfree(ptrs[i]);
        ptrs[i] = NULL;
    }
    kprintf("\n Freed 50 blocks (alternating)\n");
    display_memory_stats();

    /* Free remaining */
    for (int i = 1; i < 100; i += 2) {
        kfree(ptrs[i]);
    }
    kprintf("\n Freed remaining blocks\n");
    display_memory_stats();
}
Exercises
Exercise 1: Aligned Allocation
Implement kmalloc_aligned(size, alignment) that returns memory aligned to a specified boundary (e.g., 16, 256, or 4096 bytes). This is essential for DMA buffers and SIMD operations.
Hint: Allocate extra space, then return a pointer that's properly aligned. Store the original pointer somewhere so kfree() can find it.
Intermediate
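If you get stuck on Exercise 1, here is the shape of the over-allocate-and-stash technique the hint describes, sketched in user space with malloc/free standing in for kmalloc/kfree so it runs standalone (translate it onto your kernel heap; alignment must be a power of two):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Over-allocate, round the result up to the requested boundary, and
 * stash the original pointer in the slot just below the aligned
 * address so the matching free can recover it. */
void* malloc_aligned(size_t size, size_t alignment) {
    /* Room for worst-case padding plus the stashed pointer */
    void* raw = malloc(size + alignment + sizeof(void*));
    if (!raw) return NULL;
    uintptr_t aligned = ((uintptr_t)raw + sizeof(void*) + alignment - 1)
                        & ~(uintptr_t)(alignment - 1);
    ((void**)aligned)[-1] = raw;   /* remember the block's real start */
    return (void*)aligned;
}

void free_aligned(void* ptr) {
    if (ptr) free(((void**)ptr)[-1]);
}
```

Reserving `sizeof(void*)` before the rounding step guarantees the stash slot always lands inside the raw allocation, even when malloc happens to return an already-aligned pointer.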
Exercise 2: Memory Map Command
Add a memmap shell command that displays the E820 memory map regions, showing which areas are usable RAM, reserved, ACPI, etc.
Easy
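For Exercise 2, the core of the command is just walking the E820 entry array your bootloader saved. Here is a user-space model of that walk (`e820_type_name` and `e820_usable_bytes` are hypothetical helpers; in the kernel you would kprintf one row per entry instead of summing):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 20-byte entry as returned by BIOS INT 0x15, EAX=0xE820 */
typedef struct {
    uint64_t base;
    uint64_t length;
    uint32_t type;   /* 1=usable, 2=reserved, 3=ACPI reclaim, 4=ACPI NVS */
} __attribute__((packed)) e820_entry_t;

static const char* e820_type_name(uint32_t type) {
    switch (type) {
        case 1:  return "usable";
        case 2:  return "reserved";
        case 3:  return "ACPI reclaimable";
        case 4:  return "ACPI NVS";
        default: return "unknown";
    }
}

/* Total usable RAM across the map */
static uint64_t e820_usable_bytes(const e820_entry_t* map, size_t count) {
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (map[i].type == 1) total += map[i].length;
    }
    return total;
}
```

Note the classic gap this makes visible: conventional memory ends just below 640 KB and usable RAM resumes at 1 MB, with the VGA and BIOS ROM regions reserved in between.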
Exercise 3: Page Frame Database
Replace the bitmap allocator with a page frame database: an array of struct page entries, one per physical frame. Each entry tracks: reference count, flags (dirty, locked), and a pointer for free list linking. This is how Linux manages physical memory.
Advanced
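As a starting point for Exercise 3, here is a small user-space model of such a database: one entry per frame with a reference count, flags, and free-list linkage by index. Field choices are illustrative only; Linux's real struct page is far richer:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FRAMES 32
#define PAGE_FLAG_DIRTY  0x1
#define PAGE_FLAG_LOCKED 0x2

typedef struct page {
    uint16_t refcount;    /* 0 = free */
    uint16_t flags;       /* DIRTY, LOCKED, ... */
    int32_t  next_free;   /* index of next free frame, -1 = end */
} page_t;

static page_t pages[NUM_FRAMES];
static int32_t free_head;

static void pfdb_init(void) {
    free_head = 0;
    for (int i = 0; i < NUM_FRAMES; i++) {
        pages[i].refcount = 0;
        pages[i].flags = 0;
        pages[i].next_free = (i + 1 < NUM_FRAMES) ? i + 1 : -1;
    }
}

/* Pop a frame index off the free list; -1 when memory is exhausted */
static int32_t pfdb_alloc(void) {
    int32_t idx = free_head;
    if (idx < 0) return -1;
    free_head = pages[idx].next_free;
    pages[idx].refcount = 1;
    return idx;
}

/* Drop one reference; recycle the frame only at refcount zero */
static void pfdb_free(int32_t idx) {
    if (--pages[idx].refcount > 0) return;
    pages[idx].flags = 0;
    pages[idx].next_free = free_head;
    free_head = idx;
}
```

The reference count is what makes this design more powerful than a bitmap: a frame shared by several page tables (for example, after copy-on-write fork) is released only when the last mapping drops it.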
Next Steps
With memory management in place, we can now build persistent storage. In Phase 7, we'll implement disk I/O and a filesystem so your kernel can read and write files.
PHASE 7 PREVIEW: FILESYSTEM
═══════════════════════════════════════════════════════════════
Your kernel can now:
✓ Manage physical memory (PMM with bitmap)
✓ Translate virtual addresses (paging)
✓ Allocate memory dynamically (heap/kmalloc)
Next, you'll build:
┌─────────────────────────────────────────────────────────┐
│ VFS (Virtual File System) │
│ Unified interface for all filesystems │
└────────────────────────┬────────────────────────────────┘
│
┌───────────────┼───────────────┐
│ │ │
┌──────▼─────┐ ┌──────▼─────┐ ┌──────▼─────┐
│ FAT │ │ ext2 │ │ tmpfs │
│ (Floppy/ │ │ (Linux) │ │ (RAM disk) │
│ USB) │ │ │ │ │
└──────┬─────┘ └──────┬─────┘ └──────┬─────┘
│ │ │
┌──────▼─────────────▼─────────────▼──────┐
│ Block Device Layer │
│ (Disk read/write operations) │
└────────────────────┬────────────────────┘
│
▼
┌───────────────────────────────────────────┐
│ ATA/IDE Disk Driver │
│ (PIO mode for simplicity) │
└───────────────────────────────────────────┘
You'll implement:
• ATA PIO disk driver (read/write 512-byte sectors)
• FAT12/16 filesystem (simple, well-documented)
• Basic VFS operations (open, read, write, close)
• Directory listing and file creation
Key Takeaways from Phase 6:
- Layered Design: PMM → Paging → VMM → Heap - each layer builds on the previous
- Virtual Memory Benefits: Process isolation, memory overcommit, simpler executable loading
- Two-Level Paging: Page directory + page tables provide efficient address translation
- Demand Paging: Allocate physical memory only when accessed (page fault triggers allocation)
- Heap Management: First-fit allocation with coalescing handles fragmentation
- Identity Mapping: Map BIOS regions and kernel 1:1 for bootstrap simplicity
Continue the Series
Phase 5: Interrupts & CPU Control
Review the IDT, ISRs, PIC programming, and timer setup.
Read Article
Phase 7: Disk Access & Filesystems
Implement ATA disk drivers, FAT filesystem, and the VFS abstraction layer.
Read Article
Phase 8: Processes & User Mode
Implement task switching, system calls, and run code in user space.
Read Article