Phase 15 Goals: By the end of this phase, your OS will discover hardware through PCI enumeration, have a driver framework, and support modern storage via AHCI (SATA) and NVMe.
Your OS is now graphical and interactive. But all that software runs on hardware—and modern PCs have a lot of it! How does your OS know what devices are connected? How does it talk to an SSD, a network card, or a GPU?
The answer is a two-part system: hardware discovery (finding what's connected) and device drivers (knowing how to talk to each device). This phase covers both.
Hardware discovery and driver architecture — from PCI enumeration to device-specific drivers
Key Insight: Hardware discovery is the foundation of extensibility. PCI enumeration lets your OS find and configure devices automatically, while a proper driver framework allows modular hardware support.
Hardware Landscape
Modern PCs communicate with hardware through several interfaces:
Bus      | Typical Devices                         | Speed                        | Discovery
---------+-----------------------------------------+------------------------------+-------------------------
PCI/PCIe | GPUs, NICs, NVMe SSDs, SATA controllers | Up to 64 GB/s (PCIe 5.0 x16) | Configuration Space scan
USB      | Keyboards, mice, storage, cameras       | Up to 20 Gbps (USB 3.2)      | Hub enumeration
SATA     | HDDs, older SSDs, optical drives        | 6 Gbps (SATA III)            | Port detection via AHCI
LPC      | Legacy devices (PS/2, TPM)              | ~33 MHz                      | ACPI/hardcoded
For this phase, we focus on PCI/PCIe enumeration—the primary discovery mechanism for high-performance devices.
Driver Model
A driver is software that knows how to "speak" a specific device's protocol. Good OS design separates:
╔═══════════════════════════════════════════════════════════════════╗
║                          OS DRIVER MODEL                          ║
╠═══════════════════════════════════════════════════════════════════╣
║  ┌─────────────────────────────────────────────────────────────┐  ║
║  │                  Generic Block Device API                   │  ║
║  │        read_blocks(), write_blocks(), get_info()            │  ║
║  └──────────────────────────────┬──────────────────────────────┘  ║
║            ┌────────────────────┼────────────────────┐            ║
║  ┌─────────┴───────┐  ┌─────────┴───────┐  ┌─────────┴───────┐    ║
║  │   NVMe Driver   │  │   AHCI Driver   │  │   IDE Driver    │    ║
║  │ Implements API  │  │ Implements API  │  │ Implements API  │    ║
║  │ for NVMe SSDs   │  │ for SATA        │  │ for legacy IDE  │    ║
║  └─────────────────┘  └─────────────────┘  └─────────────────┘    ║
║                                                                   ║
║  Benefits:                                                        ║
║   • File system doesn't care which storage technology is used     ║
║   • New drivers can be added without changing upper layers        ║
║   • Same code works on any hardware that has a driver             ║
╚═══════════════════════════════════════════════════════════════════╝
This abstraction is critical. Your file system calls block_read(), and the driver translates that to the specific commands the hardware understands.
Driver Lifecycle:
Registration - Driver tells the kernel what devices it supports
Probe - Kernel asks "can you handle this device?"
Attach - Driver initializes the hardware
Operation - Driver handles requests
Detach - Cleanup when device removed (hot-unplug)
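This lifecycle maps naturally onto a struct of callbacks plus an ID table the kernel can match against. Here is a minimal sketch; driver_t, pci_id_t, driver_register, and driver_find are illustrative names for this series, not any specific kernel's API:

```c
#include <stdint.h>
#include <stddef.h>

/* IDs a driver claims to support; device_id 0xFFFF acts as a wildcard */
typedef struct {
    uint16_t vendor_id;
    uint16_t device_id;
} pci_id_t;

struct device;   /* the generic device node, defined by the device tree code */

typedef struct driver {
    const char*     name;
    const pci_id_t* ids;                    /* {0, 0}-terminated table */
    int  (*probe)(struct device* dev);      /* "can you handle this device?" */
    int  (*attach)(struct device* dev);     /* initialize the hardware */
    void (*detach)(struct device* dev);     /* cleanup on hot-unplug */
    struct driver*  next;
} driver_t;

static driver_t* driver_list = NULL;

/* Registration: driver tells the kernel what it supports */
void driver_register(driver_t* drv) {
    drv->next = driver_list;
    driver_list = drv;
}

/* Matching: find a registered driver whose ID table covers this device */
driver_t* driver_find(uint16_t vendor, uint16_t device) {
    for (driver_t* d = driver_list; d; d = d->next)
        for (const pci_id_t* id = d->ids; id->vendor_id; id++)
            if (id->vendor_id == vendor &&
                (id->device_id == device || id->device_id == 0xFFFF))
                return d;
    return NULL;
}
```

After a match, the kernel calls probe(); only if that succeeds does attach() run and the device get bound to the driver.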
PCI Enumeration
PCI (Peripheral Component Interconnect) is the standard bus for high-speed devices. Every PCI device exposes a 256-byte Configuration Space (extended to 4 KiB on PCIe) that describes what it is and how to talk to it.
PCI configuration space — 256-byte descriptor with vendor/device IDs, BARs, and capability lists
To discover all devices, scan every Bus:Slot:Function combination. If Vendor ID is 0xFFFF, no device is present:
/* PCI Device Structure */
typedef struct pci_device {
uint8_t bus, slot, func;
uint16_t vendor_id;
uint16_t device_id;
uint8_t class_code;
uint8_t subclass;
uint8_t prog_if;
uint32_t bar[6];
uint8_t interrupt_line;
struct pci_device* next;
} pci_device_t;
pci_device_t* pci_devices = NULL;
/* Scan all PCI buses */
void pci_scan(void) {
for (uint16_t bus = 0; bus < 256; bus++) {
for (uint8_t slot = 0; slot < 32; slot++) {
for (uint8_t func = 0; func < 8; func++) {
uint32_t id = pci_read(bus, slot, func, 0);
uint16_t vendor = id & 0xFFFF;
if (vendor == 0xFFFF) continue; // No device
// Found device - add to list
pci_device_t* dev = kmalloc(sizeof(pci_device_t));
dev->bus = bus;
dev->slot = slot;
dev->func = func;
dev->vendor_id = vendor;
dev->device_id = id >> 16;
// Read class info
uint32_t class_reg = pci_read(bus, slot, func, 0x08);
dev->class_code = (class_reg >> 24) & 0xFF;
dev->subclass = (class_reg >> 16) & 0xFF;
dev->prog_if = (class_reg >> 8) & 0xFF;
// Read BARs
for (int i = 0; i < 6; i++) {
dev->bar[i] = pci_read(bus, slot, func, 0x10 + i * 4);
}
// Read interrupt line
dev->interrupt_line = pci_read(bus, slot, func, 0x3C) & 0xFF;
// Add to list
dev->next = pci_devices;
pci_devices = dev;
kprintf("PCI: %02x:%02x.%x - %04x:%04x Class %02x:%02x\n",
bus, slot, func, vendor, dev->device_id,
dev->class_code, dev->subclass);
// Check for multi-function device
if (func == 0) {
uint32_t header = pci_read(bus, slot, 0, 0x0C);
if (!((header >> 16) & 0x80)) break;
}
}
}
}
}
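The pci_read() used above is typically implemented with the legacy port-I/O mechanism: write an address to port 0xCF8 (CONFIG_ADDRESS), then read the data from port 0xCFC (CONFIG_DATA). A sketch, assuming x86; the inline-assembly outl/inl helpers stand in for whatever port-I/O routines your kernel already has:

```c
#include <stdint.h>

#define PCI_CONFIG_ADDRESS 0xCF8
#define PCI_CONFIG_DATA    0xCFC

/* x86 port I/O; in a real kernel these likely live in a shared header */
static inline void outl(uint16_t port, uint32_t val) {
    __asm__ volatile("outl %0, %1" : : "a"(val), "Nd"(port));
}
static inline uint32_t inl(uint16_t port) {
    uint32_t val;
    __asm__ volatile("inl %1, %0" : "=a"(val) : "Nd"(port));
    return val;
}

/* CONFIG_ADDRESS layout: enable bit 31, bus 23:16, slot 15:11,
   function 10:8, register offset 7:2 (dword aligned) */
static uint32_t pci_config_address(uint8_t bus, uint8_t slot,
                                   uint8_t func, uint8_t offset) {
    return 0x80000000u
         | ((uint32_t)bus  << 16)
         | ((uint32_t)slot << 11)
         | ((uint32_t)func << 8)
         | (offset & 0xFC);
}

uint32_t pci_read(uint8_t bus, uint8_t slot, uint8_t func, uint8_t offset) {
    outl(PCI_CONFIG_ADDRESS, pci_config_address(bus, slot, func, offset));
    return inl(PCI_CONFIG_DATA);
}
```

Because CONFIG_ADDRESS only takes dword-aligned offsets, byte and word reads are done by reading the containing dword and shifting.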
BAR Decoding
BARs (Base Address Registers) tell us where the device's memory or I/O registers are mapped. But they also encode the region size—with a clever trick:
╔═══════════════════════════════════════════════════════════════════╗
║                   BAR FORMAT AND SIZE DETECTION                   ║
╠═══════════════════════════════════════════════════════════════════╣
║  Memory BAR (bit 0 = 0):                                          ║
║  ┌───────────────────────────────────────┬───┬─────┬───┐          ║
║  │ Base Address (bits 4-31)              │Prf│ Typ │ 0 │          ║
║  └───────────────────────────────────────┴───┴─────┴───┘          ║
║       Typ: 00 = 32-bit, 10 = 64-bit                               ║
║       Prf: prefetchable                                           ║
║                                                                   ║
║  I/O BAR (bit 0 = 1):                                             ║
║  ┌───────────────────────────────────────────────┬───┐            ║
║  │ I/O Port (bits 2-31)                          │ 1 │            ║
║  └───────────────────────────────────────────────┴───┘            ║
║                                                                   ║
║  Size Detection Algorithm:                                        ║
║   1. Save original BAR value                                      ║
║   2. Write all 1s (0xFFFFFFFF) to BAR                             ║
║   3. Read back: only the writable bits stay set                   ║
║   4. Mask type bits, invert, add 1 = region size                  ║
║   5. Restore original BAR value                                   ║
╚═══════════════════════════════════════════════════════════════════╝
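Step 4 of the algorithm is pure bit arithmetic and can be factored out for testing. A sketch (bar_size_from_readback is an illustrative name; the surrounding probe sequence, which needs a pci_write counterpart to pci_read, is shown in the comment):

```c
#include <stdint.h>

/* 'readback' is the value read after writing 0xFFFFFFFF to the BAR:
   the hardware leaves only the writable (size-indicating) bits set. */
static uint32_t bar_size_from_readback(uint32_t readback, int is_io) {
    uint32_t mask = is_io ? 0xFFFFFFFCu   /* I/O BAR: address bits 2-31 */
                          : 0xFFFFFFF0u;  /* memory BAR: address bits 4-31 */
    return (~(readback & mask)) + 1;      /* invert, add 1 */
}

/* The full probe around it would look like:
     orig = pci_read(bus, slot, func, 0x10 + bar*4);        1. save
     pci_write(bus, slot, func, 0x10 + bar*4, 0xFFFFFFFF);  2. write all 1s
     rb = pci_read(bus, slot, func, 0x10 + bar*4);          3. read back
     size = bar_size_from_readback(rb, orig & 1);           4. compute
     pci_write(bus, slot, func, 0x10 + bar*4, orig);        5. restore   */
```

For example, a memory BAR that reads back as 0xFFFFF000 after the all-1s write decodes to a 4 KiB region.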
Important: Modern systems use PCIe ECAM (Enhanced Configuration Access Mechanism) instead of I/O ports. ECAM memory-maps the entire configuration space, found via ACPI's MCFG table.
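Under ECAM, a configuration access reduces to address arithmetic over a memory-mapped window: each function gets its own 4 KiB page. A sketch, assuming the base address has already been pulled from the MCFG table:

```c
#include <stdint.h>

/* ECAM layout:
   address = base + (bus << 20) + (device << 15) + (function << 12) + offset
   The 'base' comes from the ACPI MCFG table entry for this bus range. */
static uint64_t ecam_address(uint64_t base, uint8_t bus, uint8_t dev,
                             uint8_t func, uint16_t offset) {
    return base
         + ((uint64_t)bus  << 20)
         + ((uint64_t)dev  << 15)
         + ((uint64_t)func << 12)
         + (offset & 0xFFF);    /* 4 KiB of config space per function */
}
```

Once the region is mapped, a config read is just a volatile load from the computed address, with no port-I/O round trip.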
Driver Framework
A driver framework is the glue between discovered hardware and the rest of the OS. It defines how drivers register themselves, how they're matched to devices, and how they export functionality.
Driver framework — registration, device matching, and probe lifecycle
/* Generic device node */
typedef struct device {
char name[32];
struct device* parent;
struct device* children;
struct device* sibling;
driver_t* driver;
void* driver_data; // Driver-specific state
enum { DEV_PCI, DEV_USB, DEV_ACPI } bus_type;
union {
pci_device_t* pci;
// usb_device_t* usb;
} bus_data;
} device_t;
device_t* device_root = NULL;
/* Add device to tree */
void device_add(device_t* dev, device_t* parent) {
dev->parent = parent;
dev->sibling = parent->children;
parent->children = dev;
}
/* Print device tree (recursive) */
void device_tree_print(device_t* dev, int depth) {
for (int i = 0; i < depth; i++) kprintf(" ");
kprintf("├── %s", dev->name);
if (dev->driver) kprintf(" [%s]", dev->driver->name);
kprintf("\n");
for (device_t* child = dev->children; child; child = child->sibling) {
device_tree_print(child, depth + 1);
}
}
/* Find device by path (e.g., "pci0000:00/00:1c.0/01:00.0") */
device_t* device_find(const char* path) {
device_t* current = device_root;
while (current && *path) {
size_t len = 0;
while (path[len] && path[len] != '/') len++; // length of next component
device_t* child = current->children; // match it against the children
while (child && !(strncmp(child->name, path, len) == 0 && child->name[len] == '\0'))
child = child->sibling;
if (!child) return NULL; // no such component
current = child;
path += len + (path[len] == '/'); // skip the '/' separator
}
return current;
}
Real-World Pattern: Linux's /sys/devices/ exposes the device tree to userspace. Your shell can implement lspci and lsusb by walking this tree!
AHCI Storage
AHCI (Advanced Host Controller Interface) is the standard interface for SATA devices—hard drives and older SSDs. It's more complex than legacy IDE but supports NCQ (Native Command Queuing) for better performance.
AHCI architecture — HBA ports, command lists, and FIS-based SATA communication
NCQ (Native Command Queuing): AHCI supports up to 32 outstanding commands per port. This enables the drive to reorder operations for optimal head movement (HDDs) or parallelism (SSDs).
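Before any commands can be queued, the driver has to find out which ports actually have a drive attached. That reduces to decoding PxSSTS, the port's SATA status register. A sketch, with bit positions and constants per the AHCI spec; ahci_port_has_device is an illustrative helper name:

```c
#include <stdint.h>

/* PxSSTS layout: DET (device detection) in bits 0-3,
   IPM (interface power management) in bits 8-11 */
#define AHCI_DET_PRESENT 0x3   /* device present, phy communication up */
#define AHCI_IPM_ACTIVE  0x1   /* interface in active state */

static int ahci_port_has_device(uint32_t ssts) {
    uint32_t det = ssts & 0xF;
    uint32_t ipm = (ssts >> 8) & 0xF;
    return det == AHCI_DET_PRESENT && ipm == AHCI_IPM_ACTIVE;
}

/* On a live port, PxSIG then distinguishes device types */
#define SATA_SIG_ATA   0x00000101u   /* SATA drive */
#define SATA_SIG_ATAPI 0xEB140101u   /* ATAPI (optical drive) */
```

A driver walks the 32 possible ports, skips those not set in the HBA's Ports Implemented register, and runs this check on each remaining port's PxSSTS.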
NVMe Storage
NVMe (Non-Volatile Memory Express) is the modern interface for SSDs, designed from scratch for flash storage over PCIe. Unlike AHCI (which was designed for spinning disks), NVMe uses multiple parallel queues to achieve incredible performance—millions of IOPS!
NVMe architecture — paired submission/completion queues for parallel I/O over PCIe
NVMe's key innovation is the queue-based command interface. Instead of one command at a time (like IDE), or 32 commands (like AHCI), NVMe supports up to 65,535 I/O queue pairs, each with up to 65,536 entries!
Why So Many Queues? Modern SSDs have massive internal parallelism (hundreds of flash chips). Multiple queues let the OS feed commands from different CPU cores simultaneously, fully saturating the SSD's bandwidth.
The phase bit in the completion entry is clever: it flips between 0 and 1 each time the queue wraps. This lets the driver detect new completions without needing explicit notifications for each one.
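The phase-bit check can be sketched as a small polling helper over an in-memory queue. On a freshly zeroed queue the driver's expected phase starts at 1, since the controller writes phase = 1 on its first pass; the nvme_cqe_t field names here are simplified:

```c
#include <stdint.h>

/* 16-byte completion queue entry (simplified field names) */
typedef struct {
    uint32_t result;
    uint32_t reserved;
    uint16_t sq_head;   /* how far the controller has consumed the SQ */
    uint16_t sq_id;
    uint16_t cid;       /* command identifier this entry completes */
    uint16_t status;    /* bit 0 = phase tag */
} nvme_cqe_t;

/* Poll one entry: returns 1 and copies it out if the phase bit matches
   what we expect; flips the expected phase when the queue wraps. */
static int nvme_cq_poll(nvme_cqe_t* cq, uint16_t depth,
                        uint16_t* head, uint8_t* phase, nvme_cqe_t* out) {
    nvme_cqe_t* e = &cq[*head];
    if ((e->status & 1) != *phase)
        return 0;                 /* controller hasn't written it yet */
    *out = *e;
    if (++*head == depth) {       /* wrap: controller's next lap flips */
        *head = 0;
        *phase ^= 1;
    }
    return 1;
}
```

A real driver would also write the new head to the CQ head doorbell after consuming entries, and needs appropriate memory barriers on weakly ordered systems.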
NVMe Commands
NVMe commands are classified into two types:
Type  | Queue           | Examples
------+-----------------+--------------------------------------------------------
Admin | Admin SQ/CQ     | Create I/O Queue, Identify, Get Log Page, Set Features
I/O   | I/O SQ/CQ pairs | Read, Write, Flush, Dataset Management (TRIM)
Before issuing I/O commands, you must use Admin commands to identify the controller and create I/O queues:
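A sketch of how those Admin commands might be built. The 64-byte submission queue entry layout and the opcode values follow the NVMe spec, but the struct field names and builder helpers are illustrative:

```c
#include <stdint.h>
#include <string.h>

/* 64-byte submission queue entry (simplified field names) */
typedef struct {
    uint8_t  opcode;
    uint8_t  flags;
    uint16_t cid;          /* command identifier, echoed in the completion */
    uint32_t nsid;
    uint64_t reserved;
    uint64_t mptr;
    uint64_t prp1, prp2;   /* physical data pointers */
    uint32_t cdw10, cdw11, cdw12, cdw13, cdw14, cdw15;
} nvme_sqe_t;

/* Admin opcodes from the NVMe spec */
#define NVME_ADMIN_CREATE_SQ 0x01
#define NVME_ADMIN_CREATE_CQ 0x05
#define NVME_ADMIN_IDENTIFY  0x06

/* Identify Controller: CNS=1 in CDW10; the controller fills the
   4 KiB buffer at prp1 with model, capabilities, queue limits, etc. */
static void nvme_build_identify_ctrl(nvme_sqe_t* cmd, uint16_t cid,
                                     uint64_t buf_phys) {
    memset(cmd, 0, sizeof(*cmd));
    cmd->opcode = NVME_ADMIN_IDENTIFY;
    cmd->cid    = cid;
    cmd->prp1   = buf_phys;
    cmd->cdw10  = 1;       /* CNS = 1: identify controller */
}

/* Create I/O Completion Queue: queue id and 0-based size in CDW10,
   PC (physically contiguous) and IEN (interrupts enabled) in CDW11. */
static void nvme_build_create_cq(nvme_sqe_t* cmd, uint16_t cid,
                                 uint16_t qid, uint16_t depth,
                                 uint64_t queue_phys) {
    memset(cmd, 0, sizeof(*cmd));
    cmd->opcode = NVME_ADMIN_CREATE_CQ;
    cmd->cid    = cid;
    cmd->prp1   = queue_phys;
    cmd->cdw10  = ((uint32_t)(depth - 1) << 16) | qid;
    cmd->cdw11  = 0x3;     /* PC = 1, IEN = 1 */
}
```

The ordering matters: a CQ must exist before the SQ that points at it, so the usual bring-up is Identify, then Create I/O CQ, then Create I/O SQ.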
NVMe vs AHCI Performance: While AHCI supports 32 commands with one queue, NVMe can handle 65K commands across 65K queues. This translates to 500K+ IOPS vs AHCI's ~100K IOPS on the same SSD!
What You Can Build
Phase 15 Project: A hardware-aware OS! Your system now discovers PCI devices automatically, has a pluggable driver framework, and can access modern NVMe SSDs at full speed.
Let's combine everything into a complete storage subsystem demo:
/* storage_demo.c - Complete Storage Subsystem */
#include "pci.h"
#include "ahci.h"
#include "nvme.h"
#include "block.h"
/* Generic Block Device Interface */
typedef struct block_device {
char name[32];
uint64_t total_sectors;
uint32_t sector_size;
int (*read)(struct block_device* dev, uint64_t lba,
uint32_t count, void* buffer);
int (*write)(struct block_device* dev, uint64_t lba,
uint32_t count, const void* buffer);
void* private; // Driver-specific data
} block_device_t;
#define MAX_BLOCK_DEVICES 16 /* arbitrary cap; adjust as needed */
static block_device_t devices[MAX_BLOCK_DEVICES];
static int num_devices = 0;
/* Register a block device */
void block_register(block_device_t* dev) {
if (num_devices < MAX_BLOCK_DEVICES) {
devices[num_devices++] = *dev;
kprintf("Block: Registered %s (%llu MB)\n",
dev->name,
(dev->total_sectors * dev->sector_size) / (1024*1024));
}
}
/* Storage subsystem initialization */
void storage_init(void) {
kprintf("Storage: Scanning for devices...\n");
// Walk the device list built by pci_scan()
for (pci_device_t* dev = pci_devices; dev; dev = dev->next) {
// Check class code
if (dev->class_code != 0x01) continue; // Mass Storage
switch (dev->subclass) {
case 0x06: // AHCI (SATA)
if (dev->prog_if == 0x01) {
kprintf("Storage: Found AHCI controller at %02x:%02x.%x\n",
dev->bus, dev->slot, dev->func);
ahci_init(dev);
}
break;
case 0x08: // NVMe
if (dev->prog_if == 0x02) {
kprintf("Storage: Found NVMe controller at %02x:%02x.%x\n",
dev->bus, dev->slot, dev->func);
nvme_init(dev);
}
break;
case 0x01: // IDE (legacy)
kprintf("Storage: Found IDE controller (legacy)\n");
// ide_init(dev);
break;
}
}
kprintf("Storage: Found %d block device(s)\n", num_devices);
}
/* Read from any block device */
int block_read(int device_id, uint64_t lba, uint32_t count, void* buffer) {
if (device_id < 0 || device_id >= num_devices) return -1;
return devices[device_id].read(&devices[device_id], lba, count, buffer);
}
/* Demo: Read first sector of each device */
void storage_demo(void) {
uint8_t sector[512];
for (int i = 0; i < num_devices; i++) {
kprintf("\nReading sector 0 from %s:\n", devices[i].name);
if (block_read(i, 0, 1, sector) == 0) {
// Check for MBR signature
if (sector[510] == 0x55 && sector[511] == 0xAA) {
kprintf(" Found MBR partition table\n");
// Parse partition entries
for (int p = 0; p < 4; p++) {
uint8_t* entry = &sector[446 + p * 16];
uint8_t type = entry[4];
if (type != 0) { // type 0xEE would mark a GPT protective MBR
uint32_t start = *(uint32_t*)&entry[8];
uint32_t size = *(uint32_t*)&entry[12];
kprintf(" Partition %d: type=0x%02x, start=%u, size=%u\n",
p, type, start, size);
}
}
} else {
// GPT keeps its header in LBA 1 (sector 0 holds a protective MBR),
// so check there for the "EFI PART" signature
uint8_t gpt[512];
if (block_read(i, 1, 1, gpt) == 0 && memcmp(gpt, "EFI PART", 8) == 0) {
kprintf(" Found GPT partition table\n");
} else {
kprintf(" Unknown partition format\n");
}
}
} else {
kprintf(" Read failed!\n");
}
}
}
Exercises
Exercise 1: Implement lspci
Create a command that lists all PCI devices with details:
void cmd_lspci(void) {
// TODO: Enumerate all PCI devices
// Print: Bus:Slot.Func VendorID:DeviceID Class Description
// Example: 00:1f.2 8086:a102 0106 Intel AHCI Controller
// Hint: Create a table of known vendor/device IDs
}
Exercise: AHCI Hot-Plug Interrupts
Complete the interrupt handler so the OS notices devices being inserted and removed:
void ahci_interrupt_handler(int irq) {
uint32_t is = hba->is; // Global interrupt status
for (int port = 0; port < 32; port++) {
if (!(is & (1 << port))) continue;
ahci_port_t* p = get_port(port);
uint32_t pis = p->is; // Port interrupt status
// TODO: Handle these events:
// - Device connected (PRCS bit)
// - Device disconnected
// - Command completion
// - Error conditions
p->is = pis; // Clear handled interrupts
}
hba->is = is; // Clear global status
}
Exercise 4: NVMe Multiple Queues
Create per-CPU I/O queues for maximum parallelism:
typedef struct nvme_queue_pair {
nvme_sqe_t* sq; // Submission queue
nvme_cqe_t* cq; // Completion queue
uint16_t sq_tail; // Next slot to write
uint16_t cq_head; // Next slot to read
uint8_t cq_phase; // Expected phase bit
spinlock_t lock;
} nvme_queue_pair_t;
// TODO: Create one queue pair per CPU
// - nvme_init_percpu_queues()
// - Use current CPU's queue for submissions
// - No locking needed if each CPU uses its own queue!
╔═══════════════════════════════════════════════════════════════════╗
║                   PHASE 15 → PHASE 16 TRANSITION                  ║
╠═══════════════════════════════════════════════════════════════════╣
║  ✓ Phase 15 Complete:                                             ║
║    • PCI bus enumeration and device discovery                     ║
║    • Pluggable driver framework with ID matching                  ║
║    • AHCI driver for SATA devices                                 ║
║    • NVMe driver for modern SSDs                                  ║
║    • Generic block device abstraction                             ║
║                                                                   ║
║  → Phase 16 Preview: Performance & Optimization                   ║
║    • Scheduler tuning (time slice, priority algorithms)           ║
║    • Block cache and buffer management                            ║
║    • Memory allocator optimization                                ║
║    • Profiling and bottleneck identification                      ║
╚═══════════════════════════════════════════════════════════════════╝
Concepts Covered
Concept                 | Description
------------------------+--------------------------------------------------------------
PCI Configuration Space | 256/4096-byte device descriptor with IDs, BARs, capabilities
BAR Decoding            | Determining memory/IO base addresses and sizes
Driver Matching         | Vendor/Device ID tables for automatic driver selection
Device Tree             | Hierarchical representation of hardware topology
AHCI                    | SATA interface with command lists and FIS structures
NVMe Queues             | Submission/Completion queue pairs for parallel I/O
Block Device            | Generic interface abstracting storage hardware
Next Steps
With all major subsystems in place, it's time to optimize. In Phase 16, we'll tune the scheduler, implement caching strategies, and profile performance to make the OS fast and responsive.