Introduction: Running Programs
Phase 9 Goals: By the end of this phase, your kernel will load and execute ELF binaries. You'll understand the ELF format, parse headers, load segments into memory, and jump to entry points to run real compiled programs.
In Phase 8, we created processes with embedded bytecode—the program was hardcoded directly into the kernel. That's like a restaurant where you can only order what the chef decided to make that morning. Real operating systems let you run any program—games, text editors, compilers, whatever you want. The magic that makes this possible is an executable format.
/*
* THE JOURNEY FROM SOURCE CODE TO RUNNING PROGRAM
* ================================================
*
* hello.c hello.o hello (ELF)
* ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
* │ #include... │ gcc │ .text │ ld │ ELF Header │
* │ │───────>│ .data │───────>│ Prog Headers│
* │ int main() │ -c │ .bss │ │ .text │
* │ { │ │ .rodata │ │ .data │
* │ printf(); │ │ Relocations │ │ .bss │
* │ } │ │ Symbols │ │ Entry Point │
* └─────────────┘ └─────────────┘ └─────────────┘
* Source Code Object File Executable
*
*
* WHAT THE OS LOADER DOES:
* ┌──────────────────────────────────────────────────────────┐
* │ 1. Read ELF header → Validate magic number │
* │ 2. Parse program headers → Find loadable segments │
* │ 3. Allocate memory → Map pages for each segment │
* │ 4. Copy code/data → Load from file to memory │
* │ 5. Zero BSS → Initialize uninitialized data │
* │ 6. Setup stack → Prepare argc, argv, environment │
* │ 7. Jump to entry → Start executing at e_entry │
* └──────────────────────────────────────────────────────────┘
*/
Key Insight: ELF (Executable and Linkable Format) is the standard executable format on Unix-like systems. Understanding ELF lets your OS run programs compiled with standard toolchains like GCC.
Every operating system needs a way to package programs. An executable format defines how code, data, and metadata are organized in a file. Think of it like a shipping container with a manifest—the format tells the loader where everything is.
Executable Format Evolution
| Format | Era | Used By | Characteristics |
|---|---|---|---|
| a.out | 1970s | Early Unix | Simple, fixed layout, limited |
| COFF | 1980s | System V, early Windows | Sections, symbols, debug info |
| PE (Portable Executable) | 1993 | Windows (.exe, .dll) | DOS stub, COFF-based, imports |
| Mach-O | 1989 | macOS, iOS | Fat binaries, load commands (LC_*) |
| ELF | Late 1980s (SVR4); Linux since about 1995 | Linux, BSD, Solaris, etc. | Flexible, extensible, standard |
Real-World Analogy: Think of executable formats like book formats. A physical book, an ebook, and an audiobook all contain "The Lord of the Rings," but they're packaged differently. Each reader (Kindle, Audible, your eyes) expects a specific format. Similarly, Windows expects PE files, macOS expects Mach-O, and Linux expects ELF.
Why ELF?
ELF (Executable and Linkable Format) became the standard for Unix-like systems because it solves real problems elegantly:
/*
* ELF ADVANTAGES OVER OLDER FORMATS
* ==================================
*
* 1. FLEXIBLE LAYOUT
* ┌─────────────────────────────────────────────┐
* │ Header can point anywhere in file │
* │ Sections/segments in any order │
* │ No fixed offsets -> easy to extend │
* └─────────────────────────────────────────────┘
*
* 2. DUAL VIEW (Linking vs Execution)
* ┌───────────────┬───────────────┐
* │ LINK VIEW │ EXEC VIEW │
* │ (Sections) │ (Segments) │
* ├───────────────┼───────────────┤
* │ .text │ LOAD (RX) │
* │ .rodata │ │
* ├───────────────┼───────────────┤
* │ .data │ LOAD (RW) │
* │ .bss │ │
* └───────────────┴───────────────┘
*
* 3. SUPPORTS EVERYTHING
* - Static executables (ET_EXEC)
* - Relocatable objects (ET_REL)
* - Shared libraries (ET_DYN)
* - Core dumps (ET_CORE)
* - Multiple architectures (x86, ARM, MIPS, RISC-V, ...)
*/
Two Views of ELF: ELF has separate "views"—the linking view (sections for compilers/linkers) and the execution view (segments for the OS loader). We only need the execution view to run programs. Segments tell us what to load where; sections are optional metadata for debuggers.
What You'll Build: By the end of this phase, you'll be able to compile a C program on your development machine, copy the resulting ELF binary to your OS's filesystem, and run it. Your homemade operating system will execute real compiled programs!
ELF File Structure
An ELF file is like a well-organized filing cabinet. At the front is an index (the ELF header) that tells you where to find everything else. The file can contain program headers (for loading), section headers (for linking/debugging), and the actual code and data.
/*
* ELF FILE LAYOUT
* ================
*
* ┌───────────────────────────────────────────────────────────┐
* │ ELF HEADER │
* │ Magic: 0x7F 'E' 'L' 'F' │
* │ Entry point, header table offsets, flags │
* ├───────────────────────────────────────────────────────────┤
* │ PROGRAM HEADERS │
* │ (Array of Elf32_Phdr structures) │
* │ Describe segments for loading into memory │
* │ ┌─────────────────────────────────────────────────────┐ │
* │ │ PT_LOAD: .text + .rodata (RX) at 0x08048000 │ │
* │ │ PT_LOAD: .data + .bss (RW) at 0x0804C000 │ │
* │ │ PT_INTERP: /lib/ld-linux.so.2 (dynamic only) │ │
* │ └─────────────────────────────────────────────────────┘ │
* ├───────────────────────────────────────────────────────────┤
* │ .text │
* │ Executable code (your main(), functions, etc.) │
* ├───────────────────────────────────────────────────────────┤
* │ .rodata │
* │ Read-only data (string literals, constants) │
* ├───────────────────────────────────────────────────────────┤
* │ .data │
* │ Initialized global/static variables │
* ├───────────────────────────────────────────────────────────┤
* │ .bss │
* │ Uninitialized globals (not stored, just size noted) │
* ├───────────────────────────────────────────────────────────┤
* │ SECTION HEADERS │
* │ (Array of Elf32_Shdr structures) │
* │ Metadata for linker, debugger, tools │
* └───────────────────────────────────────────────────────────┘
*/
The ELF header is always at offset 0 in the file. It's exactly 52 bytes for 32-bit ELF. Every field has a purpose:
/* ELF32 Header */
typedef struct {
uint8_t e_ident[16]; // Magic number and other info
uint16_t e_type; // Object file type
uint16_t e_machine; // Architecture
uint32_t e_version; // Object file version
uint32_t e_entry; // Entry point virtual address
uint32_t e_phoff; // Program header table offset
uint32_t e_shoff; // Section header table offset
uint32_t e_flags; // Processor-specific flags
uint16_t e_ehsize; // ELF header size
uint16_t e_phentsize; // Program header table entry size
uint16_t e_phnum; // Program header table entry count
uint16_t e_shentsize; // Section header table entry size
uint16_t e_shnum; // Section header table entry count
uint16_t e_shstrndx; // Section name string table index
} Elf32_Ehdr;
// ELF magic number
#define ELF_MAGIC 0x464C457F // "\x7FELF" in little endian
// e_type values
#define ET_NONE 0 // No file type
#define ET_REL 1 // Relocatable file
#define ET_EXEC 2 // Executable file
#define ET_DYN 3 // Shared object file
#define ET_CORE 4 // Core file
// e_machine for i386
#define EM_386 3
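The 52-byte claim is easy to let the compiler check. The sketch below repeats the header struct and adds C11 `static_assert` checks on its size and a few field offsets. The helper `elf32_phdr_table_bytes` is a hypothetical name introduced only for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;
    uint32_t e_phoff;
    uint32_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} Elf32_Ehdr;

/* These fail at compile time if the compiler inserted any padding. */
static_assert(sizeof(Elf32_Ehdr) == 52, "ELF32 header must be 52 bytes");
static_assert(offsetof(Elf32_Ehdr, e_entry) == 24, "e_entry at offset 24");
static_assert(offsetof(Elf32_Ehdr, e_phoff) == 28, "e_phoff at offset 28");
static_assert(offsetof(Elf32_Ehdr, e_phnum) == 44, "e_phnum at offset 44");

/* Hypothetical helper: total size in bytes of the program header table. */
uint32_t elf32_phdr_table_bytes(const Elf32_Ehdr* h) {
    return (uint32_t)h->e_phentsize * h->e_phnum;
}
```

Because every field is naturally aligned, no packing attribute should be needed, but the assertions catch the problem early if a compiler or a later edit changes that.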
ELF Header Field Breakdown
| Field | Size (bytes) | Purpose |
|---|---|---|
| e_ident[0-3] | 4 | Magic: 0x7F, 'E', 'L', 'F' |
| e_ident[4] | 1 | Class: 1=32-bit, 2=64-bit |
| e_ident[5] | 1 | Endianness: 1=little, 2=big |
| e_type | 2 | File type: REL, EXEC, DYN, CORE |
| e_machine | 2 | Architecture: 3=i386, 62=AMD64 |
| e_entry | 4 | Virtual address of entry point (_start) |
| e_phoff | 4 | Program header table file offset |
| e_phnum | 2 | Number of program headers |
Examining Real ELF Files: Use readelf -h on any Linux executable to see the header:
# Examine a real ELF header
$ readelf -h /bin/ls
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
Type: DYN (Position-Independent Executable file)
Machine: Advanced Micro Devices X86-64
Entry point address: 0x6aa0
Start of program headers: 64 (bytes into file)
Number of program headers: 13
...
# View the raw bytes of an ELF header
$ hexdump -C /bin/ls | head -4
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 03 00 3e 00 01 00 00 00 a0 6a 00 00 00 00 00 00 |..>......j......|
00000020 40 00 00 00 00 00 00 00 88 2a 02 00 00 00 00 00 |@........*......|
00000030 00 00 00 00 40 00 38 00 0d 00 40 00 1e 00 1d 00 |....@.8...@.....|
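The hexdump above can be decoded by hand to cross-check the readelf output. The sketch below copies those 64 bytes into an array and reads a few fields with explicit little-endian helpers. Note that /bin/ls here is an ELF64 file, so the offsets are the ELF64 ones (8-byte entry point and table offsets), not those of the `Elf32_Ehdr` used by our loader. The names `rd16le`, `rd32le`, `rd64le` and the `OFF_*` constants are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* The first 64 bytes of /bin/ls, copied from the hexdump above. */
const uint8_t ls_header[64] = {
    0x7f,0x45,0x4c,0x46,0x02,0x01,0x01,0x00, 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x03,0x00,0x3e,0x00,0x01,0x00,0x00,0x00, 0xa0,0x6a,0x00,0x00,0x00,0x00,0x00,0x00,
    0x40,0x00,0x00,0x00,0x00,0x00,0x00,0x00, 0x88,0x2a,0x02,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x40,0x00,0x38,0x00, 0x0d,0x00,0x40,0x00,0x1e,0x00,0x1d,0x00,
};

/* Assemble values byte by byte so host endianness and alignment do not matter. */
uint16_t rd16le(const uint8_t* p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t rd32le(const uint8_t* p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

uint64_t rd64le(const uint8_t* p) {
    return (uint64_t)rd32le(p) | ((uint64_t)rd32le(p + 4) << 32);
}

/* Byte offsets of selected fields in an ELF64 header. */
enum {
    OFF_TYPE      = 16,
    OFF_MACHINE   = 18,
    OFF_ENTRY     = 24,
    OFF_PHOFF     = 32,
    OFF_PHENTSIZE = 54,
    OFF_PHNUM     = 56
};
```

Reading `rd16le(ls_header + OFF_TYPE)` gives 3 (ET_DYN) and `rd64le(ls_header + OFF_ENTRY)` gives 0x6aa0, matching the "Type" and "Entry point address" lines that readelf printed.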
Program headers describe segments—contiguous chunks of the file to load into memory. For execution, we only care about PT_LOAD segments. Each header tells us:
- Where to find the segment in the file (p_offset)
- Where to load it in memory (p_vaddr)
- How many bytes to copy (p_filesz) and total memory needed (p_memsz)
- What permissions to set (p_flags: read, write, execute)
/* ELF32 Program Header */
typedef struct {
uint32_t p_type; // Segment type
uint32_t p_offset; // Segment file offset
uint32_t p_vaddr; // Segment virtual address
uint32_t p_paddr; // Segment physical address
uint32_t p_filesz; // Segment size in file
uint32_t p_memsz; // Segment size in memory
uint32_t p_flags; // Segment flags
uint32_t p_align; // Segment alignment
} Elf32_Phdr;
// Segment types
#define PT_NULL 0 // Unused
#define PT_LOAD 1 // Loadable segment
#define PT_DYNAMIC 2 // Dynamic linking info
#define PT_INTERP 3 // Interpreter pathname
#define PT_NOTE 4 // Auxiliary information
#define PT_PHDR 6 // Program header table
// Segment flags
#define PF_X 0x1 // Executable
#define PF_W 0x2 // Writable
#define PF_R 0x4 // Readable
The BSS Trick: When p_memsz > p_filesz, the extra bytes are BSS (uninitialized data). We don't store zeros in the file—we just note how many bytes to zero at load time. A program with a large uninitialized array doesn't bloat the executable file.
/*
* UNDERSTANDING PT_LOAD SEGMENTS
* ===============================
*
* File: Memory:
* ┌────────────────────┐ ┌────────────────────┐ 0x08048000
* │ .text (code) │ ────────→ │ .text (RX) │
* │ 0x1000 bytes │ │ 0x1000 bytes │
* ├────────────────────┤ ├────────────────────┤ 0x08049000
* │ .rodata (strings) │ ────────→ │ .rodata (R-) │
* │ 0x500 bytes │ │ 0x500 bytes │
* ├────────────────────┤ ├────────────────────┤
* │ (padding to page) │
* ├────────────────────┤ 0x0804A000
* │ .data (init vars) │ ────────→ │ .data (RW) │
* │ 0x100 bytes │ │ 0x100 bytes │
* └────────────────────┘ ├────────────────────┤
* │ .bss (zeroed) │ ← Not in file!
* │ 0x2000 bytes │ Just zeroed
* └────────────────────┘ at load time
*
* Typical program has 2 PT_LOAD segments:
* 1. Code segment (RX): .text + .rodata
* 2. Data segment (RW): .data + .bss
*/
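The arithmetic behind the diagram is small enough to write down. This is a minimal sketch, assuming 4 KiB pages and segments that do not wrap around the 32-bit address space. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Bytes the loader must zero after copying p_filesz bytes from the file. */
uint32_t segment_bss_bytes(uint32_t filesz, uint32_t memsz) {
    return memsz > filesz ? memsz - filesz : 0;
}

/* Number of 4 KiB pages touched by [vaddr, vaddr + memsz).
 * Assumes vaddr + memsz does not overflow 32 bits. */
uint32_t segment_page_count(uint32_t vaddr, uint32_t memsz) {
    if (memsz == 0) {
        return 0;
    }
    uint32_t first = vaddr / PAGE_SIZE;
    uint32_t last  = (vaddr + memsz - 1) / PAGE_SIZE;
    return last - first + 1;
}
```

For the data segment in the diagram (0x100 bytes in the file, 0x2100 bytes in memory at 0x0804A000), `segment_bss_bytes` yields 0x2000 and `segment_page_count` yields 3. A segment that starts mid-page, such as 0x200 bytes at 0x08049F00, spans 2 pages even though it is smaller than one page, which is why the start address must be aligned down before mapping.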
Section headers provide a linking view—detailed information for linkers, debuggers, and tools. For loading executables, we can ignore them entirely! But they're useful to understand:
/* ELF32 Section Header */
typedef struct {
uint32_t sh_name; // Section name (index into string table)
uint32_t sh_type; // Section type
uint32_t sh_flags; // Section flags
uint32_t sh_addr; // Virtual address in memory
uint32_t sh_offset; // Offset in file
uint32_t sh_size; // Size of section
uint32_t sh_link; // Link to another section
uint32_t sh_info; // Additional information
uint32_t sh_addralign; // Alignment
uint32_t sh_entsize; // Entry size if section holds table
} Elf32_Shdr;
// Common section types
#define SHT_NULL 0 // Inactive
#define SHT_PROGBITS 1 // Program data (.text, .data, .rodata)
#define SHT_SYMTAB 2 // Symbol table
#define SHT_STRTAB 3 // String table
#define SHT_NOBITS 8 // No file data (.bss)
Common ELF Sections
| Section | Type | Contents |
|---|---|---|
| .text | PROGBITS | Executable machine code |
| .rodata | PROGBITS | Read-only data (strings, constants) |
| .data | PROGBITS | Initialized writable data |
| .bss | NOBITS | Uninitialized data (zeroed) |
| .symtab | SYMTAB | Symbol table (functions, variables) |
| .strtab | STRTAB | String table (symbol names) |
| .shstrtab | STRTAB | Section name strings |
Key Difference: Sections are for tools (compilers, linkers, debuggers). Segments are for the OS loader. An ELF file always has a header, usually has program headers (for executables), and optionally has section headers (can be stripped).
Parsing ELF Files
Before we can load a program, we need to verify it's a valid ELF file for our architecture. This is called validation. Then we iterate through the program headers to find loadable segments.
Header Validation
ELF validation is crucial for security and stability. A malformed ELF file could crash the kernel or worse—be a deliberate attack. We check:
- Magic number: First 4 bytes must be 0x7F, 'E', 'L', 'F'
- Class: Must be 32-bit (we're on i386)
- Endianness: Must be little-endian (x86 standard)
- File type: Must be ET_EXEC (executable)
- Architecture: Must be EM_386 (Intel 80386)
/* Validate ELF header */
bool elf_validate(Elf32_Ehdr* header) {
// Check magic number
if (*(uint32_t*)header->e_ident != ELF_MAGIC) {
return false;
}
// Check class (32-bit)
if (header->e_ident[4] != 1) { // ELFCLASS32
return false;
}
// Check data encoding (little endian)
if (header->e_ident[5] != 1) { // ELFDATA2LSB
return false;
}
// Check file type (executable)
if (header->e_type != ET_EXEC) {
return false;
}
// Check machine type (i386)
if (header->e_machine != EM_386) {
return false;
}
return true;
}
Security Note: In a real OS, you'd also check that the entry point falls within a valid segment, that addresses don't overflow, and that the file isn't truncated. Never trust user-supplied data!
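The extra checks from the security note could look roughly like the sketch below. It is meant to run after `elf_validate` and works on a file image held in memory. The name `elf_check_layout`, the helper `range_in_file`, and the `USER_LIMIT` boundary of 0xC0000000 are assumptions for illustration, not part of the loader shown in this phase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32_Phdr;

#define PT_LOAD    1
#define USER_LIMIT 0xC0000000u  /* assumed start of kernel space */

/* True if [off, off + len) lies inside a file of `size` bytes.
 * Written so that off + len is never computed and cannot overflow. */
static bool range_in_file(uint32_t off, uint32_t len, size_t size) {
    return off <= size && len <= size - off;
}

/* Checks truncation, address overflow, and that the entry point falls
 * inside a loadable segment. */
bool elf_check_layout(const uint8_t* file, size_t size) {
    Elf32_Ehdr eh;
    if (size < sizeof(eh)) {
        return false;                       /* header itself is truncated */
    }
    memcpy(&eh, file, sizeof(eh));
    if (eh.e_phentsize != sizeof(Elf32_Phdr)) {
        return false;
    }
    uint32_t table_bytes = (uint32_t)eh.e_phnum * (uint32_t)sizeof(Elf32_Phdr);
    if (!range_in_file(eh.e_phoff, table_bytes, size)) {
        return false;                       /* program header table truncated */
    }
    bool entry_ok = false;
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf32_Phdr ph;
        memcpy(&ph, file + eh.e_phoff + (size_t)i * sizeof(ph), sizeof(ph));
        if (ph.p_type != PT_LOAD) {
            continue;
        }
        if (ph.p_filesz > ph.p_memsz) {
            return false;
        }
        if (!range_in_file(ph.p_offset, ph.p_filesz, size)) {
            return false;                   /* segment data truncated */
        }
        if (ph.p_vaddr >= USER_LIMIT || ph.p_memsz > USER_LIMIT - ph.p_vaddr) {
            return false;                   /* wraps or reaches kernel space */
        }
        if (eh.e_entry >= ph.p_vaddr && eh.e_entry - ph.p_vaddr < ph.p_memsz) {
            entry_ok = true;
        }
    }
    return entry_ok;
}
```

Each comparison is arranged as `len <= limit - start` instead of `start + len <= limit`, so a hostile file cannot pass a check by making the sum wrap around.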
Segment Loading
Once validated, we iterate through program headers looking for PT_LOAD segments. Each one needs memory allocated and data copied:
/*
* SEGMENT LOADING PROCESS
* ========================
*
* For each PT_LOAD segment:
*
* ELF File Process Memory
* ┌──────────────────┐ ┌──────────────────┐
* │ │ │ │
* │ p_offset ───────┼──────┐ │ │
* │ │ │ │ │
* ├──────────────────┤ │ ├──────────────────┤ p_vaddr
* │▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓│ └──────>│▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓│ ← memcpy
* │▓▓▓ p_filesz ▓▓▓▓▓│ │▓▓▓ code/data ▓▓▓▓│
* │▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓│ │▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓│
* ├──────────────────┤ ├──────────────────┤ p_vaddr + p_filesz
* │ │ │░░░░░░░░░░░░░░░░░░│ ← memset 0
* │ (not in file) │ │░░░░ BSS ░░░░░░░░░│ (if memsz > filesz)
* │ │ │░░░░░░░░░░░░░░░░░░│
* └──────────────────┘ └──────────────────┘ p_vaddr + p_memsz
*
* Steps:
* 1. Calculate number of pages needed
* 2. Allocate physical frames
* 3. Map pages at p_vaddr with correct permissions
* 4. Copy p_filesz bytes from file
* 5. Zero remaining (p_memsz - p_filesz) bytes
*/
/* Process a single PT_LOAD segment.
 * page_dir must be the active page directory: the memset/memcpy calls below
 * write through the segment's user virtual addresses. Writing to pages mapped
 * read-only also relies on CR0.WP being clear while the kernel loads them. */
void load_segment(Elf32_Phdr* phdr, uint8_t* file_data, uint32_t* page_dir) {
    uint32_t vaddr_start = phdr->p_vaddr & ~0xFFF; // Page-align down
    uint32_t vaddr_end = (phdr->p_vaddr + phdr->p_memsz + 0xFFF) & ~0xFFF; // Page-align up
    // Determine page flags
    uint32_t flags = PAGE_PRESENT | PAGE_USER;
    if (phdr->p_flags & PF_W) {
        flags |= PAGE_WRITE;
    }
    // Note: 32-bit x86 page tables have no execute bit
    // (the NX bit requires PAE or 64-bit mode)
    // Allocate and map pages, clearing each one so stale frame contents
    // never leak into the new process
    for (uint32_t vaddr = vaddr_start; vaddr < vaddr_end; vaddr += 0x1000) {
        uint32_t frame = alloc_frame();
        map_page(page_dir, vaddr, frame, flags);
        memset((void*)vaddr, 0, 0x1000);
    }
    // Copy segment data from file
    memcpy((void*)phdr->p_vaddr,
           file_data + phdr->p_offset,
           phdr->p_filesz);
    // Zero BSS portion (memsz > filesz); redundant after the page clearing
    // above, but kept to mirror step 5 of the loading process
    if (phdr->p_memsz > phdr->p_filesz) {
        memset((void*)(phdr->p_vaddr + phdr->p_filesz),
               0,
               phdr->p_memsz - phdr->p_filesz);
    }
}
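The copy-then-zero step can be tried in user space before it runs in the kernel. In this sketch a plain byte buffer stands in for the mapped pages; `region_t` and `copy_segment` are hypothetical names, and the function refuses to touch anything outside the buffer or the file image.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for mapped memory: `mem` backs the user addresses
 * [base, base + size). */
typedef struct {
    uint8_t* mem;
    uint32_t base;
    uint32_t size;
} region_t;

/* Copies filesz bytes from file + offset to vaddr, then zero-fills up to
 * memsz. Returns false instead of reading or writing out of bounds. */
bool copy_segment(const region_t* r, uint32_t vaddr, uint32_t offset,
                  uint32_t filesz, uint32_t memsz,
                  const uint8_t* file, size_t file_size) {
    if (filesz > memsz) {
        return false;
    }
    if (offset > file_size || filesz > file_size - offset) {
        return false;                       /* source range not in file */
    }
    if (vaddr < r->base) {
        return false;
    }
    uint32_t rel = vaddr - r->base;
    if (rel > r->size || memsz > r->size - rel) {
        return false;                       /* destination not in region */
    }
    uint8_t* dst = r->mem + rel;
    memcpy(dst, file + offset, filesz);         /* initialized part */
    memset(dst + filesz, 0, memsz - filesz);    /* BSS part */
    return true;
}
```

Filling the buffer with a marker byte such as 0xFF before the call makes it easy to see exactly which bytes were copied, which were zeroed, and which were left alone.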
Program Loader
The program loader is the component that takes an ELF file and transforms it into a running process. It combines everything we've built: memory management (Phase 6), filesystem (Phase 7), and process infrastructure (Phase 8).
Memory Setup
Each process needs its own address space. We create a new page directory and map the ELF segments into it. The memory layout follows Unix conventions:
/*
* TYPICAL USER PROCESS ADDRESS SPACE
* ====================================
*
* 0xFFFFFFFF ┌─────────────────────────────────────┐
* │ │
* │ Kernel Space (Not Accessible │
* │ from User Mode - Page Fault) │
* │ │
* 0xC0000000 ├─────────────────────────────────────┤
* │ (Reserved/Unmapped) │
* 0xBFFFF000 ├─────────────────────────────────────┤
* │ ↓ User Stack (grows down) │
* │ [argc][argv ptrs][env ptrs] │
* │ [actual strings...] │
* │ │
* ├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤
* │ (unmapped - stack guard) │
* │ │
* │ ... │
* │ │
* ├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┤
* │ ↑ Heap (grows up via sbrk) │
* 0x0804C000 ├─────────────────────────────────────┤
* │ .bss (zeroed) │
* │ .data (RW) │
* 0x0804A000 ├─────────────────────────────────────┤
* │ .rodata (R-) │
* │ .text (RX) │
* 0x08048000 ├─────────────────────────────────────┤ ← Typical start
* │ (unmapped - null guard) │
* 0x00000000 └─────────────────────────────────────┘
*/
/* Create address space for new process */
uint32_t* create_user_address_space(void) {
// Allocate page directory
uint32_t* page_dir = (uint32_t*)alloc_frame();
memset(page_dir, 0, 4096);
// Map kernel space (upper 1GB) - shared across all processes
// This lets syscalls work without switching page directories
for (int i = 768; i < 1024; i++) {
page_dir[i] = kernel_page_dir[i];
}
// Self-reference for recursive mapping trick
page_dir[1023] = (uint32_t)page_dir | PAGE_PRESENT | PAGE_WRITE;
return page_dir;
}
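The loop bounds 768 and 1024 follow from how 32-bit (non-PAE) paging splits a virtual address: the top 10 bits select a page directory entry, the next 10 bits a page table entry, and the low 12 bits the offset within the page. A small sketch with illustrative helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Index into the page directory: bits 31..22. Each entry covers 4 MiB. */
uint32_t pde_index(uint32_t vaddr) {
    return vaddr >> 22;
}

/* Index into the page table: bits 21..12. Each entry covers 4 KiB. */
uint32_t pte_index(uint32_t vaddr) {
    return (vaddr >> 12) & 0x3FFu;
}

/* Byte offset inside the page: bits 11..0. */
uint32_t page_offset(uint32_t vaddr) {
    return vaddr & 0xFFFu;
}
```

`pde_index(0xC0000000)` is 768, so entries 768 through 1023 cover exactly the upper 1 GB that holds the kernel. The traditional program base 0x08048000 lands in directory entry 32, table entry 72.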
Entry Point
The ELF header's e_entry field contains the virtual address where execution should begin. This is typically _start from crt0 (the C runtime startup code), which then calls main().
/* Load ELF executable */
uint32_t elf_load(uint8_t* file_data, uint32_t* page_dir) {
Elf32_Ehdr* header = (Elf32_Ehdr*)file_data;
if (!elf_validate(header)) {
return 0;
}
// Get program header table
Elf32_Phdr* phdrs = (Elf32_Phdr*)(file_data + header->e_phoff);
// Load each PT_LOAD segment
for (int i = 0; i < header->e_phnum; i++) {
Elf32_Phdr* phdr = &phdrs[i];
if (phdr->p_type != PT_LOAD) {
continue;
}
// Allocate pages for segment
uint32_t vaddr = phdr->p_vaddr;
uint32_t mem_size = phdr->p_memsz;
uint32_t file_size = phdr->p_filesz;
// Map pages (user-accessible)
uint32_t flags = PAGE_PRESENT | PAGE_USER;
if (phdr->p_flags & PF_W) {
flags |= PAGE_WRITE;
}
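// Start from the page containing vaddr so that a segment beginning
// mid-page still gets its last page mapped
for (uint32_t addr = vaddr & ~0xFFF; addr < vaddr + mem_size; addr += 0x1000) {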
uint32_t frame = alloc_frame();
map_page(page_dir, addr, frame, flags);
}
// Copy segment data
memcpy((void*)vaddr, file_data + phdr->p_offset, file_size);
// Zero remaining (BSS)
if (mem_size > file_size) {
memset((void*)(vaddr + file_size), 0, mem_size - file_size);
}
}
return header->e_entry; // Return entry point
}
Why 0x08048000? This traditional Linux start address leaves low memory unmapped as a "null pointer guard." Accessing NULL (address 0) causes a page fault instead of silently corrupting data—a helpful debugging feature!
Stack & Arguments
Before jumping to user code, we need to set up the stack with the expected arguments. C programs expect argc, argv, and envp at specific stack locations:
/*
* USER STACK LAYOUT (at program start)
* =====================================
*
* High addresses
* ┌────────────────────────────────┐
* │ "PATH=/bin:/usr/bin" │ ← Environment strings
* │ "/home/user/hello" │ ← argv[1] string
* │ "./hello" │ ← argv[0] string (program name)
* ├────────────────────────────────┤
* │ NULL │ ← End of envp[]
* │ ptr to "PATH=..." │ ← envp[0]
* ├────────────────────────────────┤
* │ NULL │ ← End of argv[]
* │ ptr to "/home/user/hello" │ ← argv[1]
* │ ptr to "./hello" │ ← argv[0]
* ├────────────────────────────────┤
* SP → │ argc = 2 │ ← Stack pointer here at entry
* └────────────────────────────────┘
* Low addresses
*
* _start in crt0 does:
* pop ecx ; ecx = argc
* mov esi, esp ; esi = argv
* lea edx, [esi+ecx*4+4] ; edx = envp (address just past argv's NULL)
* push edx ; envp
* push esi ; argv
* push ecx ; argc
* call main
*/
/* Setup user stack with arguments */
uint32_t setup_user_stack(uint32_t* page_dir, int argc, char** argv, char** envp) {
// Allocate stack pages (typically 8KB = 2 pages)
uint32_t stack_top = 0xBFFFF000; // One unmapped guard page below kernel space, as in the layout above
uint32_t stack_base = stack_top - 0x2000; // 8KB stack
for (uint32_t addr = stack_base; addr < stack_top; addr += 0x1000) {
uint32_t frame = alloc_frame();
map_page(page_dir, addr, frame, PAGE_PRESENT | PAGE_WRITE | PAGE_USER);
}
uint32_t sp = stack_top;
// Copy string data first (at top of stack)
// Then build pointer arrays
// Finally push argc
// 1. Copy environment strings (if any)
int envc = 0;
uint32_t* env_ptrs = NULL;
if (envp) {
while (envp[envc]) envc++;
env_ptrs = kmalloc(sizeof(uint32_t) * (envc + 1));
for (int i = 0; i < envc; i++) {
size_t len = strlen(envp[i]) + 1;
sp -= len;
memcpy((void*)sp, envp[i], len);
env_ptrs[i] = sp;
}
env_ptrs[envc] = 0; // NULL terminator
}
// 2. Copy argument strings
uint32_t* argv_ptrs = kmalloc(sizeof(uint32_t) * (argc + 1));
for (int i = 0; i < argc; i++) {
size_t len = strlen(argv[i]) + 1;
sp -= len;
memcpy((void*)sp, argv[i], len);
argv_ptrs[i] = sp;
}
argv_ptrs[argc] = 0; // NULL terminator
// Align to 4 bytes
sp &= ~0x3;
// 3. Push envp array
for (int i = envc; i >= 0; i--) {
sp -= 4;
*(uint32_t*)sp = env_ptrs ? env_ptrs[i] : 0;
}
// 4. Push argv array
for (int i = argc; i >= 0; i--) {
sp -= 4;
*(uint32_t*)sp = argv_ptrs[i];
}
// 5. Push argc
sp -= 4;
*(uint32_t*)sp = argc;
kfree(argv_ptrs);
if (env_ptrs) kfree(env_ptrs);
return sp; // Return stack pointer for entry
}
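The stack layout can be rehearsed in user space, where mistakes are cheap to debug. In this sketch a host buffer stands in for the mapped stack pages and all "user addresses" are plain 32-bit numbers. Every name here (`sim_stack_t`, `sim_ptr`, `sim_build_stack`, `MAX_ARGS`) is hypothetical, and the environment is left empty for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ARGS 16

/* Hypothetical simulated stack: `mem` backs user addresses [top - size, top). */
typedef struct {
    uint8_t* mem;
    uint32_t size;
    uint32_t top;
} sim_stack_t;

/* Translate a simulated user address into a host pointer. */
uint8_t* sim_ptr(const sim_stack_t* s, uint32_t uaddr) {
    return s->mem + (uaddr - (s->top - s->size));
}

uint32_t sim_read32(const sim_stack_t* s, uint32_t uaddr) {
    uint32_t v;
    memcpy(&v, sim_ptr(s, uaddr), sizeof v);
    return v;
}

static void push32(const sim_stack_t* s, uint32_t* sp, uint32_t value) {
    *sp -= 4;
    memcpy(sim_ptr(s, *sp), &value, sizeof value);
}

/* Builds [argc][argv...][NULL][envp NULL] below the argument strings.
 * Returns the initial stack pointer, or 0 if the arguments do not fit. */
uint32_t sim_build_stack(const sim_stack_t* s, int argc, const char* const* argv) {
    uint32_t ptrs[MAX_ARGS];
    if (argc < 0 || argc > MAX_ARGS) {
        return 0;
    }
    /* Words: argc + argv pointers + argv NULL + envp NULL; 3 bytes slack
     * for the alignment step. */
    uint32_t need = 4u * ((uint32_t)argc + 3u) + 3u;
    for (int i = 0; i < argc; i++) {
        need += (uint32_t)strlen(argv[i]) + 1u;
    }
    if (need > s->size) {
        return 0;
    }
    uint32_t sp = s->top;
    for (int i = 0; i < argc; i++) {        /* 1. string data */
        uint32_t len = (uint32_t)strlen(argv[i]) + 1u;
        sp -= len;
        memcpy(sim_ptr(s, sp), argv[i], len);
        ptrs[i] = sp;
    }
    sp &= ~3u;                              /* 2. align to 4 bytes */
    push32(s, &sp, 0);                      /* 3. envp terminator */
    push32(s, &sp, 0);                      /* 4. argv terminator */
    for (int i = argc - 1; i >= 0; i--) {
        push32(s, &sp, ptrs[i]);
    }
    push32(s, &sp, (uint32_t)argc);         /* 5. argc at the final SP */
    return sp;
}
```

After a call with two arguments, `sim_read32(&s, sp)` holds argc, `sp + 4` and `sp + 8` hold the two argv pointers, and `sp + 12` holds the NULL that terminates argv. The size check up front is the part the kernel version most needs: without it, a long argument list would write below the mapped stack pages.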
Implementing exec()
The exec() system call replaces the current process's memory image with a new program. Unlike fork(), which creates a copy, exec() transforms the process—same PID, new program. This is the Unix model for running programs.
/*
* THE EXEC SYSTEM CALL
* =====================
*
* Before exec():
* ┌─────────────────────────────────────────────────────────────┐
* │ Process 42 (/bin/shell) │
* │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐│
* │ │ shell code │ │ shell data │ │ stack: cmd="./hello" ││
* │ └─────────────┘ └─────────────┘ └─────────────────────────┘│
* │ PID=42, PPID=1, files={stdin,stdout,stderr} │
* └─────────────────────────────────────────────────────────────┘
* │
* │ exec("./hello", {"./hello", NULL})
* ▼
* After exec():
* ┌─────────────────────────────────────────────────────────────┐
* │ Process 42 (./hello) ← Same PID! │
* │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐│
* │ │ hello code │ │ hello data │ │ stack: argc=1,argv ││
* │ └─────────────┘ └─────────────┘ └─────────────────────────┘│
* │ PID=42, PPID=1, files={stdin,stdout,stderr} ← Files kept! │
* └─────────────────────────────────────────────────────────────┘
*
* exec() keeps: PID, PPID, file descriptors, working directory
* exec() replaces: code, data, stack, heap, signal handlers
*/
Fork-Exec Pattern: Unix typically uses fork() + exec() together. The shell forks, the child calls exec to run the command, and the parent waits. This separation allows the child to set up redirections, pipes, and environment before exec.
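/* Execute a program.
 * argv must already have been copied into kernel memory: the caller's user
 * mappings are no longer visible once the new address space is active. */
int sys_exec(const char* path, char* const argv[]) {
    // Read file from filesystem
    vfs_node_t* file = vfs_open(path);
    if (!file) {
        return -1;
    }
    // Read entire file into memory
    uint32_t size = file->length;
    uint8_t* data = kmalloc(size);
    if (!data) {
        vfs_close(file);
        return -1;
    }
    if (vfs_read(file, 0, size, data) != size) {
        kfree(data);
        vfs_close(file);
        return -1;
    }
    vfs_close(file);
    // Create new address space and make it active. elf_load() and
    // setup_user_stack() write through user virtual addresses, and the
    // kernel keeps running because its upper 1GB is mapped in every process.
    pcb_t* proc = get_current_process();
    uint32_t* old_page_dir = proc->page_directory;
    uint32_t* new_page_dir = create_user_address_space();
    switch_page_directory(new_page_dir);
    // Load ELF
    uint32_t entry = elf_load(data, new_page_dir);
    kfree(data);
    if (!entry) {
        // Back out: the old program image is still intact
        switch_page_directory(old_page_dir);
        free_page_directory(new_page_dir);
        return -1;
    }
    // Map the user stack and copy the arguments onto it
    int argc = count_args(argv);
    uint32_t sp = setup_user_stack(new_page_dir, argc, (char**)argv, NULL);
    // Commit: the process now owns the new address space
    proc->page_directory = new_page_dir;
    if (old_page_dir != kernel_page_directory) {
        free_page_directory(old_page_dir);
    }
    // Jump to user mode
    enter_user_mode(entry, sp);
    // Never reaches here
    return 0;
}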
The exec() Family
| Function | Path Resolution | Arguments | Environment |
|---|---|---|---|
| execl | Exact path | List | Inherited |
| execv | Exact path | Array | Inherited |
| execle | Exact path | List | Explicit |
| execve | Exact path | Array | Explicit |
| execlp | Search PATH | List | Inherited |
| execvp | Search PATH | Array | Inherited |
l=list arguments, v=vector(array), e=explicit environment, p=PATH search
Error Handling: If exec() succeeds, it never returns—the new program simply starts running. If it fails (file not found, not executable, permission denied), it returns -1 and sets errno. The calling code must handle this:
/* Example: Shell executing a command */
void shell_exec_command(char* path, char** argv) {
pid_t pid = fork();
if (pid == 0) {
// Child process
execv(path, argv);
// If we get here, exec failed!
printf("exec failed: %s\n", strerror(errno));
exit(127); // Standard exit code for "command not found"
} else if (pid > 0) {
// Parent: wait for child
int status;
waitpid(pid, &status, 0);
if (WIFEXITED(status)) {
printf("Program exited with code %d\n", WEXITSTATUS(status));
}
} else {
// fork() failed
printf("fork failed\n");
}
}
What You Can Build
Phase 9 Project: An OS that runs real programs! Compile C programs with GCC, produce ELF binaries, and your kernel loads and executes them. You can now run programs written by others on your OS.
Project 1: Simple ELF Loader
Build a complete ELF loader that can run statically linked programs:
/* Complete ELF loader implementation */
#include "elf.h"
#include "paging.h"
#include "process.h"
#include "fs.h"
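/* Main ELF loading function.
 * argv must point into kernel memory: the caller's user mappings are no
 * longer visible once we switch to the new address space in step 4. */
int elf_exec(const char* path, int argc, char** argv) {
    // 1. Open the file
    vfs_node_t* file = vfs_open(path);
    if (!file) {
        return -ENOENT;
    }
    // 2. Read ELF header
    Elf32_Ehdr ehdr;
    if (vfs_read(file, 0, sizeof(ehdr), &ehdr) != sizeof(ehdr)) {
        vfs_close(file);
        return -EIO;
    }
    // 3. Validate header; indexing phdrs[] below relies on the entry size
    if (!elf_validate(&ehdr) || ehdr.e_phentsize != sizeof(Elf32_Phdr)) {
        vfs_close(file);
        return -ENOEXEC;
    }
    // 4. Create the new address space and make it active. load_segment()
    //    and setup_user_stack() write through user virtual addresses; the
    //    kernel stays mapped because the upper 1GB is shared.
    pcb_t* proc = get_current_process();
    uint32_t* old_pd = proc->page_directory;
    uint32_t* new_pd = create_user_address_space();
    switch_page_directory(new_pd);
    // 5. Read and process program headers
    int err = 0;
    size_t phdr_size = sizeof(Elf32_Phdr) * ehdr.e_phnum;
    Elf32_Phdr* phdrs = kmalloc(phdr_size);
    if (!phdrs) {
        err = -ENOMEM;
    } else if (vfs_read(file, ehdr.e_phoff, phdr_size, phdrs) != phdr_size) {
        err = -EIO;
    }
    for (int i = 0; !err && i < ehdr.e_phnum; i++) {
        if (phdrs[i].p_type != PT_LOAD) {
            continue;
        }
        // Read just this segment's bytes from the file
        uint8_t* segment_data = kmalloc(phdrs[i].p_filesz);
        if (!segment_data) {
            err = -ENOMEM;
            break;
        }
        if (vfs_read(file, phdrs[i].p_offset, phdrs[i].p_filesz, segment_data)
                != phdrs[i].p_filesz) {
            err = -EIO;
        } else {
            // segment_data already starts at the segment's first byte, so
            // load_segment() must not add p_offset a second time
            Elf32_Phdr seg = phdrs[i];
            seg.p_offset = 0;
            load_segment(&seg, segment_data, new_pd);
        }
        kfree(segment_data);
    }
    if (phdrs) {
        kfree(phdrs);
    }
    vfs_close(file);
    if (err) {
        // Back out: the old program image is still intact
        switch_page_directory(old_pd);
        free_page_directory(new_pd);
        return err;
    }
    // 6. Setup user stack with arguments
    uint32_t user_sp = setup_user_stack(new_pd, argc, argv, NULL);
    // 7. Commit: update the process and free the old address space
    proc->page_directory = new_pd;
    if (old_pd != kernel_page_directory) {
        free_page_directory(old_pd);
    }
    // 8. Jump to the entry point in user mode
    enter_user_mode(ehdr.e_entry, user_sp);
    // Never reached
    return 0;
}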
Project 2: readelf Command
Build a user-space tool that displays ELF file information (like the Linux readelf command):
/* User-space readelf utility */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "elf.h"
void print_elf_header(Elf32_Ehdr* ehdr) {
printf("ELF Header:\n");
printf(" Magic: ");
for (int i = 0; i < 16; i++) {
printf("%02x ", ehdr->e_ident[i]);
}
printf("\n");
printf(" Class: ");
printf("%s\n", ehdr->e_ident[4] == 1 ? "ELF32" : "ELF64");
printf(" Data: ");
printf("%s\n", ehdr->e_ident[5] == 1 ? "little endian" : "big endian");
const char* types[] = {"NONE", "REL", "EXEC", "DYN", "CORE"};
printf(" Type: %s\n",
ehdr->e_type < 5 ? types[ehdr->e_type] : "UNKNOWN");
printf(" Entry point address: 0x%x\n", ehdr->e_entry);
printf(" Start of program headers: %u (bytes into file)\n", ehdr->e_phoff);
printf(" Number of program headers: %d\n", ehdr->e_phnum);
printf(" Start of section headers: %u (bytes into file)\n", ehdr->e_shoff);
printf(" Number of section headers: %d\n", ehdr->e_shnum);
}
void print_program_headers(FILE* f, Elf32_Ehdr* ehdr) {
printf("\nProgram Headers:\n");
printf(" Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align\n");
fseek(f, ehdr->e_phoff, SEEK_SET);
const char* ptypes[] = {"NULL", "LOAD", "DYNAMIC", "INTERP", "NOTE",
"SHLIB", "PHDR", "TLS"};
for (int i = 0; i < ehdr->e_phnum; i++) {
Elf32_Phdr phdr;
fread(&phdr, sizeof(phdr), 1, f);
printf(" %-14s 0x%06x 0x%08x 0x%08x 0x%05x 0x%05x %c%c%c 0x%x\n",
phdr.p_type < 8 ? ptypes[phdr.p_type] : "UNKNOWN",
phdr.p_offset, phdr.p_vaddr, phdr.p_paddr,
phdr.p_filesz, phdr.p_memsz,
phdr.p_flags & PF_R ? 'R' : ' ',
phdr.p_flags & PF_W ? 'W' : ' ',
phdr.p_flags & PF_X ? 'E' : ' ',
phdr.p_align);
}
}
int main(int argc, char** argv) {
if (argc < 2) {
printf("Usage: readelf <file>\n");
return 1;
}
FILE* f = fopen(argv[1], "rb");
if (!f) {
printf("Cannot open %s\n", argv[1]);
return 1;
}
Elf32_Ehdr ehdr;
if (fread(&ehdr, sizeof(ehdr), 1, f) != 1) {
printf("Cannot read ELF header from %s\n", argv[1]);
fclose(f);
return 1;
}
if (*(uint32_t*)ehdr.e_ident != ELF_MAGIC) {
printf("Not an ELF file\n");
fclose(f);
return 1;
}
print_elf_header(&ehdr);
print_program_headers(f, &ehdr);
fclose(f);
return 0;
}
Project 3: Compile Test Programs
Create simple test programs to verify your loader works. Use a cross-compiler targeting your OS:
# Create a minimal C runtime (crt0.S)
cat > crt0.S << 'EOF'
.section .text
.global _start
_start:
# Standard System V i386 ABI startup
# Stack: [argc][argv0][argv1]...[NULL][envp0]...[NULL]
xor %ebp, %ebp # Clear frame pointer (end of stack trace)
pop %ecx # argc
mov %esp, %esi # argv
# Calculate envp = argv + (argc + 1) * 4
lea 4(%esi,%ecx,4), %edx
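# Align stack to 16 bytes, then reserve 4 bytes of padding so that %esp
# is 16-byte aligned again at the call, after the three pushes (ABI requirement)
and $0xFFFFFFF0, %esp
sub $4, %esp
push %edx # envp
push %esi # argv
push %ecx # argc
call main # Call main(argc, argv, envp)
mov %eax, (%esp) # Exit code reuses the argc slot, keeping %esp aligned
call exit # exit(return value)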
# If exit returns (shouldn't), halt
hlt
EOF
# Create a minimal syscall wrapper
cat > syscalls.c << 'EOF'
static inline int syscall1(int num, int arg1) {
int ret;
// "memory" clobber: the kernel may read or write buffers passed by pointer
asm volatile ("int $0x80" : "=a"(ret) : "a"(num), "b"(arg1) : "memory");
return ret;
}
static inline int syscall3(int num, int arg1, int arg2, int arg3) {
int ret;
asm volatile ("int $0x80" : "=a"(ret) : "a"(num), "b"(arg1), "c"(arg2), "d"(arg3) : "memory");
return ret;
}
void exit(int status) {
syscall1(1, status); // SYS_EXIT = 1
while(1);
}
int write(int fd, const void* buf, int count) {
return syscall3(4, fd, (int)buf, count); // SYS_WRITE = 4
}
void puts(const char* s) {
int len = 0;
while (s[len]) len++;
write(1, s, len);
write(1, "\n", 1);
}
EOF
# Write a test program
cat > hello.c << 'EOF'
void puts(const char* s);
void exit(int status);
int main(int argc, char** argv) {
puts("Hello from user space!");
puts("Arguments:");
for (int i = 0; i < argc; i++) {
puts(argv[i]);
}
return 42; // Exit code
}
EOF
# Compile everything (using cross-compiler)
i686-elf-as -o crt0.o crt0.S
i686-elf-gcc -c -ffreestanding -O2 -o syscalls.o syscalls.c
i686-elf-gcc -c -ffreestanding -O2 -o hello.o hello.c
# Link into static ELF executable
i686-elf-ld -T link.ld -o hello crt0.o syscalls.o hello.o
# Verify it's a valid ELF
i686-elf-readelf -h hello
Exercises
Exercise 1: Dynamic Linker Stub
Goal: Detect dynamically linked executables and print an error message.
- Check for a PT_INTERP segment in the program headers
- If found, read the interpreter path (e.g., "/lib/ld-linux.so.2")
- Return -ENOEXEC with the message "Dynamic linking not supported"
- Later: Implement basic dynamic linking (Phase 10+)
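The detection step can be sketched as below. This is a minimal sketch, not the series' loader code: elf_check_interp is a hypothetical helper name, it assumes the whole file is already in memory as image, and it defines its own copy of the standard ELF32 program header so the snippet stands alone.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PT_INTERP 3          /* program header type holding the interpreter path */
#ifndef ENOEXEC
#define ENOEXEC 8            /* "exec format error" (same value Linux uses) */
#endif

/* Standard ELF32 program header layout. */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32_Phdr;

/*
 * Hypothetical helper: scan the program headers for PT_INTERP.
 * Returns 0 for a statically linked file. Returns -ENOEXEC if an
 * interpreter is requested; the path is copied into out when the
 * PT_INTERP entry lies inside the file, otherwise out stays empty.
 */
int elf_check_interp(const uint8_t *image, size_t image_size,
                     const Elf32_Phdr *phdrs, uint16_t phnum,
                     char *out, size_t out_size)
{
    if (out_size > 0)
        out[0] = '\0';

    for (uint16_t i = 0; i < phnum; i++) {
        const Elf32_Phdr *ph = &phdrs[i];
        if (ph->p_type != PT_INTERP)
            continue;

        /* Bounds-check before touching the path bytes (overflow-safe form). */
        if (ph->p_filesz == 0 || ph->p_offset > image_size ||
            ph->p_filesz > image_size - ph->p_offset)
            return -ENOEXEC;

        if (out_size > 0) {
            size_t n = ph->p_filesz < out_size - 1 ? ph->p_filesz : out_size - 1;
            memcpy(out, image + ph->p_offset, n);
            out[n] = '\0';   /* force termination even if the file lacks a NUL */
        }
        return -ENOEXEC;
    }
    return 0;
}
```

The caller can print "Dynamic linking not supported" together with the returned path, then fail the exec with the same error code.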
Exercise 2: Shebang Support
Goal: Allow scripts to be executed directly.
- Check if the file starts with #! (shebang)
- Parse the interpreter path (e.g., #!/bin/sh)
- Recursively exec the interpreter with the script as its argument
- Handle arguments on the shebang line (e.g., #!/usr/bin/env python)
int exec_or_script(const char* path, char** argv) {
    // Read the first 2 bytes; zero-init so a short or failed read can't match
    char magic[2] = {0, 0};
    read_file(path, magic, 2);
    if (magic[0] == '#' && magic[1] == '!') {
        // Parse interpreter from first line
        char interp[256];
        read_shebang_line(path, interp);
        // Build new argv: [interpreter, script, original args...]
        char** new_argv = build_script_argv(interp, path, argv);
        return elf_exec(interp, count_args(new_argv), new_argv);
    }
    // Regular ELF
    return elf_exec(path, count_args(argv), argv);
}
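The line-parsing step of this exercise might look like the sketch below. parse_shebang is a hypothetical name, not a function defined elsewhere in this series. It treats everything after the interpreter path as a single optional argument, which is how Linux handles shebang lines, and assumes the caller has already read the start of the file into a buffer.

```c
#include <stddef.h>
#include <string.h>

static int is_blank(char c) { return c == ' ' || c == '\t' || c == '\r'; }

/*
 * Hypothetical helper: split "#!interp [arg]" into its two parts.
 * line/len is the start of the script (no NUL terminator required).
 * Returns 0 on success, -1 if this is not a usable shebang line or
 * a part does not fit its output buffer. arg is "" when absent.
 */
int parse_shebang(const char *line, size_t len,
                  char *interp, size_t interp_size,
                  char *arg, size_t arg_size)
{
    if (len < 2 || line[0] != '#' || line[1] != '!')
        return -1;
    if (interp_size == 0 || arg_size == 0)
        return -1;

    /* Only the first line counts: stop at the newline if there is one. */
    const char *nl = memchr(line, '\n', len);
    size_t n = nl ? (size_t)(nl - line) : len;

    size_t i = 2;
    while (i < n && is_blank(line[i])) i++;          /* allow "#! /bin/sh" */
    size_t start = i;
    while (i < n && !is_blank(line[i])) i++;
    size_t ilen = i - start;
    if (ilen == 0 || ilen >= interp_size)
        return -1;
    memcpy(interp, line + start, ilen);
    interp[ilen] = '\0';

    /* Rest of the line, trimmed on both sides, is one optional argument. */
    while (i < n && is_blank(line[i])) i++;
    size_t alen = n - i;
    while (alen > 0 && is_blank(line[i + alen - 1])) alen--;
    if (alen >= arg_size)
        return -1;
    memcpy(arg, line + i, alen);
    arg[alen] = '\0';
    return 0;
}
```

For #!/usr/bin/env python this yields the interpreter /usr/bin/env and the argument python, so the new argv becomes [/usr/bin/env, python, script, ...].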
Exercise 3: Position-Independent Executables (PIE)
Goal: Support executables that can load at any address.
- Detect the ET_DYN type (used for PIE and shared libs)
- Choose a random base address (ASLR)
- Add the base offset to all segment virtual addresses
- Adjust the entry point by the same offset
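Why: Most modern Linux distributions build programs as PIE by default. Because a PIE can load at any base, ASLR can randomize its address on every run, which makes exploits that depend on fixed addresses harder to write.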
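The address arithmetic behind these steps is small. The sketch below uses hypothetical helper names (pie_choose_base, elf_load_bias, elf_rebase) and assumes a 4 KiB page size, a page-aligned min_base, and a random number supplied by the caller. It is an illustration of the idea, not a complete PIE loader.

```c
#include <stdint.h>

#define ET_EXEC 2                 /* fixed-address executable */
#define ET_DYN  3                 /* PIE or shared object */
#define ELF_PAGE_SIZE 0x1000u     /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: turn a random number into a page-aligned base
 * inside [min_base, max_base). min_base must itself be page-aligned.
 */
uint32_t pie_choose_base(uint32_t rnd, uint32_t min_base, uint32_t max_base)
{
    if (max_base <= min_base)
        return min_base;
    uint32_t pages = (max_base - min_base) / ELF_PAGE_SIZE;
    if (pages == 0)
        return min_base;
    return min_base + (rnd % pages) * ELF_PAGE_SIZE;
}

/* ET_EXEC files load exactly where they ask; only ET_DYN gets shifted. */
uint32_t elf_load_bias(uint16_t e_type, uint32_t base)
{
    return e_type == ET_DYN ? base : 0;
}

/* Apply the same bias to every p_vaddr and to e_entry. */
uint32_t elf_rebase(uint32_t addr, uint32_t bias)
{
    return addr + bias;
}
```

Shifting the segments and the entry point is only part of the job: data in a PIE that stores absolute addresses still needs its relocations processed, which this sketch does not cover.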
Exercise 4: ELF Section Viewer
Goal: Extend readelf to show section headers and symbol tables.
- Parse the section header table at e_shoff
- Read the section name string table (e_shstrndx)
- Display section names, types, addresses, sizes
- Parse .symtab to show function/variable names
# Your readelf output should look like:
$ readelf -S hello
Section Headers:
  [Nr] Name      Type      Addr      Off     Size
  [ 0]           NULL      00000000  000000  000000
  [ 1] .text     PROGBITS  08048000  001000  000234
  [ 2] .rodata   PROGBITS  08048234  001234  000048
  [ 3] .data     PROGBITS  08049000  002000  000010
  [ 4] .bss      NOBITS    08049010  002010  000100
  [ 5] .symtab   SYMTAB    00000000  002010  000180
  [ 6] .strtab   STRTAB    00000000  002190  000090
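Each name in the Name column comes from the string table section: sh_name is a byte offset into it. A minimal sketch of that lookup follows. elf_section_name is a hypothetical helper, the file is assumed to be fully in memory, and the struct repeats the standard ELF32 section header layout so the snippet stands alone.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard ELF32 section header layout. */
typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size;
    uint32_t sh_link, sh_info, sh_addralign, sh_entsize;
} Elf32_Shdr;

/*
 * Hypothetical helper: resolve the name of section sec.
 * shstr is the header of the section name string table, i.e. entry
 * number e_shstrndx of the section header table.
 * Returns NULL if any offset falls outside the file or the name has
 * no terminating NUL inside the table.
 */
const char *elf_section_name(const uint8_t *image, size_t image_size,
                             const Elf32_Shdr *shstr, const Elf32_Shdr *sec)
{
    /* The string table itself must lie inside the file. */
    if (shstr->sh_offset > image_size ||
        shstr->sh_size > image_size - shstr->sh_offset)
        return NULL;

    /* sh_name is a byte offset into the string table. */
    if (sec->sh_name >= shstr->sh_size)
        return NULL;

    const char *name = (const char *)image + shstr->sh_offset + sec->sh_name;
    size_t remaining = shstr->sh_size - sec->sh_name;

    /* Refuse names that run off the end of the table. */
    if (memchr(name, '\0', remaining) == NULL)
        return NULL;
    return name;
}
```

The same lookup works for symbol names: use the .strtab section header in place of shstr and the symbol's st_name in place of sh_name.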
Next Steps
With program loading working, your OS can now run real compiled programs! But those programs need a way to interact with the system—they need a standard library. And we need a shell to manage running programs interactively.
Phase 9 Achievements:
- Understand the ELF executable format (headers, segments, sections)
- Parse and validate ELF files for security
- Load program segments into memory with correct permissions
- Set up the user stack with argc, argv, envp
- Implement the exec() system call
- Run real GCC-compiled programs on your OS!
/*
* PHASE 10 PREVIEW: STANDARD LIBRARY & SHELL
* ===========================================
*
* Right now, programs must use raw syscalls. Phase 10 adds:
*
* 1. MINIMAL C LIBRARY
* ┌────────────────────────────────────────────────────────┐
* │ stdio.h: printf(), scanf(), fopen(), fclose(), ... │
* │ stdlib.h: malloc(), free(), exit(), atoi(), ... │
* │ string.h: strlen(), strcpy(), memcpy(), strcmp(), ... │
* │ unistd.h: read(), write(), fork(), exec(), pipe() │
* └────────────────────────────────────────────────────────┘
*
* 2. COMMAND-LINE SHELL
* ╔════════════════════════════════════════════════════════╗
* ║ MyOS Shell v1.0 ║
* ║ $ ls ║
* ║ bin/ home/ etc/ ║
* ║ $ cat hello.c ║
* ║ int main() { puts("Hello!"); return 0; } ║
* ║ $ ./hello ║
* ║ Hello! ║
* ║ $ echo $? ║
* ║ 0 ║
* ╚════════════════════════════════════════════════════════╝
*
* 3. BASIC UTILITIES
* - ls: list directory contents
* - cat: display file contents
* - echo: print arguments
* - cd: change directory
* - pwd: print working directory
* - ps: list processes
* - kill: send signals
*
* Your OS will finally feel like a real Unix system!
*/
Key Takeaways
- ELF is elegant: A single format for executables, objects, and libraries. The dual-view design (linking vs execution) keeps it flexible.
- Segments matter, sections optional: For loading programs, we only need program headers. Section headers are for tools.
- BSS saves space: Zero-initialized data isn't stored in the file—we just note the size and zero it at load time.
- exec() transforms: It replaces the current process image, keeping PID and file descriptors but creating a fresh address space.
- Cross-compilation is key: You need a compiler that produces ELF binaries for your target architecture and links against your minimal C library.
- Validation matters: Never trust user-supplied data. Validate every field, check bounds, verify addresses.
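To make the validation and BSS takeaways concrete, here is a minimal sketch of per-segment checks for a PT_LOAD entry. The helper names and the USER_MIN/USER_MAX limits are assumptions for illustration (a higher-half kernel starting at 0xC0000000); substitute the limits of your own memory map.

```c
#include <stddef.h>
#include <stdint.h>

#define USER_MIN 0x00001000u   /* assumption: keep page 0 unmapped */
#define USER_MAX 0xC0000000u   /* assumption: kernel space starts here */

/* Standard ELF32 program header layout. */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32_Phdr;

/* Hypothetical helper: returns 1 if the segment is safe to load, else 0. */
int elf_segment_ok(const Elf32_Phdr *ph, size_t file_size)
{
    /* The file cannot supply more bytes than the segment occupies. */
    if (ph->p_filesz > ph->p_memsz)
        return 0;

    /* File range must lie inside the file (written to avoid overflow). */
    if (ph->p_offset > file_size || ph->p_filesz > file_size - ph->p_offset)
        return 0;

    /* Memory range must stay inside user space and must not wrap. */
    if (ph->p_vaddr < USER_MIN || ph->p_vaddr >= USER_MAX)
        return 0;
    if (ph->p_memsz > USER_MAX - ph->p_vaddr)
        return 0;

    return 1;
}

/*
 * Bytes past p_filesz are the BSS: not stored in the file, zeroed by
 * the loader. Only meaningful after elf_segment_ok() has passed.
 */
uint32_t elf_bss_bytes(const Elf32_Phdr *ph)
{
    return ph->p_memsz - ph->p_filesz;
}
```

Writing each check as a comparison against the remaining space (for example file_size - p_offset) rather than adding offset and size first avoids the integer overflow that a hostile file could otherwise use to slip past the bounds test.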