General-Purpose Registers
Core Concept: General-purpose registers are the CPU's working memory. x86-64 provides 16 64-bit general-purpose registers (RAX-R15) that can be accessed in different sizes for backward compatibility.
The RAX Family (Accumulator)
RAX / EAX / AX / AH / AL
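; Bits:  63 ........ 32 | 31 ...... 16 | 15 ... 8 | 7 ... 0
; RAX:   [--------------------- RAX ----------------------]
; EAX:                    [------------- EAX -------------]
; AX:                                    [------ AX ------]
; AH/AL:                                 [  AH  ]   [ AL  ]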
mov rax, 0x123456789ABCDEF0 ; Full 64-bit
mov eax, 0x12345678 ; Lower 32-bit (zeros upper 32)
mov ax, 0x1234 ; Lower 16-bit
mov ah, 0x12 ; High byte of AX
mov al, 0x34 ; Low byte of AX
Complete General-Purpose Set
x86-64 Register Map
| 64-bit | 32-bit | 16-bit | 8-bit High | 8-bit Low | Traditional Use |
| RAX | EAX | AX | AH | AL | Accumulator |
| RBX | EBX | BX | BH | BL | Base |
| RCX | ECX | CX | CH | CL | Counter |
| RDX | EDX | DX | DH | DL | Data |
| RSI | ESI | SI | - | SIL | Source Index |
| RDI | EDI | DI | - | DIL | Destination Index |
| RBP | EBP | BP | - | BPL | Base Pointer |
| RSP | ESP | SP | - | SPL | Stack Pointer |
| R8-R15 | R8D-R15D | R8W-R15W | - | R8B-R15B | x64 Extended |
Sub-Register Access Gotchas
Understanding how partial register writes behave is crucial to avoid subtle bugs:
The Zero-Extension Rule (64-bit mode)
; CRITICAL RULE: Writing to 32-bit register ZEROS the upper 32 bits!
mov rax, 0xFFFFFFFF_FFFFFFFF ; RAX = full 64-bit value
mov eax, 0x12345678 ; RAX = 0x00000000_12345678 (!)
; But 8-bit and 16-bit writes DO NOT zero-extend:
mov rax, 0xFFFFFFFF_FFFFFFFF ; RAX = full 64-bit value
mov ax, 0x1234 ; RAX = 0xFFFFFFFF_FFFF1234
mov al, 0x56 ; RAX = 0xFFFFFFFF_FFFF1256
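Common Bug Source: Forgetting that mov eax, val clears the upper 32 bits of RAX. This is intentional: a 32-bit write never depends on the register's old upper half, so the CPU does not have to merge partial results. It still catches beginners. When the upper 32 bits must survive, operate on the full register (mov rax, val). When a signed 32-bit value must become a signed 64-bit one, sign-extend it explicitly (movsxd rax, eax), because the implicit extension is always a zero-extension.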
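The write rules above can be checked with a small model. The following is a minimal Python sketch, not real hardware access: the helper names (write_r64, write_r32, and so on) are made up for illustration, and each one takes the old 64-bit register value and returns the new one.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def write_r64(reg, value):
    """Model of 'mov rax, imm64': all 64 bits are replaced."""
    return value & MASK64

def write_r32(reg, value):
    """Model of 'mov eax, imm32': low 32 bits written, bits 63:32 zeroed."""
    return value & 0xFFFFFFFF

def write_r16(reg, value):
    """Model of 'mov ax, imm16': bits 15:0 replaced, bits 63:16 preserved."""
    return (reg & ~0xFFFF & MASK64) | (value & 0xFFFF)

def write_r8_low(reg, value):
    """Model of 'mov al, imm8': bits 7:0 replaced, bits 63:8 preserved."""
    return (reg & ~0xFF & MASK64) | (value & 0xFF)

def write_r8_high(reg, value):
    """Model of 'mov ah, imm8': bits 15:8 replaced, everything else preserved."""
    return (reg & ~0xFF00 & MASK64) | ((value & 0xFF) << 8)

# Replay the "Predict the Output" sequence from this section.
rax = 0                              # xor rax, rax
rax = write_r32(rax, 0xDEADBEEF)     # mov eax, 0xDEADBEEF
rax = write_r16(rax, 0x1234)         # mov ax, 0x1234
rax = write_r8_low(rax, 0xFF)        # mov al, 0xFF
print(f"{rax:#018x}")                # 0x00000000dead12ff
```

Only write_r32 discards the old upper bits; the 8-bit and 16-bit helpers have to read the old value, which is the merge that partial-register writes force on the hardware.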
Performance: Partial Register Stalls
; This code may stall on older CPUs:
mov rax, 0
mov ah, 1 ; Write to partial register (AH)
mov rbx, rax ; Read full register - possible stall!
; Better: Avoid AH/BH/CH/DH in 64-bit code
movzx eax, byte [value] ; Zero-extend to full register
shl eax, 8 ; Shift to "AH position" if needed
REX Prefix Impact
; The high-byte registers (AH, BH, CH, DH) cannot be used
; when any REX prefix is present (which is required for R8-R15)
mov ah, 5 ; OK: no REX needed
mov r8b, 5 ; OK: uses REX prefix
mov ah, r8b ; ERROR: Can't encode AH with REX prefix!
; New low-byte registers (SIL, DIL, BPL, SPL) require REX
mov sil, 5 ; OK: REX prefix generated automatically
Exercise: Predict the Output
xor rax, rax ; RAX = ?
mov eax, 0xDEADBEEF ; RAX = ?
mov ax, 0x1234 ; RAX = ?
mov al, 0xFF ; RAX = ?
; Final RAX value?
Answer: 0x00000000_DEAD12FF — EAX write zeroed upper bits, AX write preserved upper 48 bits, AL write preserved upper 56 bits.
Index & Pointer Registers
These registers have hardware-supported roles in memory addressing and stack operations.
RSP — Stack Pointer
; RSP always points to the TOP of the stack (last pushed value)
; Stack grows DOWNWARD on x86!
push rax ; RSP -= 8, then [RSP] = RAX
pop rbx ; RBX = [RSP], then RSP += 8
; Direct stack manipulation:
sub rsp, 32 ; Reserve 32 bytes on stack
mov [rsp+8], rdi ; Store value in reserved space
add rsp, 32 ; Release reserved space
; CRITICAL: RSP must be 16-byte aligned at every CALL on x86-64!
; Both ABIs require it. A misaligned stack typically crashes in a callee
; that uses aligned SSE loads/stores (e.g. movaps) on stack memory.
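To make the push/pop arithmetic and the alignment rule concrete, here is a small Python sketch of a downward-growing stack. It is a model only: StackModel, its method names, and the example addresses are hypothetical and chosen for illustration.

```python
class StackModel:
    """Toy model of the x86-64 stack: 8-byte slots, growing downward."""

    def __init__(self, rsp):
        self.rsp = rsp      # current stack pointer
        self.mem = {}       # address -> 64-bit value

    def push(self, value):
        # push: RSP -= 8, then [RSP] = value
        self.rsp -= 8
        self.mem[self.rsp] = value

    def pop(self):
        # pop: value = [RSP], then RSP += 8
        value = self.mem[self.rsp]
        self.rsp += 8
        return value

    def is_call_aligned(self):
        # The ABI wants RSP to be a multiple of 16 when CALL executes.
        return self.rsp % 16 == 0


stack = StackModel(0x7FFFFFFFE000)   # assumed: aligned just before a CALL
stack.push(0x401000)                 # CALL pushes the return address
print(stack.is_call_aligned())       # False: RSP is now 8 mod 16
stack.push(0)                        # push rbp in the callee's prologue
print(stack.is_call_aligned())       # True: aligned again
```

This shows why a function that pushes an even number of 8-byte values after entry must adjust RSP by another 8 bytes before calling anything else.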
RBP — Base Pointer (Frame Pointer)
; Traditional stack frame setup
my_function:
push rbp ; Save caller's frame pointer
mov rbp, rsp ; Establish our frame
sub rsp, 32 ; Local variables
; Access locals via RBP (constant offset throughout function)
mov [rbp-8], rdi ; First local variable
mov [rbp-16], rsi ; Second local variable
; Access parameters (after return address and saved RBP)
; Stack args (if any) at [rbp+16], [rbp+24], ...
leave ; Equivalent to: mov rsp, rbp; pop rbp
ret
; Frame pointer can be omitted (-fomit-frame-pointer) for more registers
; But debugging becomes harder
RSI & RDI — Source & Destination Index
; Originally for string operations (auto-increment/decrement)
mov rsi, source_buffer
mov rdi, dest_buffer
mov rcx, 100 ; Count
cld ; Clear direction flag (forward)
rep movsb ; Copy RCX bytes from [RSI] to [RDI]
; Also used as first two arguments in System V AMD64 ABI:
; my_func(arg1, arg2) → RDI=arg1, RSI=arg2
Calling Conventions (integer and pointer arguments):
- System V AMD64 (Linux, macOS): RDI, RSI, RDX, RCX, R8, R9
- Microsoft x64 (Windows): RCX, RDX, R8, R9
- Return value: RAX (System V also uses RDX:RAX for 128-bit integer returns)
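As a quick way to see where the two conventions diverge, the sketch below maps integer or pointer argument positions to registers. It is a simplified model: the function name arg_locations and the "stack[i]" labels are hypothetical, floating-point arguments are ignored, and no actual stack offsets are implied.

```python
# Integer/pointer argument registers, in order, per calling convention.
INT_ARG_REGS = {
    "sysv": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
    "ms": ["rcx", "rdx", "r8", "r9"],
}

def arg_locations(n_args, abi):
    """Return where each of n_args integer arguments is passed."""
    regs = INT_ARG_REGS[abi]
    locations = []
    for i in range(n_args):
        if i < len(regs):
            locations.append(regs[i])
        else:
            # Remaining arguments go on the stack; the index is just a
            # label for "n-th stack argument", not a byte offset.
            locations.append(f"stack[{i - len(regs)}]")
    return locations

print(arg_locations(5, "sysv"))  # ['rdi', 'rsi', 'rdx', 'rcx', 'r8']
print(arg_locations(5, "ms"))    # ['rcx', 'rdx', 'r8', 'r9', 'stack[0]']
```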
Segment Registers
Legacy from segmented memory days, but still relevant for special purposes in 64-bit mode.
Segment Register Overview
| Register | Name | 64-bit Mode Use |
| CS | Code Segment | Required for privilege level (ring), not for addressing |
| DS, ES, SS | Data, Extra, Stack | Ignored (treated as base 0) |
| FS | Additional data segment | Thread-local storage in Linux user space (TEB on 32-bit Windows) |
| GS | Additional data segment | TEB on 64-bit Windows, kernel per-CPU data on Linux |
Using FS and GS for Thread-Local Storage
; Linux x86-64 (user space): FS points to the thread-local block
mov rax, [fs:0x28] ; Read stack canary (security)
; Windows x64: GS points to the Thread Environment Block (TEB)
mov rax, [gs:0x60] ; Get PEB (Process Environment Block) pointer
mov rax, [gs:0x30] ; TEB self-pointer (linear address of the TEB)
; 32-bit Windows used FS instead: [fs:0x30] = PEB, [fs:0x00] = SEH chain
; Kernel mode: GS often holds per-CPU data pointer
mov rax, [gs:0x00] ; Per-CPU structure base
Setting Up Segment Base (Kernel/System Code)
; MSR-based segment base (no GDT entry needed in 64-bit)
; FS base: MSR 0xC0000100
; GS base: MSR 0xC0000101
; Kernel GS base: MSR 0xC0000102 (exchanged with GS base by SWAPGS)
; Write to FS base:
mov ecx, 0xC0000100 ; FS.base MSR
mov rax, tls_area ; Full 64-bit address; EAX = low 32 bits
mov rdx, rax
shr rdx, 32 ; EDX = high 32 bits of the address
wrmsr ; Write EDX:EAX to the MSR (Ring 0 only!)
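Security Note: The swapgs instruction (used at syscall and interrupt entry) exchanges the current GS base with the value in the kernel GS base MSR (0xC0000102). The kernel gets its per-CPU pointer on entry and restores the user value on exit, so user mode never runs with the kernel's GS base. This is critical for security.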
Control Registers
Control registers configure CPU operating modes and memory management. Access requires Ring 0 (kernel) privilege.
CR0 — System Control
CR0 Layout (bits not listed are reserved):
Bit 31  PG  Paging Enable
Bit 30  CD  Cache Disable
Bit 29  NW  Not Write-through
Bit 18  AM  Alignment Mask
Bit 16  WP  Write Protect (Ring 0 can't write to R/O pages)
Bit 5   NE  Numeric Error
Bit 4   ET  Extension Type
Bit 3   TS  Task Switched
Bit 2   EM  Emulation
Bit 1   MP  Monitor Coprocessor
Bit 0   PE  Protection Enable
; Enable Protected Mode (from real mode bootloader)
mov eax, cr0
or eax, 1 ; Set PE bit
mov cr0, eax
; Enable Paging (already in protected mode)
mov eax, cr0
or eax, (1 << 31) ; Set PG bit
mov cr0, eax
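When debugging early boot code it helps to turn a raw CR0 value into flag names. The Python sketch below does that using the bit positions listed in this section. The function name decode_cr0 and the sample values are illustrative assumptions, not values read from a real machine.

```python
# Bit position -> flag name, as given in the CR0 layout above.
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}

def decode_cr0(value):
    """Return the names of the CR0 flags set in value, lowest bit first."""
    return [name for bit, name in sorted(CR0_BITS.items())
            if value & (1 << bit)]

# Protected mode with paging: PE (bit 0) and PG (bit 31) set.
print(decode_cr0(0x80000001))   # ['PE', 'PG']

# The two OR operations from the assembly above, applied to a zero value.
cr0 = 0
cr0 |= 1            # or eax, 1         (PE)
cr0 |= 1 << 31      # or eax, (1 << 31) (PG)
print(hex(cr0))     # 0x80000001
```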
CR2 — Page Fault Address
; CR2 contains the address that caused the most recent page fault
; Used in page fault handlers:
page_fault_handler:
mov rax, cr2 ; Get faulting address
; ... determine if it's valid, map the page, etc.
add rsp, 8 ; Discard the error code the CPU pushed for #PF
iretq
CR3 — Page Table Base
; CR3 holds the physical address of the top-level page table
; PML4 in 64-bit mode, Page Directory in 32-bit
mov rax, new_page_table_phys ; Physical address of PML4
mov cr3, rax ; Switch address space
; Note: Writing to CR3 flushes all non-global TLB (Translation Lookaside Buffer) entries
; Use INVLPG for selective TLB invalidation:
invlpg [address] ; Invalidate TLB entry for specific address
CR4 — Extended Features
; CR4 enables various CPU extensions
mov rax, cr4
or rax, (1 << 5) ; PAE: Physical Address Extension
or rax, (1 << 7) ; PGE: Page Global Enable
or rax, (1 << 9) ; OSFXSR: FXSAVE/FXRSTOR support
or rax, (1 << 10) ; OSXMMEXCPT: SIMD floating-point exceptions
mov cr4, rax
Ring 0 Only: Control registers can only be accessed from kernel mode. Attempting to read/write CRx from user mode triggers a General Protection Fault (#GP).
Flags Register (EFLAGS/RFLAGS)
The flags register tracks arithmetic results and controls CPU behavior. Understanding flags is essential for conditional branching.
RFLAGS Layout
Bit:   63-22     21   20   19   18   17   16   15-12   11   10   9    8    7    6    4    2    0
Flag:  Reserved  ID   VIP  VIF  AC   VM   RF   ...     OF   DF   IF   TF   SF   ZF   AF   PF   CF
Arithmetic Status Flags
| Flag | Bit | Name | Set When | Use Case |
| CF | 0 | Carry Flag | Unsigned overflow/borrow | jc, jnc, adc, sbb |
| ZF | 6 | Zero Flag | Result is zero | je/jz, jne/jnz |
| SF | 7 | Sign Flag | Result is negative (MSB=1) | js, jns |
| OF | 11 | Overflow Flag | Signed overflow | jo, jno |
| PF | 2 | Parity Flag | Low byte has an even number of 1 bits | Legacy, error checking |
| AF | 4 | Auxiliary Flag | Carry from bit 3 to 4 | BCD arithmetic |
Control Flags
; Direction Flag (DF) - controls string operation direction
cld ; Clear DF: strings go forward (SI++, DI++)
std ; Set DF: strings go backward (SI--, DI--)
; Interrupt Flag (IF) - enable/disable hardware interrupts
sti ; Enable interrupts (Ring 0 only)
cli ; Disable interrupts (Ring 0 only)
; Trap Flag (TF) - single-step debugging
; When set, the CPU raises a debug exception (#DB, vector 1) after each instruction
Flag Operations
; Saving and restoring flags
pushfq ; Push RFLAGS onto stack
popfq ; Pop stack into RFLAGS
; Read flags into AH (low 8 bits only)
lahf ; AH = SF:ZF:0:AF:0:PF:1:CF
sahf ; Restore those flags from AH
; Directly manipulating carry flag
stc ; Set CF = 1
clc ; Clear CF = 0
cmc ; Complement (toggle) CF
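The AH layout produced by lahf is easy to get wrong by one bit, so here is a minimal Python sketch that packs the five flags the same way. The function name lahf_byte is hypothetical, and each argument is expected to be 0 or 1.

```python
def lahf_byte(sf, zf, af, pf, cf):
    """Pack flags into the AH layout SF:ZF:0:AF:0:PF:1:CF (bit 7 down to bit 0)."""
    return (
        (sf << 7)
        | (zf << 6)
        | (af << 4)
        | (pf << 2)
        | (1 << 1)    # bit 1 of the flags register always reads as 1
        | cf
    )

# Flags after 'mov al, 0xFF' / 'add al, 1': result 0, carry out of bits 3 and 7.
print(hex(lahf_byte(sf=0, zf=1, af=1, pf=1, cf=1)))   # 0x57
```

Bits 5 and 3 of the result are always 0 and bit 1 is always 1, so a byte such as 0x02 means "no status flags set".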
Exercise: Understanding Flags
; What flags are set after each operation?
mov al, 0xFF
add al, 1 ; AL=?, CF=?, ZF=?, OF=?, SF=?
mov al, 127
add al, 1 ; AL=?, CF=?, ZF=?, OF=?, SF=?
mov al, 0
sub al, 1 ; AL=?, CF=?, ZF=?, OF=?, SF=?
Answers:
1. AL=0, CF=1 (carry out), ZF=1, OF=0, SF=0
2. AL=128 (-128), CF=0, ZF=0, OF=1 (signed overflow!), SF=1
3. AL=255 (-1), CF=1 (borrow), ZF=0, OF=0, SF=1
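The answers above can be reproduced with a short model of 8-bit addition and subtraction. This is a sketch under simple assumptions: the names add8 and sub8 are made up here, only CF, ZF, SF, and OF are computed, and the overflow test uses the usual sign-bit comparison.

```python
def add8(a, b):
    """8-bit ADD: return (result, flags) for a + b."""
    a &= 0xFF
    b &= 0xFF
    full = a + b
    result = full & 0xFF
    flags = {
        "CF": int(full > 0xFF),                  # carry out of bit 7
        "ZF": int(result == 0),
        "SF": result >> 7,                       # copy of the result's MSB
        # Signed overflow: both operands differ in sign from the result.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags

def sub8(a, b):
    """8-bit SUB: return (result, flags) for a - b."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF
    flags = {
        "CF": int(a < b),                        # borrow needed
        "ZF": int(result == 0),
        "SF": result >> 7,
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the minuend's.
        "OF": int(((a ^ b) & (a ^ result) & 0x80) != 0),
    }
    return result, flags

for label, (result, flags) in [
    ("0xFF + 1", add8(0xFF, 1)),
    ("127 + 1", add8(127, 1)),
    ("0 - 1", sub8(0, 1)),
]:
    print(label, "-> AL =", result, flags)
```

Note that CF and OF answer different questions about the same bits: CF is the unsigned view, OF the signed view.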
Debug & MSR Registers
Hardware debugging and CPU configuration registers for system-level development.
Debug Registers (DR0-DR7)
DR0-DR3: Hardware breakpoint addresses (up to 4 breakpoints)
DR4-DR5: Reserved (alias DR6-DR7 when CR4.DE=0; accessing them raises #UD when CR4.DE=1)
DR6: Debug Status - which breakpoint triggered
DR7: Debug Control - enable/configure breakpoints
DR7 Breakpoint Types:
00 = Execute (instruction fetch)
01 = Write only (data)
10 = I/O read/write (if CR4.DE=1)
11 = Read/Write (data)
Setting Hardware Breakpoints
; Set a hardware breakpoint on memory write (Ring 0 only)
mov rax, target_address ; Address to watch
mov dr0, rax ; Load into DR0
; Configure DR7: enable DR0, write-only, 4-byte size
; Bit 0 = L0 (local enable), bit 1 = G0 (global enable) for DR0
; Bits [17:16] = R/W0 = condition (01 = write)
; Bits [19:18] = LEN0 = size (11 = 4 bytes)
mov rax, 0x000D0001 ; L0=1, R/W0=01, LEN0=11
mov dr7, rax
; When target_address is written, a debug exception (#DB, vector 1) fires
; DR6 will indicate which breakpoint triggered
GDB Uses These: When you set a hardware watchpoint in GDB (watch variable), it programs the debug registers. Software breakpoints (break) use INT 3 (opcode 0xCC) instead.
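The DR7 constant 0x000D0001 used above can be derived field by field rather than memorized. The following Python sketch builds a DR7 value for one breakpoint slot. The function name dr7_value and its parameters are illustrative, and it covers only the enable, R/W, and LEN fields described in this section.

```python
def dr7_value(index, rw, length, local=True, global_=False):
    """Build the DR7 bits for breakpoint slot `index` (0-3).

    rw:     2-bit condition (0b00 execute, 0b01 write, 0b11 read/write)
    length: 2-bit size code (0b00 = 1 byte, 0b01 = 2, 0b11 = 4, 0b10 = 8)
    """
    if not 0 <= index <= 3:
        raise ValueError("only DR0-DR3 hold breakpoint addresses")
    value = 0
    if local:
        value |= 1 << (2 * index)         # Ln enable bit
    if global_:
        value |= 1 << (2 * index + 1)     # Gn enable bit
    value |= (rw & 0b11) << (16 + 4 * index)      # R/Wn field
    value |= (length & 0b11) << (18 + 4 * index)  # LENn field
    return value

# DR0, break on write, 4 bytes wide, locally enabled.
print(hex(dr7_value(0, rw=0b01, length=0b11)))   # 0xd0001
```

Values for several slots can be OR-ed together, since each slot uses its own enable bits and its own 4-bit R/W and LEN group.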
Model-Specific Registers (MSRs)
MSRs are CPU-specific configuration registers accessed via RDMSR/WRMSR:
; Read MSR (Ring 0 only)
; ECX = MSR number, result in EDX:EAX
mov ecx, 0x10 ; IA32_TIME_STAMP_COUNTER
rdmsr ; EDX:EAX = TSC value
; Write MSR
; ECX = MSR number, EDX:EAX = value to write
mov ecx, 0xC0000080 ; IA32_EFER (Extended Feature Enable)
rdmsr
or eax, (1 << 8) ; Set LME (Long Mode Enable)
wrmsr
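Because RDMSR and WRMSR move a 64-bit value through the EDX:EAX pair, the split and recombine steps are worth spelling out. This is a minimal Python sketch with hypothetical helper names and an assumed starting EFER value chosen only for the example. It models the arithmetic, not the privileged instructions themselves.

```python
def split_edx_eax(value):
    """Split a 64-bit value into (edx, eax) as WRMSR expects."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF

def combine_edx_eax(edx, eax):
    """Rebuild the 64-bit value that RDMSR returned in EDX:EAX."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)

EFER_LME = 1 << 8   # Long Mode Enable, bit 8 of IA32_EFER

# Read-modify-write, mirroring the rdmsr / or / wrmsr sequence above.
edx, eax = split_edx_eax(0x0000000000000001)   # assumed current EFER value
eax |= EFER_LME                                # or eax, (1 << 8)
print(hex(combine_edx_eax(edx, eax)))          # 0x101
```

LME sits in the low half, which is why the assembly only needs to modify EAX before writing the pair back.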
Common MSRs
| MSR Number | Name | Purpose |
| 0x10 | IA32_TIME_STAMP_COUNTER | CPU cycle counter (also accessible via RDTSC) |
| 0xC0000080 | IA32_EFER | Long mode enable, NX bit enable |
| 0xC0000081 | IA32_STAR | SYSCALL/SYSRET segment selectors |
| 0xC0000082 | IA32_LSTAR | SYSCALL entry point (64-bit) |
| 0xC0000100 | IA32_FS_BASE | FS segment base address |
| 0xC0000101 | IA32_GS_BASE | GS segment base address |
User-Space: RDTSC
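; RDTSC reads the time-stamp counter MSR and is normally allowed in user mode (Ring 3)
; (the OS can restrict it to Ring 0 by setting CR4.TSD)
; Returns 64-bit timestamp in EDX:EAX
rdtsc ; EDX:EAX = timestamp
shl rdx, 32 ; Move EDX to upper 32 bits
or rax, rdx ; Combine into RAX
; Or use RDTSCP (waits for earlier instructions to execute, but is not fully serializing)
rdtscp ; EDX:EAX = timestamp, ECX = IA32_TSC_AUX (typically a processor ID set by the OS)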