x86 Assembly Series Part 3: Registers – Complete Deep Dive

February 6, 2026 Wasil Zafar 35 min read

Master all x86/x64 registers including general-purpose registers (RAX-R15), segment registers, control registers (CR0-CR4), the flags register, debug registers, and model-specific registers (MSRs).

Table of Contents

  1. General-Purpose Registers
  2. Index & Pointer Registers
  3. Segment Registers
  4. Control Registers
  5. Flags Register
  6. Debug & MSR Registers

General-Purpose Registers

Core Concept: General-purpose registers are the CPU's working memory. x86-64 provides 16 64-bit general-purpose registers (RAX-R15) that can be accessed in different sizes for backward compatibility.

The RAX Family (Accumulator)

Register

RAX / EAX / AX / AH / AL

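; Bits:  |63 ............ 32|31 ........ 16|15 .... 8|7 ..... 0|
; RAX:   |<----------------------- RAX ----------------------->|
; EAX:                      |<-------------- EAX ------------->|
; AX:                                      |<------ AX ------->|
; AH/AL:                                   |   AH    |   AL    |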

mov rax, 0x123456789ABCDEF0  ; Full 64-bit
mov eax, 0x12345678          ; Lower 32-bit (zeros upper 32)
mov ax, 0x1234               ; Lower 16-bit
mov ah, 0x12                 ; High byte of AX
mov al, 0x34                 ; Low byte of AX

Complete General-Purpose Set

Reference

x86-64 Register Map

64-bit   32-bit    16-bit    8-bit High  8-bit Low  Traditional Use
RAX      EAX       AX        AH          AL         Accumulator
RBX      EBX       BX        BH          BL         Base
RCX      ECX       CX        CH          CL         Counter
RDX      EDX       DX        DH          DL         Data
RSI      ESI       SI        -           SIL        Source Index
RDI      EDI       DI        -           DIL        Destination Index
RBP      EBP       BP        -           BPL        Base Pointer
RSP      ESP       SP        -           SPL        Stack Pointer
R8-R15   R8D-R15D  R8W-R15W  -           R8B-R15B   x64 Extended

Sub-Register Access Gotchas

Understanding how partial register writes behave is crucial to avoid subtle bugs:

The Zero-Extension Rule (64-bit mode)

; CRITICAL RULE: Writing to 32-bit register ZEROS the upper 32 bits!
mov rax, 0xFFFFFFFF_FFFFFFFF  ; RAX = full 64-bit value
mov eax, 0x12345678           ; RAX = 0x00000000_12345678 (!)

; But 8-bit and 16-bit writes DO NOT zero-extend:
mov rax, 0xFFFFFFFF_FFFFFFFF  ; RAX = full 64-bit value
mov ax, 0x1234                ; RAX = 0xFFFFFFFF_FFFF1234
mov al, 0x56                  ; RAX = 0xFFFFFFFF_FFFF1256
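Common Bug Source: Forgetting that mov eax, val clears the upper 32 bits of RAX. This is by design (it removes the dependency on the old upper half, which avoids partial register stalls) but it catches beginners. To set all 64 bits, use mov rax, val; to widen a signed 32-bit value, sign-extend explicitly with movsxd rax, eax.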
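The merge rules above can be checked with a small model. This is a minimal Python sketch, not assembler output: write_reg and MASK64 are hypothetical names introduced here, and the function only mimics how a MOV into RAX, EAX, AX, AL, or AH changes the 64-bit value.

```python
MASK64 = (1 << 64) - 1


def write_reg(rax: int, value: int, width: int, high_byte: bool = False) -> int:
    """Model a MOV into RAX (64), EAX (32), AX (16), or AL/AH (8)."""
    if width == 64:
        return value & MASK64
    if width == 32:
        # 32-bit destination: the result is zero-extended, the old upper half is lost
        return value & 0xFFFFFFFF
    if width == 16:
        # 16-bit destination: upper 48 bits are preserved
        return (rax & ~0xFFFF & MASK64) | (value & 0xFFFF)
    if width == 8:
        # AL replaces bits 7..0, AH replaces bits 15..8; everything else is preserved
        shift = 8 if high_byte else 0
        return (rax & ~(0xFF << shift) & MASK64) | ((value & 0xFF) << shift)
    raise ValueError("width must be 8, 16, 32 or 64")


if __name__ == "__main__":
    rax = write_reg(0, 0xFFFFFFFF_FFFFFFFF, 64)
    rax = write_reg(rax, 0x1234, 16)      # mov ax, 0x1234
    print(f"{rax:#018x}")                 # 0xffffffffffff1234
    rax = write_reg(rax, 0x12345678, 32)  # mov eax, 0x12345678
    print(f"{rax:#018x}")                 # 0x0000000012345678
```

Only the 32-bit case discards the old contents; every narrower write is a masked merge.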

Performance: Partial Register Stalls

; This code may stall on older CPUs:
mov rax, 0
mov ah, 1          ; Write to partial register (AH)
mov rbx, rax       ; Read full register - possible stall!

; Better: Avoid AH/BH/CH/DH in 64-bit code
movzx eax, byte [value]  ; Zero-extend to full register
shl eax, 8               ; Shift to "AH position" if needed

REX Prefix Impact

; The high-byte registers (AH, BH, CH, DH) cannot be used
; when any REX prefix is present (which is required for R8-R15)

mov ah, 5           ; OK: no REX needed
mov r8b, 5          ; OK: uses REX prefix
mov ah, r8b         ; ERROR: Can't encode AH with REX prefix!

; New low-byte registers (SIL, DIL, BPL, SPL) require REX
mov sil, 5          ; OK: REX prefix generated automatically
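
The encoding restriction can be expressed as a simple set check. The sketch below is a simplified Python model covering only 8-bit register operands; encodable_8bit, HIGH_BYTE, and NEEDS_REX are hypothetical names, and a real assembler checks many more conditions.

```python
# 8-bit registers that are only reachable WITHOUT a REX prefix
HIGH_BYTE = {"ah", "bh", "ch", "dh"}

# 8-bit registers that REQUIRE a REX prefix
NEEDS_REX = {"sil", "dil", "bpl", "spl"} | {f"r{n}b" for n in range(8, 16)}


def encodable_8bit(*operands: str) -> bool:
    """Return False when an instruction mixes AH/BH/CH/DH with a REX-only register."""
    regs = {op.lower() for op in operands}
    return not (regs & HIGH_BYTE and regs & NEEDS_REX)


if __name__ == "__main__":
    print(encodable_8bit("ah", "al"))   # True
    print(encodable_8bit("ah", "r8b"))  # False
```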

Exercise: Predict the Output

xor rax, rax           ; RAX = ?
mov eax, 0xDEADBEEF    ; RAX = ?
mov ax, 0x1234         ; RAX = ?
mov al, 0xFF           ; RAX = ?
; Final RAX value?

Answer: 0x00000000_DEAD12FF — EAX write zeroed upper bits, AX write preserved upper 48 bits, AL write preserved upper 56 bits.

Index & Pointer Registers

These registers have hardware-supported roles in memory addressing and stack operations.

Index and pointer registers — RSP (stack pointer), RBP (base pointer), RSI (source index), and RDI (destination index) and their roles in stack frames and memory addressing

RSP — Stack Pointer

; RSP always points to the TOP of the stack (last pushed value)
; Stack grows DOWNWARD on x86!

push rax            ; RSP -= 8, then [RSP] = RAX
pop rbx             ; RBX = [RSP], then RSP += 8

; Direct stack manipulation:
sub rsp, 32         ; Reserve 32 bytes on stack
mov [rsp+8], rdi    ; Store value in reserved space
add rsp, 32         ; Release reserved space

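; CRITICAL: RSP must be 16-byte aligned at the point of every CALL on x86-64!
; Both the System V and Microsoft x64 ABIs require it. Callees may use aligned
; SSE accesses (e.g. movaps) on stack data, and those fault if RSP is misaligned.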
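
The alignment rule is easy to check with arithmetic. Because CALL pushes an 8-byte return address, RSP is 8 bytes off a 16-byte boundary on function entry; one push rbp restores alignment, so the local area should then be a multiple of 16. The Python sketch below only models that arithmetic; round_up_16 and rsp_before_call are hypothetical helper names, and the entry address is made up.

```python
def round_up_16(n: int) -> int:
    """Round a byte count up to the next multiple of 16."""
    return (n + 15) & ~15


def rsp_before_call(rsp_at_entry: int, pushes: int, local_bytes: int) -> int:
    """RSP after `pushes` 8-byte pushes followed by `sub rsp, local_bytes`."""
    return rsp_at_entry - 8 * pushes - local_bytes


if __name__ == "__main__":
    entry = 0x7FFF_FFFF_E008                       # made-up RSP on entry (8 mod 16)
    print(rsp_before_call(entry, 1, 24) % 16)      # 8  -> misaligned for a CALL
    print(rsp_before_call(entry, 1, round_up_16(24)) % 16)  # 0  -> aligned
```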

RBP — Base Pointer (Frame Pointer)

; Traditional stack frame setup
my_function:
    push rbp            ; Save caller's frame pointer
    mov rbp, rsp        ; Establish our frame
    sub rsp, 32         ; Local variables
    
    ; Access locals via RBP (constant offset throughout function)
    mov [rbp-8], rdi    ; First local variable
    mov [rbp-16], rsi   ; Second local variable
    
    ; Access parameters (after return address and saved RBP)
    ; Stack args (if any) at [rbp+16], [rbp+24], ...
    
    leave               ; Equivalent to: mov rsp, rbp; pop rbp
    ret

; Frame pointer can be omitted (-fomit-frame-pointer) for more registers
; But debugging becomes harder

RSI & RDI — Source & Destination Index

; Originally for string operations (auto-increment/decrement)
mov rsi, source_buffer
mov rdi, dest_buffer
mov rcx, 100            ; Count
cld                     ; Clear direction flag (forward)
rep movsb               ; Copy RCX bytes from [RSI] to [RDI]

; Also used as first two arguments in System V AMD64 ABI:
; my_func(arg1, arg2) → RDI=arg1, RSI=arg2
Calling Conventions:
  • System V AMD64 (Linux, macOS): RDI, RSI, RDX, RCX, R8, R9
  • Microsoft x64 (Windows): RCX, RDX, R8, R9
  • Return value: RAX (and RDX for 128-bit returns)
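
As a quick reference, the two register orders can be written as a lookup table. This Python sketch covers integer and pointer arguments only; INT_ARG_REGS and int_arg_location are hypothetical names, and floating-point arguments (passed in XMM registers) are left out.

```python
INT_ARG_REGS = {
    "sysv": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],  # Linux, macOS
    "win64": ["rcx", "rdx", "r8", "r9"],               # Windows
}


def int_arg_location(abi: str, index: int) -> str:
    """Where the index-th (0-based) integer argument is passed."""
    regs = INT_ARG_REGS[abi]
    return regs[index] if index < len(regs) else "stack"


if __name__ == "__main__":
    print(int_arg_location("sysv", 0))   # rdi
    print(int_arg_location("win64", 4))  # stack
```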

Segment Registers

Legacy from segmented memory days, but still relevant for special purposes in 64-bit mode.

x86 segment registers — CS for code privilege level, FS/GS for thread-local storage (TLS), and legacy DS/ES/SS registers in 64-bit flat memory model

Segment Register Overview

Register    Name                64-bit Mode Use
CS          Code Segment        Holds the current privilege level (ring), not used for addressing
DS, ES, SS  Data, Extra, Stack  Base ignored (treated as base 0)
FS          Extra Segment       TLS on Linux x86-64 (user space); TEB on 32-bit Windows
GS          Extra Segment       TEB on 64-bit Windows; kernel per-CPU data on Linux

Using FS and GS for Thread-Local Storage

; Linux x86-64 (user space): FS points to the thread-local block
mov rax, [fs:0x28]       ; Read stack canary (security)

; Windows x64: GS points to the Thread Environment Block (TEB)
mov rax, [gs:0x30]       ; TEB self-pointer
mov rax, [gs:0x60]       ; Get PEB (Process Environment Block) pointer

; Windows x86 (32-bit): FS points to the TEB
mov eax, [fs:0x30]       ; Get PEB pointer
mov eax, [fs:0x00]       ; Current SEH chain

; Kernel mode: GS often holds per-CPU data pointer
mov rax, [gs:0x00]       ; Per-CPU structure base

Setting Up Segment Base (Kernel/System Code)

; MSR-based segment base (no GDT entry needed in 64-bit)
; FS base: MSR 0xC0000100
; GS base: MSR 0xC0000101
; Kernel GS base: MSR 0xC0000102 (exchanged with GS base by SWAPGS)

; Write to FS base:
mov rax, tls_area        ; Full 64-bit address; WRMSR takes the low half from EAX
mov rdx, rax
shr rdx, 32              ; EDX = high 32 bits of the address
mov ecx, 0xC0000100      ; FS.base MSR
wrmsr                    ; Write EDX:EAX to the MSR (Ring 0 only!)
Security Note: The swapgs instruction (used in syscall handlers) atomically swaps GS base with the kernel's GS base. This prevents user mode from seeing kernel per-CPU data, critical for security.

Control Registers

Control registers configure CPU operating modes and memory management. Access requires Ring 0 (kernel) privilege.

x86 control registers — CR0 (protection/paging enable), CR2 (page fault address), CR3 (page table base), and CR4 (extended CPU features)

CR0 — System Control

CR0 Layout (commonly used bits):

Bit  Flag  Meaning
31   PG    Paging Enable
30   CD    Cache Disable
29   NW    Not Write-through
18   AM    Alignment Mask
16   WP    Write Protect (Ring 0 can't write to R/O pages)
5    NE    Numeric Error
4    ET    Extension Type
3    TS    Task Switched
2    EM    Emulation
1    MP    Monitor Coprocessor
0    PE    Protection Enable

; Enable Protected Mode (from real mode bootloader)
mov eax, cr0
or eax, 1           ; Set PE bit
mov cr0, eax

; Enable Paging (already in protected mode)
mov eax, cr0
or eax, (1 << 31)   ; Set PG bit
mov cr0, eax
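
To see which bits a given CR0 value has set, the bit positions can be put in a table and decoded. The Python sketch below is only an illustration; CR0_BITS, decode_cr0, and set_cr0_flags are hypothetical names, and the value 0x10 is just a sample starting point.

```python
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}


def decode_cr0(value: int) -> list:
    """Names of the CR0 flags set in `value`, lowest bit first."""
    return [name for bit, name in sorted(CR0_BITS.items()) if value >> bit & 1]


def set_cr0_flags(value: int, *names: str) -> int:
    """Return `value` with the named flags set (the OR step of the snippets above)."""
    by_name = {name: bit for bit, name in CR0_BITS.items()}
    for name in names:
        value |= 1 << by_name[name]
    return value


if __name__ == "__main__":
    cr0 = set_cr0_flags(0x10, "PE", "PG")
    print(hex(cr0), decode_cr0(cr0))  # 0x80000011 ['PE', 'ET', 'PG']
```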

CR2 — Page Fault Address

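; CR2 contains the address that caused the most recent page fault
; Used in page fault handlers:
page_fault_handler:
    push rax         ; Preserve the interrupted code's RAX
    mov rax, cr2     ; Get faulting address
    ; ... determine if it's valid, map the page, etc.
    pop rax
    add rsp, 8       ; Discard the error code the CPU pushed for #PF
    iretq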

CR3 — Page Table Base

; CR3 holds the physical address of the top-level page table
; PML4 in 64-bit mode, Page Directory in 32-bit

mov rax, new_page_table_phys  ; Physical address of PML4 (4 KiB aligned)
mov cr3, rax                   ; Switch address space

; Note: Writing to CR3 flushes the non-global TLB (Translation Lookaside Buffer)
; entries (when PCIDs are not in use). Use INVLPG for selective TLB invalidation:
invlpg [address]               ; Invalidate TLB entry for specific address

CR4 — Extended Features

; CR4 enables various CPU extensions
mov rax, cr4
or rax, (1 << 5)    ; PAE: Physical Address Extension
or rax, (1 << 7)    ; PGE: Page Global Enable
or rax, (1 << 9)    ; OSFXSR: FXSAVE/FXRSTOR support
or rax, (1 << 10)   ; OSXMMEXCPT: SIMD floating-point exceptions
mov cr4, rax
Ring 0 Only: Control registers can only be accessed from kernel mode. Attempting to read/write CRx from user mode triggers a General Protection Fault (#GP).

Flags Register (EFLAGS/RFLAGS)

The flags register tracks arithmetic results and controls CPU behavior. Understanding flags is essential for conditional branching.

RFLAGS register bit layout — arithmetic status flags (CF, ZF, SF, OF, PF, AF) and control flags (DF, IF, TF) used for conditional branching and CPU behavior

RFLAGS Layout

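Bit(s)  Flag  Name
63-22   -     Reserved
21      ID    ID Flag (CPUID available)
20      VIP   Virtual Interrupt Pending
19      VIF   Virtual Interrupt Flag
18      AC    Alignment Check
17      VM    Virtual-8086 Mode
16      RF    Resume Flag
14      NT    Nested Task
13-12   IOPL  I/O Privilege Level
11      OF    Overflow Flag
10      DF    Direction Flag
9       IF    Interrupt Enable Flag
8       TF    Trap Flag
7       SF    Sign Flag
6       ZF    Zero Flag
4       AF    Auxiliary Carry Flag
2       PF    Parity Flag
0       CF    Carry Flag

Bits 1, 3, 5, and 15 are reserved (bit 1 always reads as 1).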

Arithmetic Status Flags

Flag  Bit  Name            Set When                               Use Case
CF    0    Carry Flag      Unsigned overflow/borrow               jc, jnc, adc, sbb
ZF    6    Zero Flag       Result is zero                         je/jz, jne/jnz
SF    7    Sign Flag       Result is negative (MSB=1)             js, jns
OF    11   Overflow Flag   Signed overflow                        jo, jno
PF    2    Parity Flag     Low byte has an even number of 1 bits  Legacy, error checking
AF    4    Auxiliary Flag  Carry from bit 3 to 4                  BCD arithmetic

Control Flags

; Direction Flag (DF) - controls string operation direction
cld                 ; Clear DF: strings go forward (RSI++, RDI++)
std                 ; Set DF: strings go backward (RSI--, RDI--)

; Interrupt Flag (IF) - enable/disable maskable hardware interrupts
sti                 ; Enable interrupts (privileged: needs CPL <= IOPL, normally Ring 0)
cli                 ; Disable interrupts (same privilege requirement)

; Trap Flag (TF) - single-step debugging
; When set, CPU raises a debug exception (#DB, vector 1) after each instruction

Flag Operations

; Saving and restoring flags
pushfq              ; Push RFLAGS onto stack
popfq               ; Pop stack into RFLAGS

; Read flags into AH (low 8 bits only)
lahf                ; AH = SF:ZF:0:AF:0:PF:1:CF
sahf                ; Restore those flags from AH

; Directly manipulating carry flag
stc                 ; Set CF = 1
clc                 ; Clear CF = 0  
cmc                 ; Complement (toggle) CF
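
The AH layout used by LAHF and SAHF can be reproduced with shifts. This is a minimal Python sketch of the bit packing only; the function names lahf and sahf are borrowed from the instructions for readability and are not a real API.

```python
def lahf(cf: int = 0, pf: int = 0, af: int = 0, zf: int = 0, sf: int = 0) -> int:
    """Pack the five status flags into the byte LAHF stores in AH.

    Layout, bit 7 down to bit 0: SF ZF 0 AF 0 PF 1 CF
    """
    return (sf << 7) | (zf << 6) | (af << 4) | (pf << 2) | (1 << 1) | cf


def sahf(ah: int) -> dict:
    """Unpack AH back into the flags SAHF would restore."""
    return {
        "SF": ah >> 7 & 1,
        "ZF": ah >> 6 & 1,
        "AF": ah >> 4 & 1,
        "PF": ah >> 2 & 1,
        "CF": ah & 1,
    }


if __name__ == "__main__":
    print(hex(lahf(cf=1, zf=1)))  # 0x43
```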

Exercise: Understanding Flags

; What flags are set after each operation?
mov al, 0xFF
add al, 1           ; AL=?, CF=?, ZF=?, OF=?, SF=?

mov al, 127
add al, 1           ; AL=?, CF=?, ZF=?, OF=?, SF=?

mov al, 0
sub al, 1           ; AL=?, CF=?, ZF=?, OF=?, SF=?

Answers:
1. AL=0, CF=1 (carry out), ZF=1, OF=0, SF=0
2. AL=128 (-128), CF=0, ZF=0, OF=1 (signed overflow!), SF=1
3. AL=255 (-1), CF=1 (borrow), ZF=0, OF=0, SF=1
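
The answers can be reproduced with a small model of 8-bit ADD and SUB. The Python sketch below computes only CF, ZF, SF, and OF (PF and AF are omitted); add8 and sub8 are hypothetical helper names, not part of any emulator.

```python
def add8(a: int, b: int):
    """Model `add al, b`: return (result, flags) for 8-bit operands."""
    a &= 0xFF
    b &= 0xFF
    full = a + b
    result = full & 0xFF
    flags = {
        "CF": int(full > 0xFF),   # carry out of bit 7 (unsigned overflow)
        "ZF": int(result == 0),
        "SF": result >> 7,        # copy of the result's top bit
        # Signed overflow: both operands have a sign the result does not share
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


def sub8(a: int, b: int):
    """Model `sub al, b`: return (result, flags) for 8-bit operands."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF
    flags = {
        "CF": int(a < b),         # borrow (unsigned underflow)
        "ZF": int(result == 0),
        "SF": result >> 7,
        # Signed overflow: operands differ in sign and the result's sign differs from a
        "OF": int(((a ^ b) & (a ^ result) & 0x80) != 0),
    }
    return result, flags


if __name__ == "__main__":
    print(add8(0xFF, 1))  # (0, {'CF': 1, 'ZF': 1, 'SF': 0, 'OF': 0})
    print(add8(127, 1))   # (128, {'CF': 0, 'ZF': 0, 'SF': 1, 'OF': 1})
    print(sub8(0, 1))     # (255, {'CF': 1, 'ZF': 0, 'SF': 1, 'OF': 0})
```

CF reports the unsigned view of the result and OF the signed view, which is why the second case sets OF but not CF.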

Debug & MSR Registers

Hardware debugging and CPU configuration registers for system-level development.

Debug registers (DR0–DR7) for hardware breakpoints and Model-Specific Registers (MSRs) for CPU feature configuration and performance monitoring

Debug Registers (DR0-DR7)

DR0-DR3: Hardware breakpoint addresses (up to 4 breakpoints)
DR4-DR5: Reserved (alias DR6-DR7 when CR4.DE=0; accessing them raises #UD when CR4.DE=1)
DR6: Debug Status - which breakpoint triggered
DR7: Debug Control - enable/configure breakpoints

DR7 Breakpoint Types:
  00 = Execute (instruction fetch)
  01 = Write only (data)
  10 = I/O read/write (if CR4.DE=1)
  11 = Read/Write (data)

Setting Hardware Breakpoints

; Set a hardware breakpoint on memory write (Ring 0 only)
mov rax, target_address      ; Address to watch
mov dr0, rax                 ; Load into DR0

; Configure DR7: enable DR0, write-only, 4-byte size
; Bit 0 = L0 (local enable), bit 1 = G0 (global enable) for DR0
; Bits [17:16] = R/W0 = condition (01 = write)
; Bits [19:18] = LEN0 = size (11 = 4 bytes)
mov rax, 0x000D0001          ; L0=1, R/W0=01, LEN0=11
mov dr7, rax

; When target_address is written, a debug exception (#DB, vector 1) fires
; DR6 will indicate which breakpoint triggered
GDB Uses These: When you set a hardware watchpoint in GDB (watch variable), it programs the debug registers. Software breakpoints (break) use INT 3 (opcode 0xCC) instead.
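
The DR7 value 0x000D0001 used above can be derived from the field positions instead of being memorized. The Python sketch below assembles the enable, R/W, and LEN fields for one breakpoint slot; dr7_fields is a hypothetical helper name, and the other DR7 bits (such as LE/GE and GD) are ignored.

```python
def dr7_fields(index: int, rw: int, length: int,
               local: bool = True, global_: bool = False) -> int:
    """Build the DR7 bits for breakpoint slot `index` (0..3).

    rw:     00 = execute, 01 = write, 10 = I/O (needs CR4.DE=1), 11 = read/write
    length: 00 = 1 byte, 01 = 2 bytes, 10 = 8 bytes, 11 = 4 bytes
    """
    if not 0 <= index <= 3:
        raise ValueError("index must be 0..3")
    value = 0
    if local:
        value |= 1 << (2 * index)       # L0..L3 sit at bits 0, 2, 4, 6
    if global_:
        value |= 1 << (2 * index + 1)   # G0..G3 sit at bits 1, 3, 5, 7
    value |= (rw & 0b11) << (16 + 4 * index)      # R/Wn field
    value |= (length & 0b11) << (18 + 4 * index)  # LENn field
    return value


if __name__ == "__main__":
    print(hex(dr7_fields(0, rw=0b01, length=0b11)))  # 0xd0001
```

To arm several breakpoints, OR the results for each slot together before writing DR7.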

Model-Specific Registers (MSRs)

MSRs are CPU-specific configuration registers accessed via RDMSR/WRMSR:

; Read MSR (Ring 0 only)
; ECX = MSR number, result in EDX:EAX
mov ecx, 0x10               ; IA32_TIME_STAMP_COUNTER
rdmsr                        ; EDX:EAX = TSC value

; Write MSR  
; ECX = MSR number, EDX:EAX = value to write
mov ecx, 0xC0000080          ; IA32_EFER (Extended Feature Enable)
rdmsr
or eax, (1 << 8)             ; Set LME (Long Mode Enable)
wrmsr
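
RDMSR and WRMSR always move a 64-bit value as two 32-bit halves in EDX:EAX, so it helps to be comfortable splitting and joining them. The Python sketch below models just that arithmetic plus the read-modify-write of the LME bit; split_edx_eax, join_edx_eax, and set_msr_bits are hypothetical names, and the sample values are made up.

```python
from typing import Tuple

EFER_LME = 1 << 8  # Long Mode Enable, bit 8 of IA32_EFER


def split_edx_eax(value: int) -> Tuple[int, int]:
    """Split a 64-bit value into (EDX, EAX) as WRMSR expects."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


def join_edx_eax(edx: int, eax: int) -> int:
    """Combine the (EDX, EAX) pair returned by RDMSR or RDTSC into 64 bits."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)


def set_msr_bits(edx: int, eax: int, mask: int) -> Tuple[int, int]:
    """Read-modify-write step: OR `mask` into the value and split it again."""
    return split_edx_eax(join_edx_eax(edx, eax) | mask)


if __name__ == "__main__":
    edx, eax = split_edx_eax(0x0000_7F12_3456_7000)  # made-up 64-bit value
    print(hex(edx), hex(eax))                        # 0x7f12 0x34567000
    print(set_msr_bits(0, 0x1, EFER_LME))            # (0, 257)
```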

Common MSRs

MSR Number  Name                     Purpose
0x10        IA32_TIME_STAMP_COUNTER  CPU cycle counter (also accessible via RDTSC)
0xC0000080  IA32_EFER                Long mode enable, NX bit enable
0xC0000081  IA32_STAR                SYSCALL/SYSRET segment selectors
0xC0000082  IA32_LSTAR               SYSCALL entry point (64-bit)
0xC0000100  IA32_FS_BASE             FS segment base address
0xC0000101  IA32_GS_BASE             GS segment base address

User-Space: RDTSC

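; RDTSC reads the time-stamp counter (MSR 0x10) and is normally
; allowed in user mode (Ring 3) unless the OS sets CR4.TSD
; Returns 64-bit timestamp in EDX:EAX

rdtsc                ; EDX:EAX = timestamp
shl rdx, 32          ; Move EDX to upper 32 bits
or rax, rdx          ; Combine into RAX

; Or use RDTSCP, which waits for earlier instructions to execute
; (it is not fully serializing) and also returns IA32_TSC_AUX
rdtscp               ; EDX:EAX = timestamp, ECX = IA32_TSC_AUX (OS-defined, typically a CPU ID)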