x86 Assembly Series Part 4: Instruction Encoding & Binary Layout

February 6, 2026 Wasil Zafar 30 min read

Dive deep into how x86 assembly instructions are encoded into machine code. Learn about opcodes, prefixes, ModRM byte, SIB byte, displacements, immediates, and how disassemblers decode variable-length instructions.

Table of Contents

  1. Instruction Format
  2. ModRM & SIB
  3. Displacement & Immediate
  4. Little-Endian Representation
  5. REX Prefixes (x86-64)
  6. Variable-Length Decoding
  7. How Disassemblers Work

Instruction Format

Key Concept: x86 instructions are variable-length (1-15 bytes). Understanding encoding is essential for writing shellcode, analyzing malware, and understanding compiler output.

Format Overview

Structure

x86 Instruction Layout

[Prefixes] [REX] [Opcode] [ModRM] [SIB] [Displacement] [Immediate]
 0-4 bytes  0-1   1-3      0-1     0-1    0,1,2,4        0,1,2,4,8

Each component is optional depending on the instruction. Only the opcode is always present.

Opcodes

The opcode identifies the operation to perform. Opcodes can be 1, 2, or 3 bytes:

x86 variable-length instruction format — optional prefixes, REX byte, 1–3 byte opcode, ModRM, SIB, displacement, and immediate fields (1–15 bytes total)
1-byte opcodes: Most common instructions
  90       = NOP
  C3       = RET
  50-57    = PUSH reg (reg encoded in opcode itself)
  B8-BF    = MOV reg, imm32 (reg encoded in low 3 bits)

2-byte opcodes: 0F escape byte
  0F 84    = JE rel32 (conditional jump)
  0F AF    = IMUL r32, r/m32
  0F B6    = MOVZX

3-byte opcodes: 0F 38 or 0F 3A escape bytes
  0F 38 F0 = MOVBE (byte-swap load)
  0F 3A 0F = PALIGNR (SSSE3 shuffle)

Opcode Maps

Primary opcode byte (partial map):

        x0   x1   x2   x3   x4   x5   x6   x7   x8   x9   xA   xB   xC   xD   xE   xF
  0x  ADD  ADD  ADD  ADD  ADD  ADD  PUSH POP  OR   OR   OR   OR   OR   OR  PUSH 2-byte
  1x  ADC  ADC  ADC  ADC  ADC  ADC  PUSH POP  SBB  SBB  SBB  SBB  SBB  SBB  PUSH POP
  ...
  5x  PUSH PUSH PUSH PUSH PUSH PUSH PUSH PUSH POP  POP  POP  POP  POP  POP  POP  POP
  ...
  Bx  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV  MOV

Exercise: Decode an Opcode

# Disassemble a single instruction
echo -ne '\xB8\x2A\x00\x00\x00' | ndisasm -b 64 -
# Output: mov eax,0x2a

# Breakdown: B8 = MOV EAX, imm32 (B8 + register 0)
# 2A 00 00 00 = 0x0000002A in little-endian = 42
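
The same breakdown can be reproduced in a few lines of Python. This is a minimal sketch that handles only the B8+r form (MOV r32, imm32); the function name and the register table are illustrative, not part of any disassembler API.

```python
import struct

# 32-bit register names indexed by their 3-bit register number
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_r32_imm32(code: bytes) -> str:
    """Decode the 5-byte B8+r encoding only; anything else raises ValueError."""
    if len(code) < 5 or not 0xB8 <= code[0] <= 0xBF:
        raise ValueError("not a MOV r32, imm32 encoding")
    reg = code[0] & 0b111                        # register lives in the low 3 bits
    (imm,) = struct.unpack_from("<I", code, 1)   # little-endian unsigned 32-bit
    return f"mov {REG32[reg]},{imm:#x}"


print(decode_mov_r32_imm32(bytes.fromhex("B82A000000")))  # mov eax,0x2a
```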

Instruction Prefixes

Prefixes modify instruction behavior. They must appear before the opcode:

Legacy Prefix Groups

| Group                  | Bytes                  | Purpose                                                          |
| Group 1 (Lock/Rep)     | F0, F2, F3             | LOCK, REPNE/REPNZ, REP/REPE/REPZ                                 |
| Group 2 (Segment)      | 26, 2E, 36, 3E, 64, 65 | ES, CS, SS, DS, FS, GS override                                  |
| Group 3 (Operand Size) | 66                     | Toggle 16/32-bit operand size                                    |
| Group 4 (Address Size) | 67                     | Toggle address size (16/32-bit in legacy modes, 64/32-bit in 64-bit mode) |
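
The prefix groups map naturally onto a lookup table. The sketch below strips leading legacy prefixes from a byte string and reports which group each belongs to; `split_prefixes` and the table layout are hypothetical names chosen for illustration, and REX is deliberately left out because it is not a legacy prefix.

```python
# Legacy prefix byte -> (group, meaning), following the table above
LEGACY_PREFIXES = {
    0xF0: (1, "LOCK"), 0xF2: (1, "REPNE/REPNZ"), 0xF3: (1, "REP/REPE/REPZ"),
    0x26: (2, "ES"), 0x2E: (2, "CS"), 0x36: (2, "SS"),
    0x3E: (2, "DS"), 0x64: (2, "FS"), 0x65: (2, "GS"),
    0x66: (3, "operand-size override"),
    0x67: (4, "address-size override"),
}


def split_prefixes(code: bytes):
    """Return (list of (group, meaning), index of the first non-prefix byte)."""
    found = []
    i = 0
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        found.append(LEGACY_PREFIXES[code[i]])
        i += 1
    return found, i


print(split_prefixes(bytes.fromhex("668B03")))   # ([(3, 'operand-size override')], 1)
```

For `F3 48 AB` (rep stosq) the function stops at index 1, because 48 is a REX prefix that a fuller decoder would handle next.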

Operand-Size Override (66h)

; In 64-bit mode, default operand size is 32-bit
mov eax, [rbx]           ; 8B 03           (32-bit)
mov ax, [rbx]            ; 66 8B 03        (66h makes it 16-bit)
mov rax, [rbx]           ; 48 8B 03        (REX.W makes it 64-bit)

Rep Prefixes for String Operations

; F3 = REP prefix
rep movsb    ; F3 A4 - Copy RCX bytes from [RSI] to [RDI]
rep stosq    ; F3 48 AB - Fill RCX quadwords at [RDI] with RAX

; F2 = REPNE prefix
repne scasb  ; F2 AE - Scan [RDI] for AL, stop on a match or when RCX reaches 0

LOCK Prefix for Atomics

; F0 = LOCK prefix (guarantees atomic read-modify-write)
lock inc dword [counter]     ; F0 FF 05 ... - Atomic increment
lock cmpxchg [mutex], ecx    ; F0 0F B1 0D ... - Atomic compare-exchange
Prefix Rules:
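  • At most one prefix from each group is meaningful on a single instruction
  • Legacy prefixes may appear in any order, but a REX prefix must come last, immediately before the opcode
  • Invalid LOCK usage (for example, a register destination) causes a #UD (Invalid Opcode) exception
  • 66h, F2h, and F3h also act as mandatory prefixes that select SSE variants (e.g. 66 0F 58 = ADDPD, the packed double-precision add)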

ModRM & SIB Bytes

ModRM Byte Encoding

Encoding

ModRM Structure (8 bits)

| Mod (2 bits) | Reg (3 bits) | R/M (3 bits) |
|    7-6       |     5-3      |     2-0      |
  • Mod: Addressing mode (00=memory, no displacement; 01=memory+disp8; 10=memory+disp32; 11=register)
  • Reg: Register operand or opcode extension
  • R/M: Register/Memory operand
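
Splitting a ModRM byte into its three fields is plain shifting and masking. A minimal sketch (the helper name `split_modrm` is an assumption made for this example):

```python
def split_modrm(byte: int):
    """Return the (mod, reg, rm) fields of a ModRM byte."""
    mod = (byte >> 6) & 0b11    # bits 7-6
    reg = (byte >> 3) & 0b111   # bits 5-3
    rm = byte & 0b111           # bits 2-0
    return mod, reg, rm


print(split_modrm(0xD8))  # (3, 3, 0): register-direct, reg=EBX, r/m=EAX
print(split_modrm(0x43))  # (1, 0, 3): [RBX + disp8], reg=EAX
```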

SIB Byte (Scale-Index-Base)

The SIB byte enables complex addressing modes: [base + index*scale + disp]

ModRM and SIB byte encoding — bit field breakdown showing how Mod, Reg, R/M, Scale, Index, and Base fields encode operand addressing modes
SIB Structure (8 bits):
| Scale (2 bits) | Index (3 bits) | Base (3 bits) |
|     7-6        |      5-3       |      2-0      |

Scale values:
  00 = ×1 (no scaling)
  01 = ×2
  10 = ×4
  11 = ×8

Special cases:
  Index = 100 (RSP): No index register used
  Base = 101 with Mod=00: No base register, a disp32 follows (absolute address;
                          RIP-relative is the no-SIB form ModRM Mod=00, R/M=101)
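
The field layout and both special cases can be checked with a short sketch. It assumes 64-bit addressing with no REX.X or REX.B extension and ignores the disp8/disp32 added by Mod=01/10; `describe_sib` is a hypothetical helper, not a library function.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def describe_sib(sib: int, mod: int) -> str:
    """Render the base/index/scale part of an address from a SIB byte."""
    scale = 1 << (sib >> 6)          # 00,01,10,11 -> x1,x2,x4,x8
    index = (sib >> 3) & 0b111
    base = sib & 0b111
    parts = []
    if base == 0b101 and mod == 0b00:
        parts.append("disp32")       # special case: no base register
    else:
        parts.append(REG64[base])
    if index != 0b100:               # special case: 100 means no index
        parts.append(f"{REG64[index]}*{scale}")
    return "[" + " + ".join(parts) + "]"


print(describe_sib(0x8B, mod=0))  # [rbx + rcx*4]
print(describe_sib(0x8D, mod=0))  # [disp32 + rcx*4]
print(describe_sib(0x24, mod=0))  # [rsp]
```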

SIB Examples

; Array access: arr[i*4]
mov eax, [rbx + rcx*4]        ; SIB = 10_001_011 = 0x8B
                               ; Scale=10(×4), Index=001(RCX), Base=011(RBX)

; 2D array: arr[row][col] where sizeof(row) = 8
mov eax, [rdi + rsi*8 + 16]   ; Base=RDI, Index=RSI, Scale=8, Disp=16

; No base, just scaled index + displacement
mov eax, [rcx*4 + table]      ; ModRM Mod=00, R/M=100 triggers SIB
                               ; SIB Base=101 means no base, use disp32

Decode Challenge

# Instruction: 8B 04 8D 00 00 00 00
# 8B = MOV r32, r/m32
# ModRM 04 = Mod=00, Reg=000(EAX), R/M=100(SIB follows)
# SIB 8D = Scale=10(×4), Index=001(ECX), Base=101(disp32, no base)
# Disp32 = 00 00 00 00
# Result: mov eax, [ecx*4 + 0x0]  (decoded as 32-bit code; in 64-bit mode the index is RCX)
Why SIB Matters: Array indexing (arr[i]) compiles to scaled addressing. Understanding SIB helps you read disassembly and optimize memory access patterns for cache efficiency.

Displacement & Immediate

These fields encode constant values embedded in the instruction.

Displacement and immediate fields — how constant offsets (8/16/32-bit) and literal values are encoded within x86 instruction bytes

Displacement (Memory Offset)

Displacement size depends on ModRM Mod field:
  Mod = 00: No displacement (except R/M=101 or SIB base=101, which take a disp32)
  Mod = 01: 8-bit signed displacement (sign-extended)
  Mod = 10: 32-bit displacement (or 16-bit in 16-bit mode)
  Mod = 11: No memory access (register-to-register)
; No displacement
mov eax, [rbx]            ; ModRM = 03 (Mod=00)

; 8-bit displacement (efficient for small offsets)
mov eax, [rbx + 8]        ; ModRM = 43 (Mod=01), Disp8 = 08

; 32-bit displacement (required for large offsets)
mov eax, [rbx + 0x1000]   ; ModRM = 83 (Mod=10), Disp32 = 00 10 00 00

; The assembler picks the smallest encoding automatically
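
The Mod rules, including the two Mod=00 special cases, fit in one small function. This sketch assumes 32-bit or 64-bit addressing (16-bit addressing uses different rules); the name `displacement_size` and its parameters are illustrative.

```python
def displacement_size(mod: int, rm: int, sib_base=None) -> int:
    """Number of displacement bytes implied by ModRM (and SIB base, if any)."""
    if mod == 0b11:
        return 0                     # register operand, no memory access
    if mod == 0b01:
        return 1                     # disp8, sign-extended
    if mod == 0b10:
        return 4                     # disp32
    # mod == 00: no displacement, except the two special cases
    if rm == 0b101:
        return 4                     # disp32 (RIP-relative in 64-bit mode)
    if rm == 0b100 and sib_base == 0b101:
        return 4                     # SIB with no base register
    return 0


print(displacement_size(0b00, 0b011))         # 0  -> [rbx]
print(displacement_size(0b01, 0b011))         # 1  -> [rbx + 8]
print(displacement_size(0b10, 0b011))         # 4  -> [rbx + 0x1000]
```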

Immediate Values

; Immediate size matches operand size (usually)
mov al, 42               ; B0 2A  (8-bit immediate)
mov ax, 1000             ; 66 B8 E8 03  (16-bit, little-endian)
mov eax, 0x12345678      ; B8 78 56 34 12  (32-bit)
mov rax, 0x123456789ABC  ; 48 B8 BC 9A 78 56 34 12 00 00  (64-bit!!)

; Sign-extended immediates save space
add rax, 1               ; 48 83 C0 01  (8-bit sign-extended to 64)
add rax, 0x7FFFFFFF      ; 48 05 FF FF FF 7F  (32-bit sign-ext)
; Note: Can't add 64-bit immediate! Must use mov first
64-bit Immediate Limitation: Only MOV reg64, imm64 supports full 64-bit immediates. Other instructions use 32-bit sign-extended immediates. This is why mov rax, big_constant; add rbx, rax is sometimes needed.
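
To make the size selection concrete, here is a sketch that encodes `add rax, imm` the way the listing above shows it: imm8 when the value fits in a signed byte, imm32 when it fits in a signed 32-bit value, and an error otherwise. `encode_add_rax` is a hypothetical helper covering only this one instruction.

```python
def encode_add_rax(value: int) -> bytes:
    """Encode add rax, value using the shortest sign-extended immediate."""
    if -0x80 <= value <= 0x7F:
        # REX.W 83 /0 ib: 8-bit immediate, sign-extended to 64 bits
        return bytes([0x48, 0x83, 0xC0]) + value.to_bytes(1, "little", signed=True)
    if -0x8000_0000 <= value <= 0x7FFF_FFFF:
        # REX.W 05 id: 32-bit immediate, sign-extended to 64 bits
        return bytes([0x48, 0x05]) + value.to_bytes(4, "little", signed=True)
    raise ValueError("does not fit in imm32; load it with mov r64, imm64 first")


print(encode_add_rax(1).hex(" "))            # 48 83 c0 01
print(encode_add_rax(0x7FFFFFFF).hex(" "))   # 48 05 ff ff ff 7f
```

Note that 0x80000000 is rejected: as a sign-extended 32-bit immediate it would become a negative 64-bit value.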

Little-Endian Representation

x86 stores multi-byte values with the least significant byte first. This affects how you read hex dumps.

Little-endian vs big-endian byte ordering — x86 stores the least significant byte at the lowest address, affecting how multi-byte values appear in hex dumps

Little-Endian vs Big-Endian

Value: 0x12345678

Little-Endian (x86):        Big-Endian (Network/MIPS):
Address  Byte               Address  Byte
0x100    78 (LSB)           0x100    12 (MSB)
0x101    56                 0x101    34
0x102    34                 0x102    56
0x103    12 (MSB)           0x103    78 (LSB)

Memory view: 78 56 34 12    Memory view: 12 34 56 78
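
Python's `int.to_bytes` and `int.from_bytes` reproduce both memory views directly. A small sketch (the helper name `memory_view` is made up for this example):

```python
def memory_view(value: int, size: int, byteorder: str) -> str:
    """Show the bytes of value in address order for the given byte order."""
    return value.to_bytes(size, byteorder).hex(" ").upper()


print(memory_view(0x12345678, 4, "little"))  # 78 56 34 12
print(memory_view(0x12345678, 4, "big"))     # 12 34 56 78

# Reading a little-endian field back out of a hex dump
print(int.from_bytes(bytes.fromhex("2A000000"), "little"))  # 42
```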

Practical Implications

; Instruction: mov eax, 0xDEADBEEF
; Encoding: B8 EF BE AD DE
;           ^^ opcode
;              ^^ ^^ ^^ ^^ immediate in little-endian!

; When you see this in a hex dump:
; 48 C7 C0 2A 00 00 00
; It's: mov rax, 0x0000002A (42 decimal)
; NOT: mov rax, 0x2A000000

Exercise: Read Addresses in Hex Dumps

# Disassemble with address display
echo -ne '\xE9\x1B\x00\x00\x00' | ndisasm -b 64 -
# Output: jmp near 0x20

# The offset 0x0000001B + instruction length (5) = 0x20
# Bytes E9 1B 00 00 00 = JMP rel32
# 1B 00 00 00 in little-endian = 0x0000001B
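
The target arithmetic can be written out as a short sketch. The rel32 field is a signed little-endian value added to the address of the next instruction; `jmp_rel32_target` is an illustrative name and handles only the E9 form.

```python
import struct


def jmp_rel32_target(code: bytes, address: int = 0) -> int:
    """Return the target of a 5-byte E9 rel32 jump located at address."""
    if len(code) < 5 or code[0] != 0xE9:
        raise ValueError("not a JMP rel32 encoding")
    (rel,) = struct.unpack_from("<i", code, 1)   # signed 32-bit, little-endian
    return address + 5 + rel                     # relative to the next instruction


print(hex(jmp_rel32_target(bytes.fromhex("E91B000000"))))          # 0x20
print(hex(jmp_rel32_target(bytes.fromhex("E9FBFFFFFF"), 0x100)))   # 0x100 (jumps to itself)
```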
Network Programming: Network protocols use big-endian ("network byte order"). Use the bswap instruction (32/64-bit registers) or the htons()/htonl() and ntohs()/ntohl() functions when sending or receiving multi-byte values over the network.

REX Prefixes (x86-64)

REX prefixes enable 64-bit operands and access to registers R8-R15.

REX Byte Structure

REX prefix: 0100 WRXB (0x40-0x4F)

  Bit 3 (W): 64-bit operand size (instead of default 32-bit)
  Bit 2 (R): Extends ModRM.reg to 4 bits (access R8-R15)
  Bit 1 (X): Extends SIB.index to 4 bits
  Bit 0 (B): Extends ModRM.r/m or SIB.base to 4 bits

REX prefix values:
  40 = REX (enables new 8-bit registers like SIL)
  48 = REX.W (64-bit operand)
  41 = REX.B (extended R/M)
  44 = REX.R (extended Reg)
  4D = REX.WRB (64-bit + extended Reg + extended R/M)
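
Decoding the four flag bits is a matter of masking the low nibble. A minimal sketch, with `decode_rex` as a hypothetical helper name:

```python
def decode_rex(byte: int) -> dict:
    """Split a REX prefix (0x40-0x4F) into its W, R, X, B flags."""
    if (byte & 0xF0) != 0x40:
        raise ValueError("not a REX prefix")
    return {
        "W": bool(byte & 0b1000),   # 64-bit operand size
        "R": bool(byte & 0b0100),   # extends ModRM.reg
        "X": bool(byte & 0b0010),   # extends SIB.index
        "B": bool(byte & 0b0001),   # extends ModRM.r/m or SIB.base
    }


print(decode_rex(0x48))  # {'W': True, 'R': False, 'X': False, 'B': False}
print(decode_rex(0x4D))  # {'W': True, 'R': True, 'X': False, 'B': True}
```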

REX Examples

; Without REX
mov eax, ebx              ; 89 D8 (32-bit, uses low 8 registers)

; REX.W for 64-bit operand
mov rax, rbx              ; 48 89 D8 (64-bit)

; REX.R to access R8-R15 in Reg field (for opcode 89, Reg is the source)
mov eax, r8d              ; 44 89 C0 (R8D in Reg)
mov rax, r8               ; 4C 89 C0 (64-bit, REX.WR)

; REX.B to access R8-R15 in R/M field (for opcode 89, R/M is the destination)
mov r8d, eax              ; 41 89 C0 (R8D in R/M)
mov r8, rax               ; 49 89 C0 (64-bit, REX.WB)

; REX.X for SIB index extension
mov eax, [rbx + r8*4]     ; 42 8B 04 83 (REX.X)

REX and 8-bit Registers

; REX presence changes 8-bit register encoding!
; Without REX: AH, CH, DH, BH accessible (codes 4-7)
; With REX: SPL, BPL, SIL, DIL accessible (codes 4-7)

mov ah, 5                 ; B4 05 (no REX, AH = code 4, B0+4)
mov spl, 5                ; 40 B4 05 (REX enables SPL = code 4)
mov r8b, 5                ; 41 B0 05 (REX.B for R8B)
Incompatibility: You cannot use AH, BH, CH, DH in the same instruction with R8-R15 or the new low-byte registers (SIL, DIL, BPL, SPL). REX presence blocks the high-byte encodings.
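
The way a REX prefix changes the meaning of register codes 4-7 can be sketched as a table lookup. `reg8_name` and its parameters are assumptions made for this example; `ext_bit` stands for whichever of REX.R or REX.B extends the field being decoded.

```python
LEGACY_8BIT = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REX_8BIT = (["al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil"]
            + [f"r{n}b" for n in range(8, 16)])


def reg8_name(code: int, rex_present: bool, ext_bit: int = 0) -> str:
    """Name the 8-bit register selected by a 3-bit code plus optional REX bit."""
    if not rex_present:
        return LEGACY_8BIT[code]          # codes 4-7 are AH, CH, DH, BH
    return REX_8BIT[code | (ext_bit << 3)]  # codes 4-7 become SPL, BPL, SIL, DIL


print(reg8_name(4, rex_present=False))            # ah
print(reg8_name(4, rex_present=True))             # spl
print(reg8_name(0, rex_present=True, ext_bit=1))  # r8b
```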

Variable-Length Decoding

x86's variable-length instructions (1-15 bytes) create unique challenges for both CPUs and disassemblers.

Variable-length instruction decoding — how the CPU decoder parses prefixes, opcodes, and operand fields from a continuous byte stream to determine instruction boundaries

CPU Decoding Process

Modern x86 Decoder Pipeline:

1. Fetch: Load 16-32 bytes from instruction stream
          (aligned fetch, branch prediction critical)

2. Pre-decode: Find instruction boundaries
              - Scan for prefixes (0x40-0x4F = REX, 0x66, 0xF2, etc.)
              - Identify opcode (1, 2, or 3 bytes: 0F, 0F38, 0F3A)
              - Determine if ModRM/SIB/Disp/Imm follow

3. Decode: Convert to micro-ops
          - Simple: 1 µop (mov, add register)
          - Complex: Multiple µops via microcode ROM

4. Queue: Place µops in execution queue

Instruction Length Determination

Length Calculation Algorithm:

length = 0

// Count legacy prefixes (Groups 1-4), then at most one REX
while (byte is legacy prefix):
    length++
    advance
if byte is REX (0x40-0x4F, 64-bit mode only):
    length++
    advance

// Opcode (1-3 bytes)
length++                      // first opcode byte, or the 0x0F escape
if byte == 0x0F:
    length++                  // second byte: opcode, or 0x38/0x3A escape
    if next == 0x38 or next == 0x3A:
        length++              // third byte: the actual opcode

// ModRM present? (depends on opcode)
if opcode_needs_modrm:
    parse_modrm()
    length++
    if modrm.mod != 11b and modrm.rm == 100b:  // SIB follows
        parse_sib()
        length++
    length += displacement_size(modrm.mod, modrm.rm, sib.base)

// Immediate? (depends on opcode, operand-size prefix, and REX.W)
length += immediate_size(opcode, prefixes)
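
As a runnable counterpart, the sketch below implements the same steps in Python for 64-bit mode. It is deliberately tiny: the opcode tables cover only a handful of instructions used in this article, 3-byte opcodes and VEX are not handled, and unknown opcodes raise `KeyError`. The table names and the "z" marker for an operand-size-dependent immediate are conventions invented for this example.

```python
LEGACY_PREFIXES = {0xF0, 0xF2, 0xF3, 0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65, 0x66, 0x67}

# opcode -> (has_modrm, immediate); immediate is a byte count, or "z" for
# 16/32-bit depending on operand size. A hypothetical mini-subset only.
ONE_BYTE = {
    0x90: (False, 0), 0xC3: (False, 0),      # nop, ret
    0xEB: (False, 1), 0xE9: (False, 4),      # jmp rel8, jmp rel32
    0x89: (True, 0), 0x8B: (True, 0),        # mov r/m, r and mov r, r/m
    0x83: (True, 1), 0xC7: (True, "z"),      # ALU r/m, imm8 and mov r/m, imm
}
TWO_BYTE = {0x84: (False, 4), 0xAF: (True, 0), 0xB6: (True, 0)}  # je, imul, movzx


def instruction_length(code: bytes) -> int:
    i = 0
    opsize16 = rex_w = False
    while code[i] in LEGACY_PREFIXES:            # legacy prefixes, any order
        opsize16 = opsize16 or code[i] == 0x66
        i += 1
    if (code[i] & 0xF0) == 0x40:                 # REX, directly before opcode
        rex_w = bool(code[i] & 0x08)
        i += 1
    opcode = code[i]
    i += 1
    if opcode == 0x0F:                           # two-byte opcode
        has_modrm, imm = TWO_BYTE[code[i]]
        i += 1
    elif 0xB8 <= opcode <= 0xBF:                 # mov reg, imm16/32/64
        has_modrm, imm = False, (8 if rex_w else "z")
    elif 0x50 <= opcode <= 0x5F:                 # push/pop reg
        has_modrm, imm = False, 0
    else:
        has_modrm, imm = ONE_BYTE[opcode]
    if imm == "z":
        imm = 2 if (opsize16 and not rex_w) else 4
    if has_modrm:
        modrm = code[i]
        i += 1
        mod, rm = modrm >> 6, modrm & 0b111
        if mod != 0b11 and rm == 0b100:          # SIB follows
            sib_base = code[i] & 0b111
            i += 1
            if mod == 0b00 and sib_base == 0b101:
                i += 4                           # no base, disp32
        elif mod == 0b00 and rm == 0b101:
            i += 4                               # RIP-relative disp32
        if mod == 0b01:
            i += 1                               # disp8
        elif mod == 0b10:
            i += 4                               # disp32
    return i + imm


print(instruction_length(bytes.fromhex("8B048D00000000")))         # 7
print(instruction_length(bytes.fromhex("48B8BC9A785634120000")))   # 10
```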

Decoding Challenges

| Challenge        | Problem                                  | Solution                                                    |
| Variable length  | Can't know length until parsing          | Speculative pre-decode, large fetch                         |
| Prefix stacking  | Up to 4 legacy prefixes, then REX or VEX | Prefix decoder state machine                                |
| Opcode ambiguity | Same byte means different things         | Context-dependent decode tables                             |
| Branch targets   | May land mid-instruction                 | Re-decode from the target; decoded results are valid only for the entry address they were decoded from |
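Self-Modifying Code: x86 allows code modification, and the hardware keeps its instruction caches coherent with such writes, at the cost of a pipeline flush. When one core modifies code that another core will execute (cross-modifying code), a serializing instruction such as CPUID is typically executed before the new bytes run.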

How Disassemblers Work

Disassemblers face the inverse problem: given machine code bytes, recover the original assembly.

Linear Sweep Disassembly

Algorithm:
1. Start at the first byte of the code section
2. Decode instruction, advance by its length
3. Repeat until end of section

Pros: Simple, fast
Cons: Fooled by:
  - Data embedded in code
  - Jump tables
  - Anti-disassembly tricks
# objdump uses linear sweep
objdump -d ./binary

# Problem: if code jumps over data, linear sweep reads data as code
# 00001000: jmp 0x1007   (2 bytes: EB 05)
# 00001002: db "HELLO"   ← Linear sweep tries to decode as instructions!
# 00001007: mov eax, 1   ← Real code continues

Recursive Descent Disassembly

Algorithm:
1. Start at entry point, add to work list
2. While work list not empty:
   a. Pop address
   b. Decode instructions sequentially from that address
   c. If unconditional jump: add target to work list, stop this path
   d. If conditional jump: add the target, continue with the fall-through
   e. If call: add the target, continue with the next instruction
   f. If ret: stop this path

Pros: Follows actual control flow, skips embedded data
Cons:
  - Indirect jumps (jmp rax) can't be resolved statically
  - May miss dead code
  - Doesn't handle computed jump tables easily
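
The difference between the two strategies shows up clearly on a small input. The sketch below runs both over the bytes `EB 05 "HELLO" B8 01 00 00 00 C3` using a toy decoder that knows only a few opcodes and treats every other byte as a one-byte `db`. The decoder, its opcode subset, and all function names are assumptions for illustration; a real disassembler would decode the data bytes as actual (wrong) instructions.

```python
def decode(code: bytes, off: int):
    """Toy decoder: return (mnemonic, length, branch_target or None)."""
    op = code[off]
    if op in (0xEB, 0x74):                         # jmp rel8 / jz rel8
        rel = int.from_bytes(code[off + 1:off + 2], "little", signed=True)
        return ("jmp" if op == 0xEB else "jz"), 2, off + 2 + rel
    if op == 0xE8:                                 # call rel32
        rel = int.from_bytes(code[off + 1:off + 5], "little", signed=True)
        return "call", 5, off + 5 + rel
    if 0xB8 <= op <= 0xBF:                         # mov r32, imm32
        return "mov", 5, None
    if op == 0xC3:
        return "ret", 1, None
    if op == 0x90:
        return "nop", 1, None
    return "db", 1, None                           # unknown byte


def linear_sweep(code: bytes):
    off, out = 0, []
    while off < len(code):
        mnemonic, length, _ = decode(code, off)
        out.append((off, mnemonic))
        off += length                              # blindly continue after it
    return out


def recursive_descent(code: bytes, entry: int = 0):
    work, seen = [entry], {}
    while work:
        off = work.pop()
        while 0 <= off < len(code) and off not in seen:
            mnemonic, length, target = decode(code, off)
            seen[off] = mnemonic
            if mnemonic == "ret":
                break                              # path ends
            if target is not None:
                work.append(target)                # follow the branch later
            if mnemonic == "jmp":
                break                              # no fall-through
            off += length
    return sorted(seen.items())


blob = b"\xEB\x05HELLO\xB8\x01\x00\x00\x00\xC3"
print(linear_sweep(blob))       # visits offsets 2-6, the "HELLO" bytes
print(recursive_descent(blob))  # [(0, 'jmp'), (7, 'mov'), (12, 'ret')]
```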

Anti-Disassembly Techniques

; Opaque predicate: always true, but disassembler doesn't know
mov eax, 1
test eax, eax
jz fake_branch        ; Never taken, but disassembler follows it
                      ; Real code here
fake_branch:
  db 0xE8             ; Looks like CALL, corrupts next instruction decode

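; Jump into middle of instruction
jmp short hidden      ; EB 01 - skip the junk byte below
db 0xB8               ; Junk: linear sweep decodes B8 90 90 90 90 as mov eax, 0x90909090
hidden:
times 4 nop           ; 90 90 90 90 - real code, hidden inside the fake immediate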

Disassembler Tools

| Tool             | Method                 | Best For                               |
| objdump          | Linear sweep           | Quick inspection, well-formed binaries |
| ndisasm          | Linear sweep           | Raw binary blobs, boot sectors         |
| IDA Pro / Ghidra | Recursive + heuristics | Malware, obfuscated code, full RE      |
| Capstone library | On-demand              | Building custom tools, dynamic analysis |

Exercise: Compare Disassemblers

# Create a binary with embedded data
echo -ne '\xEB\x05HELLO\xB8\x01\x00\x00\x00\xC3' > test.bin

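# Linear sweep (ndisasm), decoding as 32-bit code
ndisasm -b 32 test.bin
# 00000000  EB05      jmp short 0x7
# 00000002  48        dec eax        ← 'H' decoded as instruction!
# ...

# The jmp skips "HELLO", but linear sweep decodes those bytes anyway.
# With -b 64 the garbage looks different: 0x48 ('H') is a REX.W prefix
# in 64-bit mode, not dec eax.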