x86 Assembly Series Part 11: Floating-Point & x87/SSE

IEEE 754 Representation

                        
IEEE 754: The standard for floating-point arithmetic. A floating-point value consists of a sign bit, a biased exponent, and a mantissa (fraction). Understanding this format is essential for debugging FP code.
                    

Figure: IEEE 754 floating-point bit layout. IEEE 754 encodes floating-point numbers as three fields: a sign bit, a biased exponent, and a normalized mantissa (fraction).

Single Precision (32-bit float)

Format

IEEE 754 Single Precision

| Sign (1 bit) | Exponent (8 bits) | Mantissa (23 bits) |
|     31       |      30-23        |        22-0        |

Value = (-1)^S × 1.M × 2^(E-127)   (normalized numbers, 1 ≤ E ≤ 254)

Examples:
1.0  = 0x3F800000 = 0 01111111 00000000000000000000000
-2.0 = 0xC0000000 = 1 10000000 00000000000000000000000
3.14 ≈ 0x4048F5C3
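The field layout above can be checked from C by copying a float's bytes into an integer and masking out each field. This is a minimal sketch, assuming a platform where `float` is the 32-bit IEEE 754 single format (true on x86); the names `f32_bits`, `f32_decode`, and `f32_fields` are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three IEEE 754 single-precision fields. */
typedef struct {
    uint32_t sign;      /* bit 31 */
    uint32_t exponent;  /* bits 30-23, still biased by 127 */
    uint32_t mantissa;  /* bits 22-0, fraction without the implicit leading 1 */
} f32_fields;

/* Return the raw bit pattern of a float. memcpy avoids aliasing problems. */
uint32_t f32_bits(float f) {
    uint32_t bits;
    assert(sizeof bits == sizeof f);   /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Split the bit pattern into sign, biased exponent, and mantissa. */
f32_fields f32_decode(float f) {
    uint32_t bits = f32_bits(f);
    f32_fields out;
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

For example, `f32_decode(-2.0f)` yields sign 1, biased exponent 128, and mantissa 0, which matches the `0xC0000000` pattern listed above.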

Double Precision (64-bit double)

Format

IEEE 754 Double Precision

| Sign (1 bit) | Exponent (11 bits) | Mantissa (52 bits) |
|     63       |       62-52        |        51-0        |

Value = (-1)^S × 1.M × 2^(E-1023)   (normalized numbers, 1 ≤ E ≤ 2046)

Examples:
1.0  = 0x3FF0000000000000
-2.0 = 0xC000000000000000
π ≈ 3.141592653589793 = 0x400921FB54442D18

Special Values:
+∞  = 0x7FF0000000000000 (exponent all 1s, mantissa 0)
-∞  = 0xFFF0000000000000
NaN = 0x7FF8000000000000 (one quiet-NaN encoding; any pattern with exponent all 1s and a non-zero mantissa is a NaN)
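The special-value rules reduce to two field tests: the exponent must be all 1s, and the mantissa decides between infinity and NaN. The sketch below applies those tests to a raw 64-bit pattern. It assumes `double` is the 64-bit IEEE 754 format; `f64_bits`, `f64_classify`, and the `f64_class` enum are hypothetical names, not part of any library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum f64_class { F64_FINITE, F64_INFINITY, F64_NAN };

/* Return the raw bit pattern of a double. */
uint64_t f64_bits(double d) {
    uint64_t bits;
    assert(sizeof bits == sizeof d);   /* assumption: double is 64 bits wide */
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Classify a double-precision bit pattern using only its fields. */
enum f64_class f64_classify(uint64_t bits) {
    uint64_t exponent = (bits >> 52) & 0x7FFu;               /* bits 62-52 */
    uint64_t mantissa = bits & ((UINT64_C(1) << 52) - 1);    /* bits 51-0  */
    if (exponent != 0x7FFu) {
        return F64_FINITE;              /* zero, denormal, or normal number */
    }
    return mantissa == 0 ? F64_INFINITY : F64_NAN;
}
```

The sign bit plays no part in the classification, so both `0x7FF0000000000000` and `0xFFF0000000000000` are reported as infinity.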

Precision Comparison

Type	Bits	Significant Digits	Range
Float (single)	32	~7	±1.18×10⁻³⁸ to ±3.4×10³⁸
Double	64	~15-16	±2.23×10⁻³⁰⁸ to ±1.8×10³⁰⁸
x87 Extended	80	~19	±3.36×10⁻⁴⁹³² to ±1.19×10⁴⁹³²
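One way to make the "significant digits" column concrete is to ask which integers survive a round trip through each type. A 24-bit significand holds every integer up to 2^24 = 16777216, and a 53-bit significand every integer up to 2^53. The sketch below assumes IEEE 754 `float` and `double`; the helper names are hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Sanity check of the assumption: 24-bit and 53-bit significands. */
void check_ieee_widths(void) {
    assert(FLT_MANT_DIG == 24);
    assert(DBL_MANT_DIG == 53);
}

/* Does the integer n survive a round trip through float? */
int fits_exactly_in_float(long long n) {
    volatile float f = (float)n;     /* volatile forces rounding to 32 bits */
    return (long long)f == n;
}

/* Does the integer n survive a round trip through double? */
int fits_exactly_in_double(long long n) {
    volatile double d = (double)n;   /* volatile forces rounding to 64 bits */
    return (long long)d == n;
}
```

Under these assumptions 16777217 (2^24 + 1) is the first positive integer that a float cannot hold, while a double still represents it exactly.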

x87 FPU (Legacy)

FPU Stack Model

                        
Stack-Based: x87 uses an 8-register stack (ST0-ST7). Most operations work on ST0 (top of stack). Loads push a new ST0, and the popping forms (FSTP, FADDP) remove ST0 once the operation completes.
                    

Figure: x87 FPU register stack, ST0 through ST7, with push and pop operations. The x87 FPU operates on an 8-deep register stack (ST0-ST7) where ST0 is always the top: FLD pushes and FSTP pops.

FPU Operations

section .data
    value1 dq 3.14159
    value2 dq 2.71828
    result dq 0.0

section .text
    finit               ; Initialize FPU
    fld qword [value1]  ; Push 3.14159; it becomes ST0
    fld qword [value2]  ; Push 2.71828 as the new ST0 (3.14159 moves to ST1)
    faddp st1, st0      ; ST1 = ST1 + ST0, then pop: the sum is now ST0
    fstp qword [result] ; Store ST0 to result and pop (stack is empty again)
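The push and pop behaviour is easier to follow with a toy model. The C sketch below is not how the hardware is implemented; it only mimics the register renaming, where a TOP index selects which physical slot is ST0. All names (`x87_model`, `model_fld`, and so on) are hypothetical, and the model stores `double` values even though real x87 registers hold 80-bit values.

```c
#include <assert.h>

/* Toy model of the x87 register stack. */
typedef struct {
    double st[8];  /* physical slots (80-bit on real hardware) */
    int top;       /* index of the slot currently named ST0 */
    int depth;     /* number of occupied slots */
} x87_model;

void model_finit(x87_model *m) { m->top = 0; m->depth = 0; }

/* Address of ST(i), counted from the current top of stack. */
double *model_st(x87_model *m, int i) { return &m->st[(m->top + i) % 8]; }

/* FLD: decrement TOP, then write the new ST0. */
void model_fld(x87_model *m, double v) {
    assert(m->depth < 8);            /* real hardware signals stack overflow */
    m->top = (m->top + 7) % 8;
    m->st[m->top] = v;
    m->depth++;
}

/* FADDP ST1, ST0: ST1 = ST1 + ST0, then pop. */
void model_faddp(x87_model *m) {
    assert(m->depth >= 2);
    *model_st(m, 1) += *model_st(m, 0);
    m->top = (m->top + 1) % 8;
    m->depth--;
}

/* FSTP: return ST0 and pop it. */
double model_fstp(x87_model *m) {
    assert(m->depth >= 1);
    double v = *model_st(m, 0);
    m->top = (m->top + 1) % 8;
    m->depth--;
    return v;
}
```

Calling `model_fld` twice, then `model_faddp`, then `model_fstp` reproduces the listing above: the two loads occupy ST1 and ST0, the add leaves a single value, and the store empties the stack.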

SSE Scalar Operations

                        
Modern Approach: SSE scalar operations use XMM registers directly (no stack), are generally faster on current CPUs, and match the x86-64 calling conventions, which pass and return floating-point values in XMM registers. Prefer SSE over x87 for new code.
                    

Figure: SSE XMM register with scalar float and double operations in the low lane. SSE scalar arithmetic instructions (ADDSS, MULSD, etc.) operate only on the lowest lane of the 128-bit XMM register, leaving the destination's upper lanes unchanged.

MOVSS & MOVSD

section .data
    fval dd 3.14        ; 32-bit float
    dval dq 3.14159     ; 64-bit double

section .text
    movss xmm0, [fval]  ; Load 32-bit float into XMM0 (upper 96 bits are zeroed)
    movsd xmm1, [dval]  ; Load 64-bit double into XMM1 (upper 64 bits are zeroed)
    
    movss [fval], xmm0  ; Store 32-bit float
    movsd [dval], xmm1  ; Store 64-bit double

Arithmetic: ADDSS, MULSS, etc.

; Scalar single-precision (32-bit)
addss xmm0, xmm1      ; XMM0 = XMM0 + XMM1 (low 32 bits)
subss xmm0, xmm1      ; XMM0 = XMM0 - XMM1
mulss xmm0, xmm1      ; XMM0 = XMM0 * XMM1
divss xmm0, xmm1      ; XMM0 = XMM0 / XMM1
sqrtss xmm0, xmm1     ; XMM0 = sqrt(XMM1)

; Scalar double-precision (64-bit)
addsd xmm0, xmm1      ; Double-precision add
mulsd xmm0, xmm1      ; Double-precision multiply

Conversion Instructions

; Integer to float
cvtsi2ss xmm0, eax    ; Convert int32 to float
cvtsi2sd xmm0, rax    ; Convert int64 to double

; Float to integer (truncate)
cvttss2si eax, xmm0   ; Convert float to int32 (truncate)
cvttsd2si rax, xmm0   ; Convert double to int64 (truncate)

; Float precision conversion
cvtss2sd xmm0, xmm1   ; Float to double
cvtsd2ss xmm0, xmm1   ; Double to float
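The same conversions appear in C as casts, and compilers targeting x86-64 typically lower them to the instructions above (an assumption about common compiler output, not a guarantee). The sketch below illustrates the semantics worth remembering: float-to-integer casts truncate toward zero, int32-to-double and float-to-double are exact, and double-to-float rounds. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating conversion, the behaviour of CVTTSD2SI: rounds toward zero. */
int32_t double_to_int32_trunc(double x) {
    /* Out-of-range casts are undefined in C, so reject them up front. */
    assert(x > -2147483649.0 && x < 2147483648.0);
    return (int32_t)x;
}

/* Every 32-bit integer fits in a double's 53-bit significand, so this is exact. */
double int32_to_double(int32_t n) { return (double)n; }

/* Narrowing rounds to the nearest float; volatile forces the 32-bit rounding. */
float double_to_float(double x) {
    volatile float f = (float)x;
    return f;
}

/* Widening is always exact. */
double float_to_double(float x) { return (double)x; }
```

Note that truncation is not the same as floor: `double_to_int32_trunc(-3.99)` gives -3, not -4.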

Floating-Point Comparison

SSE provides comparison instructions that set CPU flags (like integer CMP):

Instruction	Operands	Sets Flags	NaN Handling
COMISS	xmm, xmm/m32	ZF, PF, CF	Signals invalid (#IA) on QNaN or SNaN
COMISD	xmm, xmm/m64	ZF, PF, CF	Signals invalid (#IA) on QNaN or SNaN
UCOMISS	xmm, xmm/m32	ZF, PF, CF	Signals invalid (#IA) only on SNaN
UCOMISD	xmm, xmm/m64	ZF, PF, CF	Signals invalid (#IA) only on SNaN

section .data
    pi    dq 3.14159
    e     dq 2.71828

section .text
    movsd xmm0, [pi]
    movsd xmm1, [e]
    
    ucomisd xmm0, xmm1    ; Compare pi vs e (quiet NaN handling)
    
    ; Test for unordered (NaN) first: it sets ZF=PF=CF=1, so JB and JE would also fire.
    ; Then use UNSIGNED condition codes (not JG/JL!):
    jp  .unordered        ; Jump if either is NaN (Parity)
    ja  .pi_greater       ; Jump if pi > e (Above)
    jb  .pi_less          ; Jump if pi < e (Below)
    je  .equal            ; Jump if pi == e

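; Checking for NaN: returns EAX = 1 if XMM0 is NaN, else 0
check_nan:
    ucomisd xmm0, xmm0    ; Compare value with itself
    jp .is_nan            ; NaN is unordered even with itself, so PF=1
    xor eax, eax          ; Not NaN
    ret                   ; Return here so we do not fall into .is_nan
.is_nan:
    mov eax, 1            ; Handle NaN case
    ret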
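The flag results of a scalar compare can be modelled in a few lines of C, which makes it easier to see which conditional jumps fire in each case. This is a sketch of the documented ZF/PF/CF outcomes only (it ignores exceptions and the cleared OF, SF, and AF bits); `fp_flags`, `model_ucomisd`, and the `takes_*` helpers are hypothetical names.

```c
#include <assert.h>

typedef struct { int zf, pf, cf; } fp_flags;

/* Model of the flags produced by UCOMISD a, b. */
fp_flags model_ucomisd(double a, double b) {
    fp_flags f = {0, 0, 0};            /* a > b:     ZF=0 PF=0 CF=0 */
    if (a != a || b != b) {            /* unordered: ZF=1 PF=1 CF=1 */
        f.zf = 1; f.pf = 1; f.cf = 1;
    } else if (a < b) {                /* a < b:     ZF=0 PF=0 CF=1 */
        f.cf = 1;
    } else if (a == b) {               /* a == b:    ZF=1 PF=0 CF=0 */
        f.zf = 1;
    }
    return f;
}

/* Conditions tested by the unsigned jumps and by JP. */
int takes_ja(fp_flags f) { return f.cf == 0 && f.zf == 0; }
int takes_jb(fp_flags f) { return f.cf == 1; }
int takes_je(fp_flags f) { return f.zf == 1; }
int takes_jp(fp_flags f) { return f.pf == 1; }

/* Self-check: for ordered inputs exactly one of JA, JB, JE is taken. */
void check_ordered(double a, double b) {
    fp_flags f = model_ucomisd(a, b);
    assert(!takes_jp(f));
    assert(takes_ja(f) + takes_jb(f) + takes_je(f) == 1);
}
```

In this model a NaN operand satisfies JB and JE as well as JP, which is why a JP test has to come before the other branches if NaN inputs are possible.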

                        
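Critical: After a floating-point compare, branch with the unsigned condition codes (JA, JB, JAE, JBE), not the signed ones (JG, JL). COMISx and UCOMISx report their result in CF, ZF, and PF and clear SF and OF, which are the flags the signed jumps test.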
                    

Exercise: Max of Two Doubles

; max_double(xmm0, xmm1) -> xmm0
max_double:
    ucomisd xmm0, xmm1
    ja .done              ; If xmm0 > xmm1, already have max
    movsd xmm0, xmm1      ; Else xmm0 = xmm1
.done:
    ret

; Or use MAXSD, which returns the second operand (xmm1) if either input is NaN:
max_double_v2:
    maxsd xmm0, xmm1      ; xmm0 = (xmm0 > xmm1) ? xmm0 : xmm1
    ret
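Both versions of the exercise behave like the C expression `a > b ? a : b`, including the edge cases. The sketch below models that rule; it is based on the documented behaviour of MAXSD (the second operand is returned when either input is NaN or when both are zeros) and uses hypothetical function names.

```c
#include <assert.h>
#include <string.h>

/* Model of MAXSD dst, src: keep dst only when dst > src is true. */
double model_maxsd(double dst, double src) {
    return dst > src ? dst : src;
}

/* Model of the UCOMISD + JA version of the exercise. */
double model_max_branch(double a, double b) {
    if (a > b) {
        return a;   /* ja .done: taken only when ordered and a > b */
    }
    return b;       /* otherwise: movsd xmm0, xmm1 */
}

/* Self-check: both models return bit-identical results. */
void check_agreement(double a, double b) {
    double x = model_max_branch(a, b);
    double y = model_maxsd(a, b);
    assert(memcmp(&x, &y, sizeof x) == 0);
}
```

One consequence is that the result depends on operand order when a NaN is involved: `model_maxsd(NaN, 1.0)` is 1.0, but `model_maxsd(1.0, NaN)` is NaN. Code that needs a NaN-aware maximum has to test for NaN separately.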

x87 vs SSE: When to Use

Aspect	x87 FPU	SSE/SSE2
Register Model	Stack (ST0-ST7)	Flat (XMM0-XMM15 in 64-bit mode, XMM0-XMM7 in 32-bit mode)
Precision	80-bit extended	32/64-bit only
SIMD Support	None	4 floats or 2 doubles per 128-bit register
Calling Convention	Varies, complex	Clean (XMM0 returns)
Modern Use	Legacy code only	Preferred for new code
Transcendentals	Built-in (FSIN, FCOS)	None (use libraries)

Figure: side-by-side comparison of the x87 stack model and the SSE flat register model. x87 uses an 8-deep register stack with 80-bit precision, while SSE provides 16 flat XMM registers (in 64-bit mode) with 32/64-bit operations and SIMD potential.

Decision Guide

Use SSE/SSE2: New code, performance-critical, ABI compliance, SIMD potential
Use x87: Need 80-bit precision, built-in transcendental functions, legacy code maintenance
Use AVX/AVX-512: When you need 256/512-bit vectors (see Part 12)

; Modern approach (SSE2) - Preferred!
add_doubles_sse:
    movsd xmm0, [value1]
    addsd xmm0, [value2]      ; xmm0 = value1 + value2
    movsd [result], xmm0
    ret

; Legacy approach (x87) - Avoid unless necessary
add_doubles_x87:
    fld qword [value1]        ; Push value1 to ST0
    fadd qword [value2]       ; ST0 = ST0 + value2
    fstp qword [result]       ; Pop ST0 to result
    ret

                        
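Compiler Default: When targeting x86-64, modern compilers (GCC, Clang, MSVC) use SSE2 for float and double by default. x87 code appears mainly when requested explicitly (for example GCC's -mfpmath=387) or for the 80-bit long double that GCC and Clang provide. 32-bit x86 builds may still default to x87, depending on the compiler and its options.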
                    

Continue the Series

Part 10: Integer Arithmetic & Bitwise Operations

Part 12: SIMD – SSE, AVX, AVX-512

Part 22: Performance & Optimization