Series Overview: This is Part 16 of our 18-part STM32 Unleashed series. We now tackle the full spectrum of external storage — SD cards, QSPI NOR flash, wear levelling, and memory-mapped execution — skills that are indispensable for data loggers, firmware update systems, and resource-rich embedded applications.
External Storage Options for STM32
Every serious embedded application eventually outgrows the internal flash of its microcontroller. Whether you need to store megabytes of calibration data, log thousands of sensor readings per second, hold large audio or font assets, or execute code that simply won't fit in the MCU's internal flash, you need external storage. The STM32 family supports a rich set of external storage interfaces — each with different capacity, speed, endurance, and cost trade-offs.
Understanding the full spectrum before committing to a design is critical. The wrong choice leads to either wasted cost (EEPROM used for logging) or catastrophic wear-out (NOR flash used as a swap partition). Here is the complete landscape:
- SD/microSD card (1 GB–2 TB): Removable, FAT32/exFAT filesystem, accessible from a PC out of the box. The STM32 SDMMC peripheral drives the card in 4-bit wide mode at 25 or 50 MHz. Alternatively, any SPI master can drive an SD card in 1-bit SPI mode, though throughput is significantly reduced.
- SPI NOR Flash (W25Q series, 1 MB–256 MB): Byte-addressable reads, page (256-byte) writes, sector (4 KB) erases. Typical endurance is 100,000 erase cycles per sector. The Winbond W25Q128JV (16 MB) is a common choice for embedded designs.
- SPI NAND Flash: Higher density than NOR at lower cost per megabyte. Block-erase only (typically 128 KB blocks), and it requires a wear-levelling layer, bad-block management, and ECC. Reads are page-based rather than byte-addressable, so it is unsuitable for XIP.
- QSPI/OctoSPI NOR Flash: The same NOR flash cell accessed over a 4-bit (QSPI) or 8-bit (OctoSPI) parallel interface. The STM32 QUADSPI peripheral supports memory-mapped mode, allowing the CPU to read flash at 0x90000000 as if it were normal memory — enabling execute-in-place (XIP) for code and read-only data.
- EEPROM (I2C, SPI): Very small (typically 1 KB–512 KB), byte or word writable without a separate erase step, with high write endurance (typically 1,000,000+ cycles). Ideal for configuration data, calibration coefficients, and non-volatile counters, but not for bulk data logging.
| Storage Type | Capacity | Interface | Write Unit | Erase Unit | Endurance | XIP? | Relative Cost |
|---|---|---|---|---|---|---|---|
| microSD (SDMMC) | 1 GB–2 TB | SDMMC 4-bit | 512-byte sector | Cluster | Managed internally | No | Low |
| microSD (SPI) | 1 GB–2 TB | SPI 1-bit | 512-byte sector | Cluster | Managed internally | No | Low |
| SPI NOR Flash | 1 MB–256 MB | SPI up to 104 MHz | Page (256 B) | Sector (4 KB) | 100k cycles/sector | No (standard SPI) | Medium |
| QSPI NOR Flash | 1 MB–256 MB | QUADSPI 4-bit | Page (256 B) | Sector (4 KB) | 100k cycles/sector | Yes | Medium |
| OctoSPI NOR Flash | Up to 512 MB | OctoSPI 8-bit | Page (256 B) | Sector (4 KB) | 100k cycles/sector | Yes | High |
| SPI NAND Flash | 128 MB–8 GB | SPI up to 120 MHz | Page (2 KB) | Block (128 KB) | 100k cycles/block | No | Low |
| I2C EEPROM | 1 KB–512 KB | I2C up to 1 MHz | Byte | None (byte-erase) | 1M+ cycles | No | Low |
The code below shows the handle type declarations for both SDMMC and QSPI — the two interfaces we will use throughout this article:
/* ================================================================
 * External Storage Handle Declarations
 * SDIO for microSD (4-bit, 24 MHz)
 * QUADSPI for W25Q128JV NOR Flash (quad mode, 84 MHz)
 * ================================================================ */
#include "main.h"               /* Error_Handler() from the CubeMX project */
#include "stm32f4xx_hal.h"
#include "stm32f4xx_hal_sd.h"
#include "stm32f4xx_hal_qspi.h"

/* SDIO handle, initialised by MX_SDIO_SD_Init() */
SD_HandleTypeDef hsd;
/* QUADSPI handle, initialised by MX_QUADSPI_Init() */
QSPI_HandleTypeDef hqspi;

/* SDIO init based on CubeMX output (target: 4-bit bus, 24 MHz) */
void MX_SDIO_SD_Init(void)
{
    hsd.Instance                 = SDIO;
    hsd.Init.ClockEdge           = SDIO_CLOCK_EDGE_RISING;
    hsd.Init.ClockBypass         = SDIO_CLOCK_BYPASS_DISABLE;
    hsd.Init.ClockPowerSave      = SDIO_CLOCK_POWER_SAVE_DISABLE;
    /* Start on a 1-bit bus: the card powers up in 1-bit mode, and some
     * HAL versions fail if 4-bit is requested before the card is switched */
    hsd.Init.BusWide             = SDIO_BUS_WIDE_1B;
    hsd.Init.HardwareFlowControl = SDIO_HARDWARE_FLOW_CONTROL_DISABLE;
    hsd.Init.ClockDiv            = 0;   /* SDIO_CK = 48 MHz / (0 + 2) = 24 MHz */
    if (HAL_SD_Init(&hsd) != HAL_OK) {
        Error_Handler();
    }
    /* Switch both the card and the peripheral to the 4-bit bus */
    if (HAL_SD_ConfigWideBusOperation(&hsd, SDIO_BUS_WIDE_4B) != HAL_OK) {
        Error_Handler();
    }
}

/* QUADSPI init based on CubeMX output for W25Q128JV (16 MB, 3-byte addr) */
void MX_QUADSPI_Init(void)
{
    hqspi.Instance                = QUADSPI;
    hqspi.Init.ClockPrescaler     = 1;   /* 168 MHz AHB / (1 + 1) = 84 MHz */
    hqspi.Init.FifoThreshold      = 4;
    hqspi.Init.SampleShifting     = QSPI_SAMPLE_SHIFTING_HALFCYCLE;
    hqspi.Init.FlashSize          = 23;  /* 2^(23 + 1) bytes = 16 MB */
    hqspi.Init.ChipSelectHighTime = QSPI_CS_HIGH_TIME_2_CYCLE;
    hqspi.Init.ClockMode          = QSPI_CLOCK_MODE_0;
    hqspi.Init.FlashID            = QSPI_FLASH_ID_1;
    hqspi.Init.DualFlash          = QSPI_DUALFLASH_DISABLE;
    if (HAL_QSPI_Init(&hqspi) != HAL_OK) {
        Error_Handler();
    }
}
SD Card with SDMMC & FATFS
The STM32 SDMMC peripheral (called SDIO on F4) implements the SD/MMC host interface in hardware. When configured for 4-bit wide bus operation at 25 MHz, it delivers approximately 12.5 MB/s raw throughput (4 bits × 25 MHz / 8), which is more than enough for audio recording, high-rate sensor logging, or firmware update image transfer. Compare this to SPI mode, where a 10 MHz SPI clock on a single data line gives roughly 1.25 MB/s raw.
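The raw figures quoted above follow from one bit per data line per clock. A minimal sketch of that arithmetic, where `bus_raw_bytes_per_sec` is a hypothetical helper name and protocol overhead (commands, CRC, card busy time) is deliberately ignored:

```c
#include <stdint.h>

/* Raw bus throughput in bytes per second: one bit per data line per clock.
 * Ignores command/response overhead, CRC, and card busy time, so real
 * file throughput will be lower. */
uint32_t bus_raw_bytes_per_sec(uint32_t clock_hz, uint32_t data_lines)
{
    return (uint32_t)(((uint64_t)clock_hz * data_lines) / 8U);
}
```

Calling it with (25000000, 4) models the 4-bit SDIO bus, and (10000000, 1) models SD over SPI.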
The FATFS middleware, included in STM32Cube, sits between your application and the block-level hardware driver. FATFS was written by ChaN and is a de-facto standard for embedded FAT/exFAT implementations. STM32CubeMX generates the necessary diskio.c glue layer that connects FATFS to the HAL SD driver.
CubeMX Setup
- In the Pinout & Configuration view, enable SDIO (F4) or SDMMC1 (F7/H7).
- Set Bus Width to 4 bits and the clock divider to achieve 25 MHz or lower.
- Under Middleware → FATFS, enable it and select SD Card as the physical drive.
- In FATFS configuration, set _USE_LFN to a non-zero value for long filename support (1 = static buffer, 2 = stack, 3 = heap), _VOLUMES to 1, and _FS_TINY to 0 for best performance.
- Enable DMA for SDIO/SDMMC (DMA2 Stream 3/6 for F4) to avoid CPU-blocking block transfers.
FATFS Core API
The essential FATFS functions your application will call:
f_mount(&fs, "0:", 1) — mount the filesystem on logical drive 0. The second argument is the path, the third forces immediate mounting.
f_open(&fil, "path/file.txt", mode) — open or create a file. Returns FR_OK on success.
f_write(&fil, buf, len, &bw) — write len bytes from buf; bw receives actual bytes written.
f_read(&fil, buf, len, &br) — read up to len bytes into buf.
f_close(&fil) — flush write buffers and release the file object.
f_mkdir("logs") — create a directory.
f_sync(&fil) — flush without closing, for periodic safety-flush during long logging sessions.
/* ================================================================
 * SD Card Init via FATFS + Write 10 sensor readings to CSV
 * Requires: FATFS middleware enabled, SDMMC DMA configured,
 *           hadc1 set up for software-triggered single conversions
 * ================================================================ */
#include "fatfs.h"
#include <stdio.h>
#include <string.h>

extern ADC_HandleTypeDef hadc1;   /* defined in the CubeMX-generated code */

FATFS SDFatFS;   /* FATFS work area for logical drive 0 */
FIL SDFile;      /* File object */
FRESULT fres;    /* FATFS return code */

void sd_write_sensor_data(void)
{
    char line[64];
    UINT bytes_written;

    /* Mount the filesystem; opt = 1 forces immediate card access */
    fres = f_mount(&SDFatFS, "0:", 1);
    if (fres != FR_OK) {
        printf("f_mount failed: %d\r\n", fres);
        return;
    }

    /* Create (or overwrite) data.csv on the SD card root */
    fres = f_open(&SDFile, "0:/data.csv", FA_CREATE_ALWAYS | FA_WRITE);
    if (fres != FR_OK) {
        printf("f_open failed: %d\r\n", fres);
        f_mount(NULL, "0:", 0);
        return;
    }

    /* Write CSV header */
    const char *header = "tick_ms,adc_raw,voltage_mV\r\n";
    fres = f_write(&SDFile, header, strlen(header), &bytes_written);
    if (fres != FR_OK || bytes_written != strlen(header)) {
        printf("Header write failed: %d\r\n", fres);
        f_close(&SDFile);
        f_mount(NULL, "0:", 0);
        return;
    }

    /* Append 10 sensor readings */
    for (int i = 0; i < 10; i++) {
        uint32_t tick = HAL_GetTick();

        /* Start one conversion and wait for the result */
        HAL_ADC_Start(&hadc1);
        if (HAL_ADC_PollForConversion(&hadc1, 10) != HAL_OK) {
            printf("ADC timeout at record %d\r\n", i);
            break;
        }
        uint32_t adc = HAL_ADC_GetValue(&hadc1);
        uint32_t mv  = (adc * 3300UL) / 4095UL;   /* 12-bit ADC, 3.3 V reference */

        snprintf(line, sizeof(line), "%lu,%lu,%lu\r\n",
                 (unsigned long)tick, (unsigned long)adc, (unsigned long)mv);
        fres = f_write(&SDFile, line, strlen(line), &bytes_written);
        if (fres != FR_OK || bytes_written != strlen(line)) {
            printf("Write error at record %d: %d\r\n", i, fres);
            break;
        }
        HAL_Delay(100);   /* 100 ms between samples */
    }

    /* Close file: flushes data and updates the directory entry */
    f_close(&SDFile);
    /* Unmount: good practice before power-off */
    f_mount(NULL, "0:", 0);
    printf("SD write complete\r\n");
}
DMA and DCACHE Warning (H7): On STM32H7, the D-cache is off at reset, but many CubeMX projects and examples enable it. DMA operates on physical memory and bypasses the cache. Any buffer used for DMA transfers must therefore either live in a region that the MPU marks as non-cacheable (and that the DMA master in use can actually reach), or be maintained explicitly: clean the cache before a DMA transmit and invalidate it after a DMA receive. Keep such buffers aligned to the 32-byte cache line. Failure to do this causes subtle data corruption that looks like random bit errors in your SD card files.
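Cache maintenance on the Cortex-M7 works on whole 32-byte lines, so the address range handed to the maintenance call has to be widened to line boundaries. The sketch below shows only that alignment arithmetic; `cache_align_range` and `cache_range_t` are hypothetical names, and the block runs on a host PC as well as on target.

```c
#include <stdint.h>

#define DCACHE_LINE_SIZE 32U   /* Cortex-M7 D-cache line size in bytes */

typedef struct {
    uintptr_t start;   /* address rounded down to a cache-line boundary */
    uint32_t  size;    /* length rounded up to whole cache lines */
} cache_range_t;

/* Widen [addr, addr + len) so that it covers complete cache lines. */
cache_range_t cache_align_range(uintptr_t addr, uint32_t len)
{
    const uintptr_t mask = (uintptr_t)(DCACHE_LINE_SIZE - 1U);
    cache_range_t r;
    uintptr_t end = (addr + len + mask) & ~mask;   /* round end up */

    r.start = addr & ~mask;                        /* round start down */
    r.size  = (uint32_t)(end - r.start);
    return r;
}
```

On target, the resulting start and size would typically be passed to the CMSIS functions SCB_CleanDCache_by_Addr() before a DMA transmit and SCB_InvalidateDCache_by_Addr() after a DMA receive. Because invalidating a partially owned line can discard neighbouring variables, it is safer to declare DMA buffers 32-byte aligned and sized in multiples of 32 bytes so that no widening is needed.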
FATFS File Operations
FATFS exposes a POSIX-like file API. Mastering the open flags, error codes, and random-access functions gives you the flexibility to implement anything from a simple data logger to a structured filesystem application.
Open Flags
The f_open mode argument is a bitfield of the following flags:
FA_READ: open for reading. The file must exist.
FA_WRITE: open for writing. Can be combined with FA_READ for read/write.
FA_CREATE_ALWAYS: create a new file; if it exists, truncate it to zero length.
FA_CREATE_NEW: create a new file; fail with FR_EXIST if it already exists.
FA_OPEN_EXISTING: open an existing file; fail if it does not exist (default if no create flag is given).
FA_OPEN_ALWAYS: open the file if it exists, otherwise create it.
FA_OPEN_APPEND: same as FA_OPEN_ALWAYS, but the read/write pointer is moved to the end of the file. Combined with FA_WRITE it behaves like fopen("a"), including creating a missing file.
Useful Utility Functions
f_printf(&fil, fmt, ...) — formatted write, like fprintf. Requires _USE_STRFUNC >= 1.
f_lseek(&fil, offset) — move file pointer to absolute byte offset. Use f_size(&fil) to seek to end.
f_stat("path", &fno) — get file/directory metadata (size, date, attributes).
f_getfree("0:", &fre_clust, &pfs) — query free clusters. Multiply by cluster size to get free bytes.
f_unlink("path") — delete a file or empty directory.
f_rename("old", "new") — rename/move a file.
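The cluster count returned by f_getfree() has to be scaled by the cluster size and the sector size, and the product easily exceeds 32 bits on modern cards. A minimal sketch of the conversion, with `fat_free_bytes` as a hypothetical helper; on target the second argument would come from the `csize` member of the FATFS object returned through `pfs`, and the third is 512 for SD cards:

```c
#include <stdint.h>

/* Convert a free-cluster count to bytes using 64-bit arithmetic,
 * because cards larger than 4 GB overflow a 32-bit result. */
uint64_t fat_free_bytes(uint32_t free_clusters,
                        uint32_t sectors_per_cluster,
                        uint32_t bytes_per_sector)
{
    return (uint64_t)free_clusters * sectors_per_cluster * bytes_per_sector;
}
```

For example, 100,000 free clusters of 64 sectors each on a 512-byte-sector card is 3,276,800,000 bytes, which is about 3.05 GiB.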
FATFS Error Codes
Always check the return value of every FATFS call. Production code must handle all failure modes gracefully:
| Error Code | Value | Meaning | Typical Cause & Response |
|---|---|---|---|
| FR_OK | 0 | Success | Operation completed normally |
| FR_DISK_ERR | 1 | Low-level disk I/O error | DMA error, CRC failure: retry once, then unmount |
| FR_INT_ERR | 2 | Internal assertion failed | FATFS work area corrupted: re-mount |
| FR_NOT_READY | 3 | Drive not ready | Card removed or not inserted: check CD pin |
| FR_NO_FILE | 4 | File not found | Wrong path: check directory and filename spelling |
| FR_NO_PATH | 5 | Path not found | Intermediate directory missing: call f_mkdir first |
| FR_DENIED | 7 | Access denied | Read-only file attribute, deleting a non-empty directory, or no room to create the entry (directory or volume full) |
| FR_WRITE_PROTECTED | 10 | Media is write-protected | Write-protect tab engaged |
| (no error code) | n/a | Volume full during a write | f_write returns FR_OK with bw smaller than the requested length: always compare bw with len, and call f_getfree before logging |
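Printing bare numbers such as "f_open failed: 7" makes field logs hard to read. The sketch below maps the numeric result to its name; `fresult_to_str` is a hypothetical helper, and the table follows the FRESULT ordering used by recent FatFs releases, so it should be checked against the ff.h shipped with your Cube package.

```c
/* Map a FatFs result code to its symbolic name for log output.
 * Takes a plain int so the sketch compiles without ff.h; on target,
 * pass the FRESULT value directly. */
const char *fresult_to_str(int code)
{
    static const char *const names[] = {
        "FR_OK",               "FR_DISK_ERR",       "FR_INT_ERR",
        "FR_NOT_READY",        "FR_NO_FILE",        "FR_NO_PATH",
        "FR_INVALID_NAME",     "FR_DENIED",         "FR_EXIST",
        "FR_INVALID_OBJECT",   "FR_WRITE_PROTECTED","FR_INVALID_DRIVE",
        "FR_NOT_ENABLED",      "FR_NO_FILESYSTEM",  "FR_MKFS_ABORTED",
        "FR_TIMEOUT",          "FR_LOCKED",         "FR_NOT_ENOUGH_CORE",
        "FR_TOO_MANY_OPEN_FILES", "FR_INVALID_PARAMETER"
    };
    const int count = (int)(sizeof(names) / sizeof(names[0]));

    if (code < 0 || code >= count) {
        return "FR_UNKNOWN";   /* value outside the table */
    }
    return names[code];
}
```

A call such as printf("f_open failed: %s\r\n", fresult_to_str(fres)) then produces a self-describing log line.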
/* ================================================================
 * Production Data Logger: RTC timestamp + ADC to CSV
 * Flushes every 10 records; handles FATFS errors with retry
 * ================================================================ */
#include "fatfs.h"
#include "rtc.h"
#include <stdio.h>
#include <string.h>

#define LOG_FLUSH_INTERVAL 10   /* f_sync every N records */
#define LOG_MAX_RETRIES    3

extern ADC_HandleTypeDef hadc1;   /* defined in the CubeMX-generated code */

static FATFS fs;
static FIL logfile;
static uint32_t record_count = 0;

static FRESULT logger_open(void)
{
    FRESULT fr;

    /* Create "logs" directory if not present */
    fr = f_mkdir("0:/logs");
    if (fr != FR_OK && fr != FR_EXIST) return fr;

    /* FA_OPEN_APPEND creates the file if it is missing and seeks to the end */
    fr = f_open(&logfile, "0:/logs/sensor.csv", FA_OPEN_APPEND | FA_WRITE);
    if (fr != FR_OK) return fr;

    /* An empty file is a new file: write the CSV header once */
    if (f_size(&logfile) == 0) {
        if (f_printf(&logfile,
                     "date,time,adc_raw,voltage_mV,temperature_C\r\n") < 0) {
            fr = FR_DISK_ERR;
        }
    }
    return fr;
}

void logger_append_record(void)
{
    RTC_TimeTypeDef t;
    RTC_DateTypeDef d;
    UINT bw = 0;
    char line[80];
    FRESULT fr;

    /* GetTime must be followed by GetDate to unlock the shadow registers */
    HAL_RTC_GetTime(&hrtc, &t, RTC_FORMAT_BIN);
    HAL_RTC_GetDate(&hrtc, &d, RTC_FORMAT_BIN);

    uint32_t adc = HAL_ADC_GetValue(&hadc1);
    uint32_t mv  = (adc * 3300UL) / 4095UL;
    /* Convert in floating point so readings below 760 mV do not wrap
     * around (assumes the F4 internal sensor: 760 mV at 25 C, 2.5 mV/C) */
    int32_t temp = (int32_t)(((float)mv - 760.0f) / 2.5f + 25.0f);

    snprintf(line, sizeof(line),
             "20%02d-%02d-%02d,%02d:%02d:%02d,%lu,%lu,%ld\r\n",
             d.Year, d.Month, d.Date,
             t.Hours, t.Minutes, t.Seconds,
             (unsigned long)adc, (unsigned long)mv, (long)temp);
    UINT len = (UINT)strlen(line);

    for (int attempt = 0; ; attempt++) {
        fr = f_write(&logfile, line, len, &bw);
        if (fr == FR_OK && bw == len) break;   /* record stored */

        /* FR_OK with a short write means the volume is full: do not retry */
        if (fr == FR_OK || attempt >= LOG_MAX_RETRIES) {
            printf("Log write failed: fr=%d bw=%u\r\n", fr, bw);
            return;                            /* record_count unchanged */
        }

        /* Re-mount and reopen; if this fails, the next f_write fails
         * too and the attempt counter ends the loop */
        f_close(&logfile);
        f_mount(NULL, "0:", 0);
        HAL_Delay(20);
        if (f_mount(&fs, "0:", 1) == FR_OK) {
            (void)logger_open();
        }
    }

    record_count++;
    if (record_count % LOG_FLUSH_INTERVAL == 0) {
        f_sync(&logfile);   /* Safety flush: limits loss on power failure */
    }
}
QSPI NOR Flash
The QUADSPI peripheral on STM32F4/F7/H7 provides a 4-bit SPI interface that delivers up to 4x the throughput of standard SPI at the same clock frequency. With an 84 MHz QUADSPI clock and quad I/O mode, the peak read bandwidth is 84 MHz × 4 bits / 8 = 42 MB/s before command and address overhead, which is sufficient to execute code directly from the flash chip.
The Winbond W25Q128JV is a 16 MB (128 Mbit) NOR flash device in SOIC-8 or WSON-8 package. It is pin-compatible with the entire W25Q family (W25Q16 through W25Q256) and supports all standard SPI commands plus an extended quad I/O mode. Key specifications: 3.3 V supply, 133 MHz max clock, 256-byte page program, 4 KB sector erase, 100,000 erase cycles per sector, data retention 20 years.
Critical W25Q128JV Command Bytes
0x06: Write Enable (WREN). Must precede every program or erase command.
0x04: Write Disable (WRDI). Rarely needed, because the flash clears the write-enable latch by itself when a program or erase operation completes.
0x05: Read Status Register 1 (RDSR1). Bit 0 = BUSY; poll until clear after a program or erase.
0x35: Read Status Register 2. Bit 1 = QE (Quad Enable), which must be set for quad mode.
0x31: Write Status Register 2, used to set the QE bit (enable quad I/O).
0x9F: Read JEDEC ID: returns manufacturer (0xEF), memory type (0x40), capacity (0x18 for 16 MB).
0xEB: Fast Read Quad I/O (1-4-4 mode): instruction on one line, address and data on all 4 pins.
0x02: Page Program (1-1-1): up to 256 bytes per call, all within one 256-byte page.
0x20: Sector Erase (4 KB): sets all bits in the addressed sector to 1.
0xD8: Block Erase (64 KB): typically takes a few hundred milliseconds; check the datasheet for the worst case (on the order of seconds).
0xC7: Chip Erase: erases the entire device (~40 seconds for 16 MB).
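The three JEDEC ID bytes can be unpacked into something more useful than a magic number. The sketch below assumes the common convention, followed by the W25Q family, that the third byte is the base-2 logarithm of the capacity in bytes; other vendors may encode it differently. `jedec_decode` and `jedec_info_t` are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    uint8_t  manufacturer;     /* 0xEF = Winbond */
    uint8_t  memory_type;      /* 0x40 for the W25Q128JV variant described here */
    uint32_t capacity_bytes;   /* 0 if the capacity code is out of range */
} jedec_info_t;

/* Split a 24-bit JEDEC ID (manufacturer << 16 | type << 8 | capacity). */
jedec_info_t jedec_decode(uint32_t id)
{
    jedec_info_t info;
    uint8_t cap_code = (uint8_t)(id & 0xFFU);

    info.manufacturer   = (uint8_t)((id >> 16) & 0xFFU);
    info.memory_type    = (uint8_t)((id >> 8) & 0xFFU);
    /* Assumed encoding: capacity = 2^code bytes (0x18 -> 16 MB) */
    info.capacity_bytes = (cap_code < 32U) ? ((uint32_t)1U << cap_code) : 0U;
    return info;
}
```

Checking the decoded capacity against the FlashSize value programmed into the QUADSPI peripheral at start-up is a cheap way to catch a wrong part fitted on the board.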
/* ================================================================
* W25Q128JV QSPI Driver — Read JEDEC, Sector Erase,
* Page Program, and Quad Fast Read
* ================================================================ */
#define W25Q_CMD_JEDEC_ID 0x9F
#define W25Q_CMD_WRITE_EN 0x06
#define W25Q_CMD_READ_SR1 0x05
#define W25Q_CMD_SECTOR_ERASE 0x20
#define W25Q_CMD_PAGE_PROG 0x02
#define W25Q_CMD_FAST_READ_QIO 0xEB
#define W25Q_TIMEOUT_MS 5000
/* Read JEDEC ID: returns 0x00EF4018 for W25Q128JV, or 0 on a bus error */
uint32_t qspi_read_jedec(void)
{
    QSPI_CommandTypeDef cmd = {0};
    uint8_t buf[3] = {0};

    cmd.InstructionMode = QSPI_INSTRUCTION_1_LINE;
    cmd.Instruction     = W25Q_CMD_JEDEC_ID;
    cmd.AddressMode     = QSPI_ADDRESS_NONE;
    cmd.DataMode        = QSPI_DATA_1_LINE;
    cmd.NbData          = 3;
    cmd.DummyCycles     = 0;
    if (HAL_QSPI_Command(&hqspi, &cmd, W25Q_TIMEOUT_MS) != HAL_OK ||
        HAL_QSPI_Receive(&hqspi, buf, W25Q_TIMEOUT_MS) != HAL_OK) {
        return 0;
    }
    return ((uint32_t)buf[0] << 16) |
           ((uint32_t)buf[1] << 8)  |
            (uint32_t)buf[2];
}

/* Poll BUSY bit in Status Register 1 until clear */
static HAL_StatusTypeDef qspi_wait_idle(uint32_t timeout_ms)
{
    QSPI_CommandTypeDef cmd = {0};
    QSPI_AutoPollingTypeDef cfg = {0};

    cmd.InstructionMode = QSPI_INSTRUCTION_1_LINE;
    cmd.Instruction     = W25Q_CMD_READ_SR1;
    cmd.DataMode        = QSPI_DATA_1_LINE;
    cfg.Match           = 0x00;   /* BUSY = 0 */
    cfg.Mask            = 0x01;   /* Check bit 0 only */
    cfg.MatchMode       = QSPI_MATCH_MODE_AND;
    cfg.Interval        = 0x10;
    cfg.AutomaticStop   = QSPI_AUTOMATIC_STOP_ENABLE;
    cfg.StatusBytesSize = 1;
    return HAL_QSPI_AutoPolling(&hqspi, &cmd, &cfg, timeout_ms);
}

/* Write Enable: must be sent before every erase or program */
static HAL_StatusTypeDef qspi_write_enable(void)
{
    QSPI_CommandTypeDef cmd = {0};

    cmd.InstructionMode = QSPI_INSTRUCTION_1_LINE;
    cmd.Instruction     = W25Q_CMD_WRITE_EN;
    return HAL_QSPI_Command(&hqspi, &cmd, W25Q_TIMEOUT_MS);
}

/* Erase one 4 KB sector containing the given 24-bit address */
HAL_StatusTypeDef qspi_sector_erase(uint32_t addr)
{
    QSPI_CommandTypeDef cmd = {0};

    if (qspi_write_enable() != HAL_OK) return HAL_ERROR;
    cmd.InstructionMode = QSPI_INSTRUCTION_1_LINE;
    cmd.Instruction     = W25Q_CMD_SECTOR_ERASE;
    cmd.AddressMode     = QSPI_ADDRESS_1_LINE;
    cmd.AddressSize     = QSPI_ADDRESS_24_BITS;
    cmd.Address         = addr & ~0xFFFUL;   /* Align to 4 KB */
    cmd.DataMode        = QSPI_DATA_NONE;
    if (HAL_QSPI_Command(&hqspi, &cmd, W25Q_TIMEOUT_MS) != HAL_OK) {
        return HAL_ERROR;
    }
    return qspi_wait_idle(W25Q_TIMEOUT_MS);   /* Typically tens of ms */
}

/* Program up to 256 bytes at addr (target bytes must be erased first).
 * The data must not cross a 256-byte page boundary, otherwise the flash
 * wraps around and overwrites the start of the same page. */
HAL_StatusTypeDef qspi_page_program(uint32_t addr,
                                    const uint8_t *data,
                                    uint16_t len)
{
    QSPI_CommandTypeDef cmd = {0};

    if (len == 0 || len > 256) return HAL_ERROR;
    if ((addr & 0xFFU) + len > 256U) return HAL_ERROR;   /* crosses a page */
    if (qspi_write_enable() != HAL_OK) return HAL_ERROR;
    cmd.InstructionMode = QSPI_INSTRUCTION_1_LINE;
    cmd.Instruction     = W25Q_CMD_PAGE_PROG;
    cmd.AddressMode     = QSPI_ADDRESS_1_LINE;
    cmd.AddressSize     = QSPI_ADDRESS_24_BITS;
    cmd.Address         = addr;
    cmd.DataMode        = QSPI_DATA_1_LINE;
    cmd.NbData          = len;
    if (HAL_QSPI_Command(&hqspi, &cmd, W25Q_TIMEOUT_MS) != HAL_OK ||
        HAL_QSPI_Transmit(&hqspi, (uint8_t *)data, W25Q_TIMEOUT_MS) != HAL_OK) {
        return HAL_ERROR;
    }
    return qspi_wait_idle(W25Q_TIMEOUT_MS);   /* Typically under 1 ms */
}

/* Fast Read Quad I/O (0xEB, 1-4-4). Requires the QE bit to be set.
 * After the address the flash expects 8 mode bits (sent here as the
 * alternate byte on 4 lines) followed by 4 dummy clocks. */
HAL_StatusTypeDef qspi_read_data(uint32_t addr,
                                 uint8_t *buf,
                                 uint32_t len)
{
    QSPI_CommandTypeDef cmd = {0};

    cmd.InstructionMode    = QSPI_INSTRUCTION_1_LINE;
    cmd.Instruction        = W25Q_CMD_FAST_READ_QIO;
    cmd.AddressMode        = QSPI_ADDRESS_4_LINES;
    cmd.AddressSize        = QSPI_ADDRESS_24_BITS;
    cmd.Address            = addr;
    cmd.AlternateByteMode  = QSPI_ALTERNATE_BYTES_4_LINES;
    cmd.AlternateBytesSize = QSPI_ALTERNATE_BYTES_8_BITS;
    cmd.AlternateBytes     = 0xFF;   /* Mode bits: continuous read off */
    cmd.DummyCycles        = 4;
    cmd.DataMode           = QSPI_DATA_4_LINES;
    cmd.NbData             = len;
    if (HAL_QSPI_Command(&hqspi, &cmd, W25Q_TIMEOUT_MS) != HAL_OK) {
        return HAL_ERROR;
    }
    return HAL_QSPI_Receive(&hqspi, buf, W25Q_TIMEOUT_MS);
}
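Because a page program wraps around inside its 256-byte page, a write of arbitrary length has to be cut at every page boundary. The sketch below shows one way to do the splitting; `nor_write`, `page_prog_fn`, and the `mock_prog` stand-in are hypothetical names, and on target the callback would be a thin wrapper around qspi_page_program() that returns 0 for HAL_OK.

```c
#include <stddef.h>
#include <stdint.h>

#define NOR_PAGE_SIZE 256U

/* Callback that programs at most one page; returns 0 on success. */
typedef int (*page_prog_fn)(uint32_t addr, const uint8_t *data, uint16_t len);

/* Write len bytes starting at addr, never crossing a page in one call. */
int nor_write(uint32_t addr, const uint8_t *data, size_t len, page_prog_fn prog)
{
    while (len > 0) {
        uint32_t room  = NOR_PAGE_SIZE - (addr % NOR_PAGE_SIZE);
        uint16_t chunk = (uint16_t)((len < room) ? len : room);
        int rc = prog(addr, data, chunk);
        if (rc != 0) {
            return rc;          /* stop at the first failed page */
        }
        addr += chunk;
        data += chunk;
        len  -= chunk;
    }
    return 0;
}

/* Host-side stand-in that records each chunk instead of touching flash. */
static uint32_t mock_addr[8];
static uint16_t mock_len[8];
static unsigned mock_calls;

static int mock_prog(uint32_t addr, const uint8_t *data, uint16_t len)
{
    (void)data;
    if (mock_calls < 8U) {
        mock_addr[mock_calls] = addr;
        mock_len[mock_calls]  = len;
    }
    mock_calls++;
    return 0;
}
```

Writing 600 bytes at address 200 through the stand-in produces four chunks: 56 bytes to finish the first page, two full pages, and a 32-byte tail.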
Memory-Mapped (XIP) Mode
One of the most powerful features of the STM32 QUADSPI peripheral is memory-mapped mode, also called XIP (execute-in-place). In this mode, the CPU accesses the QSPI flash as if it were normal read-only memory, starting at address 0x90000000. On H7 parts that use OCTOSPI instead of QUADSPI, a second controller is typically mapped at 0x70000000; check the memory map in the reference manual for your device. The QUADSPI hardware transparently issues read commands to the flash chip whenever the CPU (or DMA) reads from that address range.
Practical applications of XIP mode:
- Large lookup tables: Sine/cosine tables for motor control, gamma correction tables for displays, dithering matrices for audio DACs.
- Font and graphics assets: Bitmap fonts, icons, UI images — particularly useful for colour TFT display projects where the internal flash would be entirely consumed by assets.
- Audio samples: WAV or raw PCM data for tone generation or speech synthesis.
- Executable code: The linker can place entire functions or modules in a .text_qspi section and execute them directly from the flash, effectively giving you 16 MB of addressable program memory. Be aware that every uncached fetch has to go through a serial command and address phase, so latency is much higher than for internal flash. Avoid calling such functions from ISRs where deterministic latency is required.
/* ================================================================
 * Enable QUADSPI Memory-Mapped Mode (XIP)
 * Maps W25Q128JV to 0x90000000 on STM32F4/F7
 * Note: indirect-mode calls (erase, program, qspi_read_data) do not
 * work while memory-mapped mode is active; call HAL_QSPI_Abort() first
 * ================================================================ */
#include <stdio.h>

#define QSPI_XIP_BASE 0x90000000UL

HAL_StatusTypeDef qspi_enable_memory_mapped(void)
{
    QSPI_CommandTypeDef cmd = {0};
    QSPI_MemoryMappedTypeDef cfg = {0};

    /* Configure the read command used by the memory-mapped engine */
    cmd.InstructionMode    = QSPI_INSTRUCTION_1_LINE;
    cmd.Instruction        = 0xEB;   /* Fast Read Quad I/O */
    cmd.AddressMode        = QSPI_ADDRESS_4_LINES;
    cmd.AddressSize        = QSPI_ADDRESS_24_BITS;
    cmd.AlternateByteMode  = QSPI_ALTERNATE_BYTES_4_LINES;
    cmd.AlternateBytes     = 0xFF;   /* Mode bits: continuous read off */
    cmd.AlternateBytesSize = QSPI_ALTERNATE_BYTES_8_BITS;
    cmd.DummyCycles        = 4;
    cmd.DataMode           = QSPI_DATA_4_LINES;
    cmd.DdrMode            = QSPI_DDR_MODE_DISABLE;
    /* With continuous read off, the flash expects the 0xEB opcode on
     * every access, so the instruction must be sent each time */
    cmd.SIOOMode           = QSPI_SIOO_INST_EVERY_CMD;

    /* No timeout: memory-mapped mode runs indefinitely */
    cfg.TimeOutActivation = QSPI_TIMEOUT_COUNTER_DISABLE;

    HAL_StatusTypeDef ret = HAL_QSPI_MemoryMapped(&hqspi, &cmd, &cfg);
    if (ret != HAL_OK) {
        printf("XIP enable failed: %d\r\n", ret);
        return ret;
    }
    printf("XIP active, flash mapped at 0x%08lX\r\n",
           (unsigned long)QSPI_XIP_BASE);
    return HAL_OK;
}

/* Example: read 1 KB of data through the XIP memory map */
void xip_read_demo(void)
{
    /* After qspi_enable_memory_mapped(), just dereference the pointer */
    const uint8_t *flash_ptr = (const uint8_t *)QSPI_XIP_BASE;
    uint32_t checksum = 0;

    for (uint32_t i = 0; i < 1024; i++) {
        checksum += flash_ptr[i];   /* CPU reads, QSPI fetches automatically */
    }
    printf("XIP 1 KB checksum: 0x%08lX\r\n", (unsigned long)checksum);

    /* Function pointer example: call a function stored in QSPI flash
     * (function must be placed in .text_qspi section by linker) */
    typedef void (*qspi_func_t)(void);
    qspi_func_t qspi_entry =
        (qspi_func_t)(QSPI_XIP_BASE | 0x00000001UL);   /* Thumb bit */
    (void)qspi_entry;
    /* qspi_entry(); -- uncomment if you have code flashed there */
}
XIP and DCache on H7: On STM32H7, if the D-cache is enabled, stale cache entries can survive an update of the flash content. Either configure the MPU to mark the QSPI region (0x90000000–0x9FFFFFFF) as Normal, Non-Cacheable, or invalidate the affected D-cache range after every erase or program. The Cortex-M7 has no hardware cache coherency, so the same applies to code: after reprogramming functions that run from QSPI, invalidate the I-cache (for example with SCB_InvalidateICache()) before executing them again.
Wear-Levelling for Data Logging
NOR flash endurance is finite: the W25Q128JV guarantees 100,000 erase cycles per 4 KB sector. For a data logger that erases and rewrites a single sector every second, that sector would reach end-of-life after 100,000 seconds, or about 28 hours. The solution is circular (round-robin) logging: distribute writes across many sectors so that each sector is erased only once per pass around the ring.
With 128 sectors (512 KB log area), each sector experiences only 1/128 of the total erase count. The ring can absorb 128 × 4096 bytes × 100,000 cycles, roughly 52 GB, before any sector reaches its rated limit. A 100 Hz logger writing 128-byte records (12.8 KB/s) uses that budget in about 47 days, while a 1 Hz logger lasts about 13 years. If the product lifetime demands more, enlarge the ring: using all 4096 sectors of the 16 MB device extends the 100 Hz case to roughly 4 years.
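Ring lifetime is simple arithmetic: total bytes the ring can absorb divided by the write rate. A minimal sketch, where `ring_lifetime_seconds` is a hypothetical helper and the model assumes that sectors are filled completely before being erased and that wear is spread perfectly evenly:

```c
#include <stdint.h>

/* Seconds until every sector in the ring has used its rated erase cycles.
 * Model: each pass around the ring writes sectors * sector_bytes bytes
 * and costs exactly one erase per sector. */
double ring_lifetime_seconds(uint32_t sectors,
                             uint32_t sector_bytes,
                             uint32_t erase_cycles,
                             double   bytes_per_second)
{
    double total_bytes = (double)sectors * (double)sector_bytes
                       * (double)erase_cycles;
    return total_bytes / bytes_per_second;
}
```

For a 128-sector ring of 4 KB sectors rated at 100,000 cycles and a 100 Hz logger writing 128-byte records (12,800 bytes per second), the function returns 4,096,000 seconds, which is about 47 days. Sector headers and unused tail bytes reduce the real figure slightly.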
Sector Header Design
Each sector in the ring carries a small header that allows the firmware to find the current write position after power-cycling, without scanning every byte of flash:
- Magic word (4 bytes): e.g., 0xDEADC0DE. A blank (erased) sector reads 0xFFFFFFFF, so it is instantly distinguishable.
- Sequence number (4 bytes): Monotonically increasing 32-bit counter. The sector with the highest sequence number in the ring is the most recently written one.
- Record count (2 bytes): How many records are currently stored in this sector.
- CRC (2 bytes): CRC-16 over the header, which detects a header that was only partially written because power was lost mid-write.
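The article does not name a CRC-16 variant for the header, so the sketch below assumes CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF, no reflection), which is a common choice on small MCUs. `crc16_ccitt` is a hypothetical helper name; a bitwise implementation is shown because the header is only a few bytes long.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, MSB first, no final XOR. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFU;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000U) {
                crc = (uint16_t)((crc << 1) ^ 0x1021U);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

For the sector header, the CRC would be computed over the bytes that precede the CRC field (magic, sequence number, record count) and stored in the last two bytes. On start-up, a header whose stored CRC does not match is treated as a torn write and the sector is skipped.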
/* ================================================================
 * Circular QSPI Log: wear-levelled logging with sequence numbers
 * Log area: 128 sectors × 4 KB = 512 KB starting at QSPI addr 0
 * Record size: 64 bytes (fixed); the header occupies one record slot
 * Call circular_log_find_latest() once at start-up before writing
 * ================================================================ */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define LOG_SECTOR_SIZE   4096U
#define LOG_SECTOR_COUNT  128U
#define LOG_RECORD_SIZE   64U
#define LOG_MAGIC         0xDEADC0DEUL
/* The 12 header bytes are padded to one 64-byte slot so that every record
 * is 64-byte aligned and never straddles a 256-byte program page */
#define LOG_HEADER_SIZE   LOG_RECORD_SIZE
#define LOG_RECS_PER_SEC  ((LOG_SECTOR_SIZE - LOG_HEADER_SIZE) / LOG_RECORD_SIZE)

typedef struct {
    uint32_t magic;
    uint32_t sequence;
    uint16_t record_count;   /* left erased (0xFFFF): NOR cannot count up in place */
    uint16_t header_crc;
} LogSectorHeader;

static uint32_t current_sector = 0;
static uint32_t current_seq    = 0;
static uint16_t current_rec    = 0;

/* Return 1 if the 64-byte slot at addr is still erased (all 0xFF).
 * A record consisting only of 0xFF bytes would be mistaken for blank. */
static int log_slot_is_blank(uint32_t addr)
{
    uint8_t buf[LOG_RECORD_SIZE];

    if (qspi_read_data(addr, buf, sizeof(buf)) != HAL_OK) return 0;
    for (uint32_t i = 0; i < sizeof(buf); i++) {
        if (buf[i] != 0xFF) return 0;
    }
    return 1;
}

/* Find the sector with the highest sequence number (power-on resume) */
uint32_t circular_log_find_latest(void)
{
    uint32_t best_seq = 0;
    uint32_t best_sector = 0;
    int found = 0;
    LogSectorHeader hdr;

    for (uint32_t s = 0; s < LOG_SECTOR_COUNT; s++) {
        uint32_t addr = s * LOG_SECTOR_SIZE;
        if (qspi_read_data(addr, (uint8_t *)&hdr, sizeof(hdr)) != HAL_OK) continue;
        if (hdr.magic == LOG_MAGIC && (!found || hdr.sequence > best_seq)) {
            best_seq = hdr.sequence;
            best_sector = s;
            found = 1;
        }
    }

    if (!found) {
        /* Blank ring: make the first write erase and start sector 0 */
        current_sector = LOG_SECTOR_COUNT - 1;
        current_seq    = 0;
        current_rec    = LOG_RECS_PER_SEC;
        return 0;
    }

    /* Recover the write position by finding the first blank record slot */
    uint32_t base = best_sector * LOG_SECTOR_SIZE + LOG_HEADER_SIZE;
    current_sector = best_sector;
    current_seq    = best_seq;
    current_rec    = 0;
    while (current_rec < LOG_RECS_PER_SEC &&
           !log_slot_is_blank(base + current_rec * LOG_RECORD_SIZE)) {
        current_rec++;
    }
    printf("Resume: sector=%lu seq=%lu rec=%u\r\n",
           (unsigned long)best_sector, (unsigned long)best_seq, current_rec);
    return best_sector;
}

/* Write one 64-byte record to the circular log */
HAL_StatusTypeDef circular_log_write(const uint8_t *record)
{
    /* If current sector is full, advance to next sector */
    if (current_rec >= LOG_RECS_PER_SEC) {
        current_sector = (current_sector + 1) % LOG_SECTOR_COUNT;
        current_seq++;
        current_rec = 0;

        /* Erase new sector */
        uint32_t erase_addr = current_sector * LOG_SECTOR_SIZE;
        if (qspi_sector_erase(erase_addr) != HAL_OK) {
            return HAL_ERROR;
        }

        /* Write sector header */
        LogSectorHeader hdr = {
            .magic        = LOG_MAGIC,
            .sequence     = current_seq,
            .record_count = 0xFFFF,   /* unused, see struct comment */
            .header_crc   = 0         /* TODO: compute CRC-16 */
        };
        if (qspi_page_program(erase_addr, (uint8_t *)&hdr, sizeof(hdr)) != HAL_OK) {
            return HAL_ERROR;
        }
    }

    /* Calculate address of this record within the sector */
    uint32_t rec_addr = (current_sector * LOG_SECTOR_SIZE)
                      + LOG_HEADER_SIZE
                      + (current_rec * LOG_RECORD_SIZE);

    /* Program the record */
    HAL_StatusTypeDef ret = qspi_page_program(rec_addr, record, LOG_RECORD_SIZE);
    if (ret != HAL_OK) return ret;

    /* NOR flash bits can only be cleared (1 to 0), so a stored count
     * cannot be incremented without erasing the sector. The count lives
     * in RAM and is rebuilt at start-up by scanning for the first blank
     * slot in circular_log_find_latest(). */
    current_rec++;
    return HAL_OK;
}
FatFS on QSPI Flash
For applications that need a proper filesystem on QSPI NOR flash — rather than the custom circular log above — there are two practical options:
LittleFS (Recommended for NOR Flash)
LittleFS is an open-source embedded filesystem, originally developed at Arm for Mbed OS, that is designed for raw flash such as NOR. It provides:
- Built-in wear levelling: Dynamic wear-levelling distributes erasures automatically without any application-level sector management.
- Power-loss resilience: Copy-on-write (COW) metadata updates guarantee filesystem consistency even on sudden power loss.
- Small footprint: a few KB of RAM, configurable block size, works with 4 KB NOR sectors.
- Simple port: Provide four callbacks (read, prog, erase, sync) that wrap your QSPI driver, plus the flash geometry. That is the entire HAL layer.
The port fills in the following members of the lfs_config structure. The callbacks receive a block number and an offset rather than a byte address, so each one is a thin wrapper that converts the pair to a flash address before calling the driver:
lfs_cfg.read → wrapper around qspi_read_data()
lfs_cfg.prog → wrapper around qspi_page_program(), looping page by page when more than 256 bytes are requested
lfs_cfg.erase → wrapper around qspi_sector_erase()
lfs_cfg.sync → a no-op for NOR flash (no write buffer to flush)
lfs_cfg.read_size / prog_size / block_size = 256 / 256 / 4096 (block_count, cache_size, lookahead_size, and block_cycles must also be set)
FATFS Diskio Layer on QSPI
If you need FAT compatibility (files readable by a PC after extraction, or compatibility with existing FATFS-based code), you can implement the FATFS diskio layer on top of your QSPI driver. The key consideration is that FATFS does not understand NOR flash erase semantics — a sector write always requires erase-then-program. The disk_write() implementation must:
- Read the 4 KB NOR sector into a RAM buffer.
- Modify the 512-byte FATFS logical sector within that buffer.
- Erase the 4 KB NOR sector.
- Re-program all 4 KB from the RAM buffer.
This read-modify-erase-write cycle requires 4 KB of RAM for the sector buffer and costs one 4 KB erase for every 512-byte FATFS sector write, so filling one NOR sector takes up to eight erases. FAT and directory sectors are rewritten most often and wear out first. For small NOR flash devices used as configuration storage (not bulk logging), this is acceptable. For high-write applications, LittleFS or the custom circular log is preferable.
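The four steps above can be sketched on a host PC with a RAM array standing in for the NOR device. Everything here is hypothetical scaffolding (`sim_flash`, `sim_erase`, `sim_program`, `nor_write_fat_sector`); on target, the simulated calls would be replaced by the QSPI driver functions and the routine would be called from disk_write() once per 512-byte sector.

```c
#include <stdint.h>
#include <string.h>

#define NOR_SECTOR_SIZE 4096U
#define FAT_SECTOR_SIZE 512U
#define SIM_SECTORS     4U

static uint8_t  sim_flash[SIM_SECTORS * NOR_SECTOR_SIZE];  /* fake NOR array */
static unsigned sim_erase_count;                           /* wear indicator */

static void sim_erase(uint32_t addr)
{
    memset(&sim_flash[addr], 0xFF, NOR_SECTOR_SIZE);
    sim_erase_count++;
}

static void sim_program(uint32_t addr, const uint8_t *src, uint32_t len)
{
    for (uint32_t i = 0; i < len; i++) {
        sim_flash[addr + i] &= src[i];   /* NOR programming only clears bits */
    }
}

/* Write one 512-byte logical sector using read-modify-erase-write. */
int nor_write_fat_sector(uint32_t lba, const uint8_t *data)
{
    static uint8_t sector_buf[NOR_SECTOR_SIZE];
    uint32_t byte_addr = lba * FAT_SECTOR_SIZE;
    uint32_t base      = byte_addr & ~(NOR_SECTOR_SIZE - 1U);
    uint32_t offset    = byte_addr - base;

    if (base + NOR_SECTOR_SIZE > sizeof(sim_flash)) {
        return -1;                                        /* out of range */
    }
    memcpy(sector_buf, &sim_flash[base], NOR_SECTOR_SIZE);   /* 1. read    */
    memcpy(&sector_buf[offset], data, FAT_SECTOR_SIZE);      /* 2. modify  */
    sim_erase(base);                                         /* 3. erase   */
    sim_program(base, sector_buf, NOR_SECTOR_SIZE);          /* 4. program */
    return 0;
}
```

Three logical writes that land in the same NOR sector cost three erases, which is the erase amplification that makes this approach a poor fit for frequent writes. A power loss between steps 3 and 4 also loses the other seven logical sectors, something a production port would need to address.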
Choosing the Right Approach: Use SD card + FATFS for large files and PC interoperability. Use LittleFS on QSPI for wear-safe embedded storage. Use the circular log for high-throughput, fixed-size records. Use FATFS on QSPI only when you specifically need FAT compatibility on NOR flash and writes are infrequent.
Exercises
Exercise 1
Beginner
Mount an SD Card and Read/Write a File
Mount a FAT32 microSD card using SDMMC (4-bit mode). Create a file called test.txt, write the string "Hello SD Card!", close the file, re-open it for reading, read it back, and print the contents over UART. Use f_getfree() to display the available space in megabytes. Verify the file is visible on a PC when the card is removed and inserted into a card reader.
FATFS
SDMMC
f_open
f_getfree
Exercise 2
Intermediate
Triggered High-Rate Datalogger
Build a triggered datalogger. On the first button press, start recording ADC readings at 10 kHz to an SD card file (adc_log.bin), using DMA-driven ADC and a double-buffer scheme so that writing to the SD card does not interrupt the ADC stream. On the second button press, stop recording and close the file. Compute and store a CRC32 checksum of the entire file in the last 4 bytes. Verify the file and checksum integrity with a Python script on a PC.
DMA Double Buffer
ADC 10 kHz
CRC32
Python Verification
Exercise 3
Advanced
Power-Loss–Safe Circular QSPI Log
Connect a W25Q128JV QSPI flash. Implement the wear-levelled circular log from Section 6, extended with: (a) CRC-16 over the sector header to detect partial writes, (b) a sector sequence that wraps correctly at sector 127 → 0, and (c) a verified power-loss test: power-cycle the STM32 mid-write (pull VDD while writing sector header), then verify on restart that no records are missing and the next write continues from the correct position. Log sensor data + RTC timestamp in each 64-byte record. Verify 10,000 records written and read back correctly with zero corruption.
QSPI
Wear Levelling
CRC-16
Power-Loss Safe
Conclusion & Next Steps
In this article we have built a complete toolkit for STM32 external storage:
- Storage landscape: The seven major external storage options — microSD (SDMMC/SPI), SPI NOR flash, QSPI NOR flash, OctoSPI flash, SPI NAND, and I2C EEPROM — were compared across capacity, interface speed, write/erase granularity, endurance, XIP capability, and cost. Choose based on your application's read/write pattern, not just capacity.
- SDMMC + FATFS: The SDMMC peripheral with 4-bit DMA gives ~12 MB/s throughput. FATFS provides a robust FAT32/exFAT layer with a POSIX-like API. The combination is ideal for field-deployable data loggers where SD cards are removed for PC-side analysis.
- FATFS file operations: Mastering open flags, error codes, f_sync() for power-loss safety, and f_getfree() for capacity monitoring transforms your code from a demo into a production-quality logger.
- QSPI NOR flash driver: The HAL_QSPI_Command / HAL_QSPI_Transmit / HAL_QSPI_Receive trio drives the complete W25Q128JV command set. Auto-polling for the BUSY bit replaces hand-written status-read loops.
- XIP memory-mapped mode: HAL_QSPI_MemoryMapped() maps 16 MB of NOR flash to the CPU's address space, enabling direct pointer access to lookup tables, font data, audio assets, and even executable code without explicit read commands.
- Wear-levelled circular log: Distributing erase cycles across N sectors via a sequence-number ring multiplies NOR flash lifetime by N, from hours to months or years depending on ring size and write rate. Sector headers with magic words and sequence numbers enable a fast power-on resume that reads only the headers rather than the record data.
- Filesystem selection: LittleFS is the right choice for a proper filesystem on NOR flash. Built-in wear levelling, power-loss safety, and a compact footprint have made it a widely used option for embedded non-volatile storage.
Next in the Series
In Part 17: Ethernet & TCP/IP Stack, we will wire up an external Ethernet PHY (LAN8720A), integrate LwIP, configure DHCP, build a UDP telemetry sender, implement a TCP server and HTTP endpoint, and publish sensor data over MQTT — transforming your STM32 into a fully networked embedded node.