Series Context: This is Part 2 of the 20-part CMSIS Mastery Series. Part 1 established the ecosystem overview; here we go deep into the processor-level API that underpins every CMSIS project.
1. Overview & ARM Cortex-M Ecosystem (CMSIS layers, Cortex-M families, memory map, toolchains) [Completed]
2. CMSIS-Core: Registers, NVIC & SysTick (core_cmX.h, register access, interrupt controller, SysTick timer) [You Are Here]
3. Startup Code, Linker Scripts & Vector Table (Reset handler, BSS init, scatter files, boot process)
4. CMSIS-RTOS2: Threads, Mutexes & Semaphores (Thread management, synchronization primitives, scheduling)
5. CMSIS-RTOS2: Message Queues & Event Flags (Inter-thread comms, ISR-to-thread, real-time design patterns)
6. CMSIS-DSP: Filters, FFT & Math Functions (FIR/IIR filters, FFT, SIMD optimizations)
7. CMSIS-Driver: UART, SPI & I2C (Driver abstraction layer, callbacks, DMA integration)
8. CMSIS-Pack & Software Components (Pack files, device support, dependency management)
9. Debugging with CMSIS-DAP & CoreSight (SWD/JTAG, HardFault analysis, ITM tracing)
10. Portable Firmware: Multi-Vendor Projects (HAL vs CMSIS, cross-platform BSPs, reusable libraries)
11. Interrupts, Concurrency & Real-Time Constraints (Interrupt latency, critical sections, lock-free programming)
12. Memory Management in Embedded Systems (Static vs dynamic, heap fragmentation, memory pools)
13. Low Power & Energy Optimization (Sleep modes, clock gating, tickless RTOS, power profiling)
14. DMA & High-Performance Data Handling (DMA basics, peripheral transfers, zero-copy techniques)
15. Security: ARMv8-M & TrustZone (Secure/non-secure worlds, secure boot, firmware protection)
16. Bootloaders & Firmware Updates (OTA updates, dual-bank flash, fail-safe strategies)
17. Testing & Validation (Unity/Ceedling unit tests, HIL testing, integration testing)
18. Performance Optimization (Compiler flags, inline assembly, cache (M7/M33), profiling)
19. Embedded Software Architecture (Layered design, event-driven, state machines, component-based)
20. Tooling & Workflow (Professional Level) (CI/CD for embedded, MISRA, static analysis, Doxygen)
CMSIS-Core Structure
The CMSIS-Core processor API is header-based: there is no library to link. Every function you call is either a macro expansion or an __STATIC_INLINE function that an optimising compiler normally inlines at the call site, so there is typically no function-call overhead. (The device-specific parts, system_<device>.c and the startup file, are ordinary source files that you compile with your project.) The entry point for any project is including the device header, which pulls in the appropriate core header automatically.
The header hierarchy is well-defined: your device header (e.g., stm32f407xx.h) includes the processor-specific core header, which includes the compiler-specific intrinsics header. You never include core headers directly.
| Header File | Target Core(s) | ISA | Key Additions Over Previous |
|---|---|---|---|
| core_cm0.h | Cortex-M0 | ARMv6-M | Base NVIC (up to 32 IRQs), SysTick, SCB: minimal subset |
| core_cm0plus.h | Cortex-M0+ | ARMv6-M | Optional MPU, optional vector table offset register (VTOR) |
| core_cm3.h | Cortex-M3 | ARMv7-M | Full NVIC (up to 240 IRQs), DWT, ITM, FPB, bit-band, UDIV |
| core_cm4.h | Cortex-M4 / M4F | ARMv7E-M | DSP intrinsics (__QADD, __SMLAD), optional FPU registers |
| core_cm7.h | Cortex-M7 | ARMv7E-M | Optional double-precision FPU, L1 cache maintenance (SCB_CleanDCache) |
| core_cm23.h | Cortex-M23 | ARMv8-M Baseline | TrustZone SAU registers, IDAU, secure/non-secure NVIC banks |
| core_cm33.h | Cortex-M33 | ARMv8-M Mainline | Full TrustZone + DSP + optional FPU, MSPLIM/PSPLIM |
| core_cm55.h | Cortex-M55 | ARMv8.1-M Mainline | Helium (MVE) vector extension support; the core uses a 4-stage pipeline |
| core_cm85.h | Cortex-M85 | ARMv8.1-M Mainline | PACBTI (pointer authentication and branch target identification) |
Beyond the core-specific header, every CMSIS-Core installation includes three cross-cutting headers:
- cmsis_compiler.h: maps compiler-agnostic macros (__STATIC_INLINE, __PACKED, __WEAK) to GCC, ARMClang, and IAR equivalents.
- cmsis_version.h: exposes __CM_CMSIS_VERSION and __CM_CMSIS_VERSION_MAIN for compile-time version checks.
- mpu_armv7.h / mpu_armv8.h: MPU configuration helpers for the respective ISA generations.
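As a rough illustration of a compile-time version check, the sketch below gates a build on the CMSIS-Core version. It is written to compile on a host, so it supplies stand-ins for the cmsis_version.h macros (main version in the upper 16 bits, sub version in the lower 16). The version numbers, CMSIS_VERSION_PACK, and cmsis_version_at_least() are assumptions made up for this example, not part of CMSIS.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the macros from cmsis_version.h so this compiles on a host.
 * On target, delete these and include the device header instead.
 * The 5.6 values are placeholders, not a statement about any release. */
#ifndef __CM_CMSIS_VERSION
#define __CM_CMSIS_VERSION_MAIN (5U)
#define __CM_CMSIS_VERSION_SUB  (6U)
#define __CM_CMSIS_VERSION      ((__CM_CMSIS_VERSION_MAIN << 16U) | __CM_CMSIS_VERSION_SUB)
#endif

/* Hypothetical helper: pack a main/sub pair the same way the header does. */
#define CMSIS_VERSION_PACK(main_v, sub_v) \
    ((((uint32_t)(main_v)) << 16U) | (uint32_t)(sub_v))

/* Compile-time gate: refuse to build against a CMSIS-Core older than 5.0. */
#if __CM_CMSIS_VERSION < ((5U << 16U) | 0U)
#error "This sketch assumes CMSIS-Core 5.0 or newer"
#endif

/* Run-time form of the same comparison (hypothetical helper). */
static int cmsis_version_at_least(uint32_t main_v, uint32_t sub_v) {
    assert(sub_v <= 0xFFFFU);   /* sub version must fit in the low 16 bits */
    return __CM_CMSIS_VERSION >= CMSIS_VERSION_PACK(main_v, sub_v);
}
```

The preprocessor form is usually the one you want, since it fails the build before any incompatible code is compiled.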
Key Macros: __IO, __IM, __OM, and Memory Barriers
/*
* CMSIS-Core access qualifier macros (defined in each core_cmX.h)
*
* __IM volatile const : read-only register (Input)
* __OM volatile : write-only register (Output)
* __IOM volatile : read/write register
* __IO volatile : read/write, the older spelling still used by
* many vendor device headers
*
* The core register structs below use __IM/__OM/__IOM on their members;
* vendor peripheral structs generated from SVD files commonly use __IO.
*/
/* Example from core_cm4.h — SysTick register structure */
typedef struct {
__IOM uint32_t CTRL; /*!< SysTick Control and Status Register */
__IOM uint32_t LOAD; /*!< SysTick Reload Value Register */
__IOM uint32_t VAL; /*!< SysTick Current Value Register */
__IM uint32_t CALIB; /*!< SysTick Calibration Register */
} SysTick_Type;
/* __STATIC_INLINE (from cmsis_compiler.h) expands to 'static inline'
* ('static __inline' on Arm Compiler 5). Optimising compilers normally
* inline these helpers, so they cost no call overhead in release builds. */
__STATIC_INLINE uint32_t SysTick_Config(uint32_t ticks) {
if ((ticks - 1UL) > SysTick_LOAD_RELOAD_Msk) { return 1UL; }
SysTick->LOAD = (uint32_t)(ticks - 1UL);
NVIC_SetPriority(SysTick_IRQn, (1UL << __NVIC_PRIO_BITS) - 1UL);
SysTick->VAL = 0UL;
SysTick->CTRL = SysTick_CTRL_CLKSOURCE_Msk |
SysTick_CTRL_TICKINT_Msk |
SysTick_CTRL_ENABLE_Msk;
return 0UL;
}
/* Memory barrier intrinsics — critical after register writes */
__DSB(); /* Data Synchronisation Barrier: ensures all memory
accesses complete before the next instruction */
__ISB(); /* Instruction Synchronisation Barrier: flushes the
pipeline, required after changing VTOR or enabling
the FPU via CPACR */
__DMB(); /* Data Memory Barrier: ordering only, no completion */
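Why barriers matter: Cortex-M cores with write buffers (the Cortex-M7 in particular) can let execution continue before a store has actually reached its destination. After writing to a clock enable register (e.g., RCC->AHB1ENR), a __DSB() makes the core wait until that write has completed before the next instruction touches the peripheral's registers. Some vendors additionally recommend reading the enable register back, because the clock can take a cycle or two to reach the peripheral. Omitting this causes rare, hard-to-reproduce bugs, most often in optimised builds or when the core runs much faster than the peripheral bus.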
Register-Level Programming
CMSIS-Core gives you two layers of register access: the peripheral struct (generated from SVD files and included in device headers) and direct bit manipulation using the bitmask/position macros that accompany every struct field. The combination produces readable, portable, and compiler-optimisable code.
/*
* Three equivalent ways to set GPIOA pin 5 as output:
* The CMSIS struct approach is preferred for readability.
*/
/* Method 1: Raw address arithmetic (never do this in production) */
*((volatile uint32_t *)0x40020000) = (*((volatile uint32_t *)0x40020000)
& ~(0x3UL << 10)) | (0x1UL << 10);
/* Method 2: CMSIS struct + raw bit shift (acceptable) */
GPIOA->MODER = (GPIOA->MODER & ~(0x3UL << 10)) | (0x1UL << 10);
/* Method 3: CMSIS struct + named masks (preferred) */
GPIOA->MODER = (GPIOA->MODER & ~GPIO_MODER_MODER5_Msk)
| (GPIO_MODER_MODER5_0); /* Output mode = 0b01 */
/*
* CMSIS Peripheral Struct Layout — how it maps to hardware:
*
* The device header declares a struct whose fields are at fixed
* offsets matching the reference manual register layout.
* A macro then creates a pointer to the peripheral's base address.
*
* #define GPIOA ((GPIO_TypeDef *) GPIOA_BASE)
* GPIOA_BASE is 0x40020000 for STM32F4.
*
* So GPIOA->MODER is *(volatile uint32_t *)(0x40020000 + 0x00)
* And GPIOA->BSRR is *(volatile uint32_t *)(0x40020000 + 0x18)
*/
/* Reading a peripheral register safely */
uint32_t ahb1_enabled = RCC->AHB1ENR; /* single read — compiler
cannot cache this due to
__IO (volatile) qualifier */
/* Atomic set/clear using BSRR (Bit Set/Reset Register): one write,
so there is no read-modify-write race with an ISR */
GPIOA->BSRR = (1UL << 5); /* Set PA5: bits 15:0 set pins, zeros are ignored */
GPIOA->BSRR = (1UL << (5 + 16)); /* Reset PA5: bits 31:16 reset pins */
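The mask arithmetic behind the MODER and BSRR writes can be checked on a host before it ever touches hardware. The sketch below models it on plain integers; the helper names are hypothetical, and the 2-bits-per-pin MODER layout and the 16/16 BSRR split are the STM32F4 conventions used in the snippets above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side helpers: they compute register values only and
 * never dereference a hardware address. */

/* Return a new MODER value with the 2-bit mode field of `pin` replaced.
 * mode: 0 = input, 1 = output, 2 = alternate function, 3 = analog. */
static uint32_t moder_with_mode(uint32_t moder, uint32_t pin, uint32_t mode) {
    assert(pin < 16U && mode <= 3U);
    uint32_t shift = pin * 2U;                   /* two bits per pin */
    uint32_t mask  = (uint32_t)0x3U << shift;
    return (moder & ~mask) | (mode << shift);    /* clear field, then set it */
}

/* Word to write to BSRR to drive `pin` high (lower half-word). */
static uint32_t bsrr_set_word(uint32_t pin) {
    assert(pin < 16U);
    return (uint32_t)1U << pin;
}

/* Word to write to BSRR to drive `pin` low (upper half-word). */
static uint32_t bsrr_reset_word(uint32_t pin) {
    assert(pin < 16U);
    return (uint32_t)1U << (pin + 16U);
}
```

On target, the equivalent of Method 2 would then read GPIOA->MODER = moder_with_mode(GPIOA->MODER, 5U, 1U); the named-mask form from the device header remains the more readable choice.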
NVIC: Nested Vectored Interrupt Controller
The NVIC is the heart of real-time responsiveness in Cortex-M systems. It manages up to 240 external interrupts on ARMv7-M cores (up to 32 on ARMv6-M, up to 480 on ARMv8-M Mainline) alongside the system exceptions such as Reset, NMI, HardFault, and SysTick. Hardware priority arbitration, tail-chaining, and late-arrival optimisation keep interrupt latency low and predictable without software overhead.
Priority Grouping
The NVIC uses a single 8-bit priority register per interrupt, but only the top __NVIC_PRIO_BITS bits are implemented (typically 3–8 bits depending on the device). These bits are further split into preemption priority (high-order) and subpriority (low-order) by the priority grouping configuration in SCB->AIRCR.
| Group | AIRCR PRIGROUP | Preemption Bits | Subpriority Bits | Preemption Levels | Subpriority Levels |
|---|---|---|---|---|---|
| Group 0 | 0b111 | 0 | 4 (of 4 implemented) | 1 (no preemption) | 16 |
| Group 1 | 0b110 | 1 | 3 | 2 | 8 |
| Group 2 | 0b101 | 2 | 2 | 4 | 4 |
| Group 3 | 0b100 | 3 | 1 | 8 | 2 |
| Group 4 | 0b011 | 4 | 0 | 16 | 1 (no subpriority) |
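Priority Grouping Rule: FreeRTOS, like many RTOSes, expects all priority bits to be preemption bits, which is Group 4 on a device with 4 implemented bits. With any other grouping, the comparison against configMAX_SYSCALL_INTERRUPT_PRIORITY no longer lines up with preemption levels, critical sections stop behaving as intended, and the port's configASSERT() checks will trip if they are enabled. Always set priority grouping before initialising the RTOS.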
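To make the split concrete, here is a small host-side model of how a PRIGROUP field value divides the implemented bits and how a preemption/subpriority pair becomes the byte stored in the NVIC priority register. The arithmetic follows the same approach as the CMSIS NVIC_EncodePriority() helper, but the type and function names here are hypothetical, and prio_bits stands in for __NVIC_PRIO_BITS.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t preempt_bits;   /* bits used for preemption priority */
    uint32_t sub_bits;       /* bits used for subpriority */
} PrioSplit;                 /* hypothetical type for this sketch */

/* How a PRIGROUP value (0..7) splits `prio_bits` implemented bits. */
static PrioSplit prio_split(uint32_t prigroup, uint32_t prio_bits) {
    PrioSplit s;
    uint32_t g = prigroup & 0x7U;
    s.preempt_bits = ((7U - g) > prio_bits) ? prio_bits : (7U - g);
    s.sub_bits     = ((g + prio_bits) < 7U) ? 0U : (g + prio_bits - 7U);
    return s;
}

/* Combine preemption and subpriority into the value for NVIC_SetPriority(). */
static uint32_t prio_encode(uint32_t prigroup, uint32_t prio_bits,
                            uint32_t preempt, uint32_t sub) {
    PrioSplit s = prio_split(prigroup, prio_bits);
    return ((preempt & ((1U << s.preempt_bits) - 1U)) << s.sub_bits)
         | (sub & ((1U << s.sub_bits) - 1U));
}

/* Byte that lands in the priority register: the value sits in the TOP
 * prio_bits bits of the 8-bit field. */
static uint8_t prio_register_byte(uint32_t encoded, uint32_t prio_bits) {
    assert(prio_bits >= 1U && prio_bits <= 8U);
    return (uint8_t)((encoded << (8U - prio_bits)) & 0xFFU);
}
```

For example, with 4 implemented bits and PRIGROUP = 3 every bit is a preemption bit, so priority 6 is stored as 0x60; with PRIGROUP = 5 the same 4 bits split 2/2, and preemption 2 with subpriority 1 encodes to 9, stored as 0x90.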
/*
* NVIC Priority Grouping — complete configuration example
* Target: STM32F4 (Cortex-M4, 4 implemented priority bits = 16 levels)
*/
#include "stm32f407xx.h"
/* Set all 4 bits as preemption priority (FreeRTOS compatible).
* PRIGROUP = 0b011 = 3. The ST HAL names this value NVIC_PRIORITYGROUP_4,
* but that macro is defined in the HAL, not in the device header. */
NVIC_SetPriorityGrouping(3U);
/*
* NVIC_SetPriority / NVIC_GetPriority
* IRQn_Type is a signed enum: negative values are system exceptions,
* zero and positive values are device interrupts.
*/
/* Configure USART1 interrupt at preemption priority 6 (mid-range) */
NVIC_SetPriority(USART1_IRQn, 6U);
/* Configure TIM2 interrupt at preemption priority 5 (higher than USART) */
NVIC_SetPriority(TIM2_IRQn, 5U);
/* Enable the interrupts */
NVIC_EnableIRQ(USART1_IRQn);
NVIC_EnableIRQ(TIM2_IRQn);
/* Verify configuration */
uint32_t usart_prio = NVIC_GetPriority(USART1_IRQn); /* returns 6 */
/* Disable an interrupt */
NVIC_DisableIRQ(USART1_IRQn);
/* Software-trigger an interrupt (useful for testing) */
NVIC_SetPendingIRQ(TIM2_IRQn);
/* Clear a pending interrupt (e.g., after spurious trigger) */
NVIC_ClearPendingIRQ(TIM2_IRQn);
/* Check if an interrupt is currently active (executing its ISR) */
uint32_t active = NVIC_GetActive(TIM2_IRQn);
Save/Restore IRQ State — Critical Sections
/*
* Portable critical section using CMSIS primask intrinsics.
* __get_PRIMASK() / __set_PRIMASK() work on all Cortex-M variants.
* This pattern is safe for nested critical sections.
*/
uint32_t enter_critical(void) {
uint32_t primask_state = __get_PRIMASK();
__disable_irq(); /* sets PRIMASK = 1, blocks all maskable IRQs */
__DSB();
__ISB();
return primask_state; /* caller saves the previous state */
}
void exit_critical(uint32_t primask_state) {
__set_PRIMASK(primask_state); /* restores, not unconditionally enables */
}
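/* Usage (needs <string.h> for memcpy; g_shared_buffer and
* g_shared_buffer_len are the application's shared state) */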
void update_shared_buffer(uint8_t *data, uint32_t len) {
uint32_t state = enter_critical();
/* --- critical section: safe to modify shared data --- */
memcpy(g_shared_buffer, data, len);
g_shared_buffer_len = len;
/* ---------------------------------------------------- */
exit_critical(state);
}
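/*
* On ARMv7-M and ARMv8-M Mainline cores (M3/M4/M7/M33 and later) you also
* have FAULTMASK, which masks everything except NMI, and BASEPRI for
* selective IRQ masking. Neither register exists on M0/M0+/M23.
*/
__set_BASEPRI(configMAX_SYSCALL_INTERRUPT_PRIORITY);
/* In the FreeRTOSConfig.h files shipped with the FreeRTOS Cortex-M demos,
* this constant is already shifted into the implemented priority bits,
* so it must not be shifted again here.
* BASEPRI masks interrupts whose priority value is >= the value written.
* Interrupts at higher priority (lower number) still preempt.
* This is essentially what FreeRTOS taskENTER_CRITICAL() does internally,
* plus nesting bookkeeping. */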
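The BASEPRI rule is easy to get backwards because lower numbers mean higher urgency. The following host-side model is a simplified sketch of the comparison: it assumes all implemented bits are preemption bits and ignores PRIMASK and FAULTMASK. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shift a logical priority level into the implemented (top) bits of the
 * 8-bit priority field. prio_bits stands in for __NVIC_PRIO_BITS. */
static uint8_t prio_to_register(uint32_t level, uint32_t prio_bits) {
    assert(prio_bits >= 1U && prio_bits <= 8U);
    return (uint8_t)((level << (8U - prio_bits)) & 0xFFU);
}

/* True when an interrupt whose priority register holds `irq_prio` is
 * blocked by `basepri`. Writing 0 to BASEPRI disables the masking. */
static bool basepri_masks(uint8_t basepri, uint8_t irq_prio) {
    return (basepri != 0U) && (irq_prio >= basepri);
}
```

With 4 priority bits and a threshold level of 5 (register value 0x50), an interrupt at level 6 is masked while one at level 4 still preempts. A threshold of level 0 cannot be expressed, because BASEPRI = 0 means "no masking".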
SysTick Timer
SysTick is a 24-bit down-counter built into the Cortex-M core (it is architecturally optional on Cortex-M0/M0+/M23, but nearly every device implements it). It counts from LOAD down to zero, raises the SysTick exception (SysTick_IRQn = -1, exception number 15), reloads, and repeats. Because it is part of the processor rather than a vendor peripheral, SysTick_Config() and SysTick_Handler are identical across every Cortex-M MCU that has it.
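Because the counter is only 24 bits wide, not every tick rate is reachable at every clock speed. The sketch below reproduces the range check on a host so you can see where the limit falls; systick_reload_for() and SYSTICK_MAX_RELOAD are hypothetical names, and the clock figures in the note are only examples.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFUL   /* largest value a 24-bit LOAD can hold */

/* Hypothetical helper: compute the LOAD value for a desired tick rate.
 * Returns 0 and writes *reload on success, 1 if the value does not fit. */
static int systick_reload_for(uint32_t core_hz, uint32_t tick_hz,
                              uint32_t *reload) {
    assert(reload != 0);
    if (tick_hz == 0U) {
        return 1;
    }
    uint32_t ticks = core_hz / tick_hz;          /* core cycles per tick */
    if (ticks == 0U || (ticks - 1U) > SYSTICK_MAX_RELOAD) {
        return 1;                                /* out of 24-bit range */
    }
    *reload = ticks - 1U;                        /* counter counts N-1 .. 0 */
    return 0;
}
```

At a 168 MHz core clock a 1 ms tick needs LOAD = 167999, which fits comfortably, while a 100 ms tick would need 16799999 and is rejected; the longest period at that clock is just under 100 ms.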
Complete 1 ms Tick Generation
/*
* SysTick 1 ms time-base — complete implementation.
* SystemCoreClock is set by SystemInit() (called from startup code)
* and updated by SystemCoreClockUpdate() after any clock change.
*/
#include "stm32f407xx.h" /* or any Cortex-M device header */
/* Global tick counter — volatile because it is modified in an ISR */
static volatile uint32_t g_uwTick = 0U;
/**
* @brief Initialise SysTick for 1 ms interrupts.
* SysTick_Config() is a CMSIS-Core inline function.
* @retval 0 on success, 1 if ticks value is out of range.
*/
uint32_t HAL_InitTick(uint32_t TickPriority) {
/* Configure SysTick to generate interrupt every 1 ms */
if (SysTick_Config(SystemCoreClock / 1000U) != 0U) {
return 1U; /* Reload value exceeds 24-bit maximum */
}
/* Set SysTick interrupt priority (lowest by default) */
if (TickPriority < (1UL << __NVIC_PRIO_BITS)) {
NVIC_SetPriority(SysTick_IRQn, TickPriority);
}
return 0U;
}
/* SysTick exception handler — runs every 1 ms */
void SysTick_Handler(void) {
g_uwTick++;
/* If using FreeRTOS, also call xPortSysTickHandler() here */
}
/* Read current tick count */
uint32_t HAL_GetTick(void) {
return g_uwTick;
}
/* Suspend/resume tick for low-power modes */
void HAL_SuspendTick(void) {
SysTick->CTRL &= ~SysTick_CTRL_TICKINT_Msk;
}
void HAL_ResumeTick(void) {
SysTick->CTRL |= SysTick_CTRL_TICKINT_Msk;
}
Delay Implementations: Millisecond and Microsecond
/*
* Blocking millisecond delay using SysTick tick counter.
* Safe against uint32_t rollover (after ~49.7 days at 1 kHz).
*/
void delay_ms(uint32_t ms) {
uint32_t start = g_uwTick;
/* Subtraction is safe even if g_uwTick wraps around */
while ((g_uwTick - start) < ms) {
/* Optionally: __WFI() to enter sleep between ticks */
}
}
/*
* Microsecond delay using DWT (Data Watchpoint and Trace) cycle counter.
* DWT is available on Cortex-M3/M4/M7/M33 and above.
* Must enable DWT before use (done once at startup).
*/
void DWT_Init(void) {
/* Enable DWT by setting TRCENA in CoreDebug DEMCR */
CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;
DWT->CYCCNT = 0U; /* Reset cycle counter */
DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk; /* Enable cycle counter */
}
void delay_us(uint32_t us) {
uint32_t cycles = us * (SystemCoreClock / 1000000U);
uint32_t start = DWT->CYCCNT;
while ((DWT->CYCCNT - start) < cycles) {}
}
/*
* Periodic task using absolute tick — avoids drift accumulation.
* Pattern used by FreeRTOS vTaskDelayUntil() and osDelayUntil().
*/
void periodic_sensor_task(void) {
uint32_t next_tick = g_uwTick;
for (;;) {
next_tick += 10U; /* 10 ms period */
sample_sensors();
/* Wait until next_tick — subtracts time already spent in task */
while ((int32_t)(g_uwTick - next_tick) < 0) {}
}
}
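The three loops above all lean on the same modulo-2^32 arithmetic, which is worth testing in isolation. This host-side sketch uses hypothetical helper names and plain integers in place of g_uwTick and DWT->CYCCNT. It also shows one way to compute the cycle count with a 64-bit intermediate, an addition that is not in the delay_us() code above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Elapsed ticks between two samples of a free-running 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so one wrap is handled for free. */
static uint32_t ticks_elapsed(uint32_t now, uint32_t start) {
    return now - start;
}

/* True once `now` has reached or passed `deadline`. Valid while the two
 * are less than 2^31 ticks apart. The unsigned-to-signed conversion is
 * implementation-defined in C, but behaves as two's complement on the
 * compilers commonly used for Cortex-M. */
static bool deadline_reached(uint32_t now, uint32_t deadline) {
    return (int32_t)(now - deadline) >= 0;
}

/* Microseconds to core cycles with a 64-bit intermediate, saturating at
 * UINT32_MAX instead of silently wrapping for long delays. */
static uint32_t us_to_cycles(uint32_t us, uint32_t core_hz) {
    assert(core_hz > 0U);            /* SystemCoreClock must be initialised */
    uint64_t cycles = ((uint64_t)us * core_hz) / 1000000ULL;
    return (cycles > UINT32_MAX) ? UINT32_MAX : (uint32_t)cycles;
}
```

Note that a saturated result still cannot be waited for with a single 32-bit CYCCNT comparison; long delays are better served by the millisecond tick.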
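SysTick and RTOS: When using FreeRTOS, the port layer takes over SysTick_Handler (usually by mapping it to xPortSysTickHandler). Avoid calling SysTick_Config() before vTaskStartScheduler(): the scheduler configures SysTick itself, and a tick interrupt that reaches the FreeRTOS handler before the scheduler has started can crash the system. In tickless idle mode, FreeRTOS may reprogram LOAD dynamically to sleep for multiple ticks at once.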
Core Debug & Fault Registers
When your firmware faults, the Cortex-M processor automatically pushes an exception frame onto the active stack and vectors to a fault handler. The System Control Block (SCB) contains a set of status registers that describe what went wrong, which is invaluable for diagnosing HardFaults, BusFaults, MemManage faults, and UsageFaults in production firmware.
Reading CFSR, HFSR, and BFAR
/*
* SCB Fault Status Registers — definitions from core_cm4.h
*
* SCB->CFSR Configurable Fault Status Register (32-bit, 3 sub-registers)
* Bits 7:0 MMFSR (MemManage Fault Status, byte)
* Bits 15:8 BFSR — BusFault Status (byte)
* Bits 31:16 UFSR — UsageFault Status (halfword)
*
* SCB->HFSR HardFault Status Register
* Bit 1 VECTTBL — fault on vector table read
* Bit 30 FORCED — escalated from configurable fault
* Bit 31 DEBUGEVT — debug event (reserved in non-debug use)
*
* SCB->BFAR BusFault Address Register — valid when BFSR.BFARVALID set
* SCB->MMFAR MemManage Fault Address Register, valid when MMFSR.MMARVALID set
*/
typedef struct {
uint32_t cfsr; /* Configurable Fault Status Register */
uint32_t hfsr; /* HardFault Status Register */
uint32_t dfsr; /* Debug Fault Status Register */
uint32_t afsr; /* Auxiliary Fault Status Register */
uint32_t bfar; /* BusFault Address Register */
uint32_t mmfar; /* MemManage Fault Address Register */
uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr; /* stacked registers */
} FaultInfo_t;
static volatile FaultInfo_t g_fault_info;
void capture_fault_registers(void) {
g_fault_info.cfsr = SCB->CFSR;
g_fault_info.hfsr = SCB->HFSR;
g_fault_info.dfsr = SCB->DFSR;
g_fault_info.afsr = SCB->AFSR;
g_fault_info.bfar = SCB->BFAR;
g_fault_info.mmfar = SCB->MMFAR;
/* Clear sticky fault bits by writing 1 to them */
SCB->CFSR = SCB->CFSR;
SCB->HFSR = SCB->HFSR;
}
/* Decode CFSR field meanings */
void decode_cfsr(uint32_t cfsr) {
/* MemManage faults (bits 7:0) */
if (cfsr & SCB_CFSR_IACCVIOL_Msk) { /* Instruction access violation */ }
if (cfsr & SCB_CFSR_DACCVIOL_Msk) { /* Data access violation */ }
if (cfsr & SCB_CFSR_MMARVALID_Msk){ /* SCB->MMFAR holds valid addr */ }
/* BusFault (bits 15:8) */
if (cfsr & SCB_CFSR_IBUSERR_Msk) { /* Instruction bus error */ }
if (cfsr & SCB_CFSR_PRECISERR_Msk){ /* Precise data bus error */ }
if (cfsr & SCB_CFSR_IMPRECISERR_Msk){ /* Imprecise bus error */ }
if (cfsr & SCB_CFSR_BFARVALID_Msk){ /* SCB->BFAR holds valid addr */ }
/* UsageFault (bits 31:16) */
if (cfsr & SCB_CFSR_UNDEFINSTR_Msk){ /* Undefined instruction */ }
if (cfsr & SCB_CFSR_INVSTATE_Msk) { /* Invalid state: EPSR.T clear, e.g. branch to an even address */ }
if (cfsr & SCB_CFSR_DIVBYZERO_Msk){ /* Divide by zero (if enabled) */ }
if (cfsr & SCB_CFSR_UNALIGNED_Msk){ /* Unaligned access (if enabled)*/ }
}
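For logging, it often helps to split CFSR into its three sub-registers first and classify from there. The sketch below does that on a plain integer so it can be unit-tested on a host with captured CFSR values; the type, enum, and function names are hypothetical, and the bit positions follow the layout described in the comment block above.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t  mmfsr;   /* CFSR bits 7:0   MemManage status */
    uint8_t  bfsr;    /* CFSR bits 15:8  BusFault status */
    uint16_t ufsr;    /* CFSR bits 31:16 UsageFault status */
} CfsrParts;          /* hypothetical type for this sketch */

typedef enum { FAULT_NONE, FAULT_MEMMANAGE, FAULT_BUS, FAULT_USAGE } FaultClass;

/* Split a raw CFSR value into its three sub-registers. */
static CfsrParts cfsr_split(uint32_t cfsr) {
    CfsrParts p;
    p.mmfsr = (uint8_t)(cfsr & 0xFFU);
    p.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFU);
    p.ufsr  = (uint16_t)((cfsr >> 16) & 0xFFFFU);
    return p;
}

/* Report which sub-register has bits set. If several do, this sketch
 * reports UsageFault first, then BusFault, then MemManage; that order
 * is an arbitrary choice for illustration. */
static FaultClass cfsr_fault_class(uint32_t cfsr) {
    CfsrParts p = cfsr_split(cfsr);
    if (p.ufsr  != 0U) { return FAULT_USAGE; }
    if (p.bfsr  != 0U) { return FAULT_BUS; }
    if (p.mmfsr != 0U) { return FAULT_MEMMANAGE; }
    return FAULT_NONE;
}

/* BFAR is only meaningful when BFARVALID (CFSR bit 15) is set. */
static bool cfsr_bfar_valid(uint32_t cfsr) {
    return (cfsr & (1UL << 15)) != 0U;
}
```

A captured value of 0x00008200, for instance, decodes to a precise bus error with a valid BFAR, and 0x02000000 to a divide-by-zero UsageFault.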
HardFault Handler with UART Logging
/*
* HardFault handler that extracts the stacked register frame
* and logs fault information over UART before halting.
*
* The __attribute__((naked)) prevents the compiler from generating
* a function prologue/epilogue that would clobber MSP/PSP.
*/
/* Assembly trampoline — extracts correct stack pointer */
__attribute__((naked)) void HardFault_Handler(void) {
__asm volatile (
"tst lr, #4 \n" /* Test EXC_RETURN bit 2 */
"ite eq \n" /* If EQ: came from MSP, else PSP */
"mrseq r0, msp \n"
"mrsne r0, psp \n"
"ldr r1, =HardFault_Handler_C \n"
"bx r1 \n"
);
}
/* C handler receives pointer to the stacked exception frame */
void HardFault_Handler_C(uint32_t *stacked_args) {
volatile uint32_t stacked_r0 = stacked_args[0];
volatile uint32_t stacked_r1 = stacked_args[1];
volatile uint32_t stacked_r2 = stacked_args[2];
volatile uint32_t stacked_r3 = stacked_args[3];
volatile uint32_t stacked_r12 = stacked_args[4];
volatile uint32_t stacked_lr = stacked_args[5];
volatile uint32_t stacked_pc = stacked_args[6];
volatile uint32_t stacked_xpsr= stacked_args[7];
/* Log to UART (assuming UART already initialised) */
uart_printf("[HARDFAULT] PC=0x%08X LR=0x%08X\r\n",
stacked_pc, stacked_lr);
uart_printf("[HARDFAULT] CFSR=0x%08X HFSR=0x%08X\r\n",
SCB->CFSR, SCB->HFSR);
if (SCB->CFSR & SCB_CFSR_BFARVALID_Msk) {
uart_printf("[HARDFAULT] BFAR=0x%08X (bus fault address)\r\n",
SCB->BFAR);
}
if (SCB->CFSR & SCB_CFSR_MMARVALID_Msk) {
uart_printf("[HARDFAULT] MMFAR=0x%08X (memmanage fault addr)\r\n",
SCB->MMFAR);
}
/* Store in retention RAM if available, then reset or halt */
/* BKPT with no debugger attached escalates to lockup, so check first */
if (CoreDebug->DHCSR & CoreDebug_DHCSR_C_DEBUGEN_Msk) { __BKPT(0); }
for (;;) { }
}
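The trampoline only looks at bit 2 of EXC_RETURN (the value in LR on exception entry), but two neighbouring bits are also useful when interpreting the frame. Here is a host-testable sketch of that decoding for ARMv7-M cores (M3/M4/M7); ARMv8-M defines further bits that it ignores. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool uses_psp;      /* bit 2: 1 = frame was pushed on PSP, 0 = on MSP */
    bool thread_mode;   /* bit 3: 1 = fault interrupted Thread mode */
    bool basic_frame;   /* bit 4: 1 = 8-word frame, 0 = extended FP frame */
} ExcReturnInfo;        /* hypothetical type for this sketch */

/* Decode the ARMv7-M EXC_RETURN value found in LR on exception entry. */
static ExcReturnInfo exc_return_decode(uint32_t exc_return) {
    /* A genuine EXC_RETURN value always has 0xFF in its top byte. */
    assert((exc_return & 0xFF000000UL) == 0xFF000000UL);
    ExcReturnInfo info;
    info.uses_psp    = (exc_return & (1UL << 2)) != 0U;
    info.thread_mode = (exc_return & (1UL << 3)) != 0U;
    info.basic_frame = (exc_return & (1UL << 4)) != 0U;
    return info;
}

/* Frame size in 32-bit words, excluding any alignment padding word:
 * 8 for the basic frame (r0-r3, r12, lr, pc, xpsr), 26 when the FPU
 * context (s0-s15, FPSCR, one reserved word) was stacked as well. */
static uint32_t stacked_frame_words(uint32_t exc_return) {
    return exc_return_decode(exc_return).basic_frame ? 8U : 26U;
}
```

The indices used in HardFault_Handler_C (0 to 7) hold for both frame types, since the FPU words are stacked above the basic eight.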
Exercises
Exercise 1
Beginner
Enable/Disable an IRQ with Save & Restore
Implement a reentrant IRQSave(IRQn_Type irq) / IRQRestore(IRQn_Type irq, uint32_t saved) pair that saves whether the IRQ was enabled before disabling it, then correctly re-enables it only if it was enabled before. Test with nested calls to verify correctness. Use NVIC_GetEnableIRQ() to inspect state.
Tags: NVIC, Critical Sections, Reentrancy
Exercise 2
Intermediate
SysTick-Based Microsecond Delay
Implement a delay_us(uint32_t us) function using only the SysTick current value register (SysTick->VAL) and LOAD register — without using DWT. Account for the reload event (counter wrapping from 0 to LOAD). Validate accuracy by toggling a GPIO and measuring the period on an oscilloscope or logic analyser.
Tags: SysTick, Timing, Counter Wrap
Exercise 3
Advanced
HardFault Handler with UART Fault Report
Deliberately trigger three different fault types: (a) null pointer dereference (MemManage/BusFault), (b) unaligned access (UsageFault — enable via SCB->CCR |= SCB_CCR_UNALIGN_TRP_Msk), (c) divide-by-zero (enable via SCB->CCR |= SCB_CCR_DIV_0_TRP_Msk). Your HardFault handler must identify each type from CFSR and transmit the fault address (BFAR/MMFAR) over UART before halting.
Tags: HardFault, SCB->CFSR, Fault Analysis, UART Debug
Conclusion & Next Steps
In this article we have thoroughly covered the processor-level CMSIS-Core API:
- The core_cmX.h header hierarchy, from core_cm0.h through core_cm85.h, gives you type-safe, low-overhead access to every processor register via __IO/__IM/__OM qualifiers and __STATIC_INLINE functions.
- Memory barrier intrinsics (__DSB(), __ISB(), __DMB()) are not optional extras: use them after clock enables, after changing VTOR or CPACR, and wherever the completion or ordering of a register write matters, so the code stays correct across all Cortex-M pipeline configurations.
- The NVIC priority grouping must be configured before any interrupt is enabled — especially before starting an RTOS. Group 4 (all bits as preemption) is the safest choice for FreeRTOS-based projects.
- SysTick_Config() gives you a portable 1 ms time-base in a single call; the DWT cycle counter extends this to microsecond-accurate delays on M3/M4/M7/M33.
- The SCB fault registers (CFSR, HFSR, BFAR, MMFAR) encode exactly what went wrong — a proper HardFault handler reads and logs these before halting, transforming cryptic crashes into actionable diagnostics.
Next in the Series
In Part 3: Startup Code, Linker Scripts & Vector Table, we'll step back even further — to the very first instruction the CPU executes after reset. We'll trace the complete path from power-on through Reset_Handler, .data copy, BSS zero-fill, SystemInit(), and finally main(), and learn exactly how GCC linker scripts control every byte's placement in flash and SRAM.