
CMSIS Part 10: Portable Firmware — Multi-Vendor CMSIS Projects

March 31, 2026 Wasil Zafar 26 min read

How to structure firmware so the same codebase targets multiple MCU families — the HAL-vs-CMSIS decision, conditional compilation patterns, and building board support packages that survive hardware revisions.

Table of Contents

  1. Portability Principles
  2. HAL vs CMSIS Trade-offs
  3. Cross-Platform Firmware Patterns
  4. Conditional Compilation
  5. Reusable BSP Design
  6. Exercises
  7. BSP Design Planner
  8. Conclusion & Next Steps
Series Context: This is Part 10 of 20 in the CMSIS Mastery Series. We have now covered the complete CMSIS toolkit — Core, RTOS2, DSP, drivers, packs, and debugging. Part 10 steps back to the architecture level: how do you design firmware that can run on MCUs from four different vendors without a complete rewrite? The answer lies in disciplined layering, strategic use of CMSIS, and a build system that makes platform selection explicit.

Portability Principles

Embedded firmware portability is not an academic goal — it is a commercial reality. Hardware revisions happen. Component shortages force MCU substitutions. A new product line needs 80% of the same functionality on a different silicon family. Without a planned portability strategy, each of these events becomes a costly re-engineering exercise. With one, they become a CMake variable change and a targeted porting effort bounded to the BSP layer.

Why Firmware Portability Matters

The economics are straightforward. An application layer that directly calls HAL_GPIO_WritePin(GPIOA, GPIO_PIN_5, GPIO_PIN_SET) is welded to ST's HAL API and to one specific port and pin. Porting to NXP requires changing every call site. An application layer that calls BSP_LED_Set(LED_STATUS, true) is board-agnostic: porting requires only a new BSP implementation file. The interface is fixed; the implementation varies.

The Layered Firmware Model

Professional portable firmware follows a strict three-layer model. The Application Layer contains all business logic — state machines, algorithms, protocol handling — and calls only BSP and middleware APIs. It has zero knowledge of silicon. The BSP Layer (Board Support Package) wraps platform-specific peripheral calls behind a stable C interface. The Platform Layer contains CMSIS headers, startup code, and either direct register access or a thin vendor HAL. The Platform Layer changes with each MCU; the Application Layer never does.
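
The three layers can be sketched in a single host-compilable file. This is an illustration only: platform_pin_write(), app_report_fault(), platform_debug_read_outputs() and the pin numbers are hypothetical stand-ins, and the "register" is a plain variable so the example runs on a PC rather than on a target.

```c
#include <stdbool.h>
#include <stdint.h>

/* ---- Platform layer: hypothetical stand-in for real register access ---- */
static uint32_t fake_output_register;   /* pretend GPIO output data register */

static void platform_pin_write(uint32_t pin, bool high) {
    if (high) {
        fake_output_register |= ((uint32_t)1U << pin);
    } else {
        fake_output_register &= ~((uint32_t)1U << pin);
    }
}

/* ---- BSP layer: stable interface; all board knowledge lives here ---- */
typedef enum { LED_STATUS = 0, LED_FAULT, BSP_LED_COUNT } bsp_led_id_t;

/* Assumed pin numbers, chosen only for this sketch */
static const uint32_t led_pin[BSP_LED_COUNT] = { 5U, 14U };

void BSP_LED_Set(bsp_led_id_t led, bool on) {
    platform_pin_write(led_pin[led], on);
}

/* ---- Application layer: knows LEDs by role, never by port or pin ---- */
void app_report_fault(bool fault_active) {
    BSP_LED_Set(LED_FAULT, fault_active);
    BSP_LED_Set(LED_STATUS, !fault_active);
}

/* Debug hook so the sketch can be checked on a host machine */
uint32_t platform_debug_read_outputs(void) {
    return fake_output_register;
}
```

Porting this sketch to another board would touch only platform_pin_write() and the led_pin table; app_report_fault() would compile unchanged.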

HAL vs CMSIS Trade-offs

There is no universally correct answer to "should I use a vendor HAL or go CMSIS-only?" The right choice depends on your project's portability requirements, schedule constraints, performance budget, and maintainability needs. Understanding the trade-offs lets you make an informed decision rather than defaulting to whichever approach you learned first.

Portability Approach Comparison

| Approach | Implementation Effort | Portability | Code Size / Overhead | Maintainability | Best For |
|---|---|---|---|---|---|
| Vendor HAL only | Low (SDK provides everything) | Poor: tightly coupled to one vendor | High: HALs include generous safety checks and abstraction overhead | Good: vendor-maintained, well documented | Rapid prototyping, single-platform products |
| CMSIS-Core only | High (write all peripheral drivers) | Excellent for processor-level APIs; poor for peripherals | Minimal: direct register access, zero overhead | Requires deep hardware knowledge; brittle to silicon revisions | Safety-critical systems, bootloaders, ultra-low footprint |
| Custom HAL over CMSIS-Driver | Medium (implement CMSIS-Driver interface) | Excellent: application talks to stable CMSIS-Driver API | Low: CMSIS-Driver is a thin callback-based interface | Good: CMSIS-Driver interface is versioned and stable | Products targeting multiple MCU families, middleware integration |
| C++ CRTP HAL | Medium to High (template design) | Excellent: zero-cost abstraction via static dispatch | Zero runtime overhead (inlined at compile time) | Requires C++ knowledge; excellent long-term maintainability | Performance-critical portable libraries, modern embedded C++ projects |

Designing a HAL Interface in C

A C hardware abstraction layer uses function pointers to achieve runtime polymorphism — the same technique used internally by CMSIS-Driver. The interface is a struct of function pointers; each platform provides a static instance that fills in the pointers with its platform-specific implementations. The application holds a pointer to the interface struct and calls through it without knowing which implementation is active.

/**
 * Portable GPIO HAL — function-pointer-based interface in C.
 * The application uses only hal_gpio_t; platform code fills the pointers.
 *
 * header: bsp/hal_gpio.h
 */
#ifndef HAL_GPIO_H
#define HAL_GPIO_H

#include <stdint.h>
#include <stdbool.h>

/** Opaque pin handle — platform defines what this actually is */
typedef struct hal_gpio_pin_t hal_gpio_pin_t;

/** GPIO direction */
typedef enum {
    HAL_GPIO_INPUT  = 0,
    HAL_GPIO_OUTPUT = 1
} hal_gpio_dir_t;

/** GPIO HAL interface — a vtable of function pointers */
typedef struct {
    /** Initialise a GPIO pin */
    void (*init)(hal_gpio_pin_t *pin, hal_gpio_dir_t dir);
    /** Set the output level (true = high, false = low) */
    void (*set)(hal_gpio_pin_t *pin, bool level);
    /** Toggle the output level */
    void (*toggle)(hal_gpio_pin_t *pin);
    /** Read the current input level */
    bool (*read)(const hal_gpio_pin_t *pin);
} hal_gpio_t;

#endif /* HAL_GPIO_H */

/* ─────────────────────────────────────────────────────────────────────────── */

/**
 * STM32F4 implementation of hal_gpio_t
 * file: bsp/stm32f4/bsp_gpio_stm32f4.c
 */
#include "hal_gpio.h"
#include "stm32f4xx.h"   /* CMSIS device header */

/* Concrete pin definition for STM32F4: completes the opaque type declared
 * in hal_gpio.h (no second typedef, which C99 would reject) */
struct hal_gpio_pin_t {
    GPIO_TypeDef *port;
    uint32_t      pin_mask;   /* (1UL << pin_pos), e.g. 1UL << 5 for PA5 */
    uint32_t      pin_pos;    /* bit position, 0..15                      */
};

static void stm32_gpio_init(hal_gpio_pin_t *pin, hal_gpio_dir_t dir) {
    /* Enable AHB1 clock for the GPIO port — port address encodes the index */
    uint32_t port_idx = ((uint32_t)pin->port - GPIOA_BASE) / 0x400UL;
    RCC->AHB1ENR |= (1UL << port_idx);
    __DSB();
    /* Set MODER bits: 0b01 = output, 0b00 = input */
    pin->port->MODER &= ~(0x3UL << (pin->pin_pos * 2U));
    if (dir == HAL_GPIO_OUTPUT) {
        pin->port->MODER |= (0x1UL << (pin->pin_pos * 2U));
    }
}

static void stm32_gpio_set(hal_gpio_pin_t *pin, bool level) {
    /* BSRR: lower 16 bits = set, upper 16 bits = reset */
    if (level) {
        pin->port->BSRR = (1UL << pin->pin_pos);          /* Set   */
    } else {
        pin->port->BSRR = (1UL << (pin->pin_pos + 16U));  /* Reset */
    }
}

static void stm32_gpio_toggle(hal_gpio_pin_t *pin) {
    pin->port->ODR ^= (1UL << pin->pin_pos);
}

static bool stm32_gpio_read(const hal_gpio_pin_t *pin) {
    return (pin->port->IDR & (1UL << pin->pin_pos)) != 0U;
}

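/* Static vtable instance exported by the BSP. The symbol name must match the
 * extern declaration in the application (HAL_GPIO_IMPL): every board's BSP
 * defines the same name and the build links exactly one of them. */
const hal_gpio_t HAL_GPIO_IMPL = {
    .init   = stm32_gpio_init,
    .set    = stm32_gpio_set,
    .toggle = stm32_gpio_toggle,
    .read   = stm32_gpio_read,
};

/* Status LED pin, also exported to the application (PA5 used as an example) */
hal_gpio_pin_t BSP_LED_STATUS_PIN = { GPIOA, (1UL << 5), 5U };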

/* ─────────────────────────────────────────────────────────────────────────── */

/**
 * Application code — zero vendor dependency.
 * Works on any platform that provides a hal_gpio_t implementation.
 * file: app/main.c
 */
#include "hal_gpio.h"
#include "cmsis_os2.h"  /* CMSIS-RTOS2 — also portable */

/* Forward declarations — defined in the active BSP */
extern const hal_gpio_t     HAL_GPIO_IMPL;
extern       hal_gpio_pin_t BSP_LED_STATUS_PIN;

void led_blink_task(void *arg) {
    (void)arg;
    HAL_GPIO_IMPL.init(&BSP_LED_STATUS_PIN, HAL_GPIO_OUTPUT);
    for (;;) {
        HAL_GPIO_IMPL.toggle(&BSP_LED_STATUS_PIN);
        osDelay(500U);  /* 500 ms — CMSIS-RTOS2 API, also portable */
    }
}
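
Because the application only ever sees hal_gpio_t, the same logic can be exercised on a development PC with a mock backend. The sketch below repeats the interface so it compiles on its own; HAL_GPIO_MOCK, the mock pin layout, and blink_n_times() are hypothetical names introduced for illustration, not part of any vendor SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interface repeated from hal_gpio.h so the sketch is self-contained */
typedef struct hal_gpio_pin_t hal_gpio_pin_t;
typedef enum { HAL_GPIO_INPUT = 0, HAL_GPIO_OUTPUT = 1 } hal_gpio_dir_t;
typedef struct {
    void (*init)(hal_gpio_pin_t *pin, hal_gpio_dir_t dir);
    void (*set)(hal_gpio_pin_t *pin, bool level);
    void (*toggle)(hal_gpio_pin_t *pin);
    bool (*read)(const hal_gpio_pin_t *pin);
} hal_gpio_t;

/* Hypothetical mock backend: a "pin" is just recorded state */
struct hal_gpio_pin_t {
    hal_gpio_dir_t dir;
    bool           level;
    unsigned       toggle_count;
};

static void mock_init(hal_gpio_pin_t *pin, hal_gpio_dir_t dir) {
    pin->dir = dir;
    pin->level = false;
    pin->toggle_count = 0U;
}
static void mock_set(hal_gpio_pin_t *pin, bool level) { pin->level = level; }
static void mock_toggle(hal_gpio_pin_t *pin) {
    pin->level = !pin->level;
    pin->toggle_count++;
}
static bool mock_read(const hal_gpio_pin_t *pin) { return pin->level; }

const hal_gpio_t HAL_GPIO_MOCK = {
    .init = mock_init, .set = mock_set, .toggle = mock_toggle, .read = mock_read,
};

/* Application logic under test: it receives the interface, not a platform */
void blink_n_times(const hal_gpio_t *gpio, hal_gpio_pin_t *pin, unsigned n) {
    gpio->init(pin, HAL_GPIO_OUTPUT);
    for (unsigned i = 0U; i < n; i++) {
        gpio->toggle(pin);
    }
}
```

Calling blink_n_times(&HAL_GPIO_MOCK, &pin, 3U) leaves the mock pin with three recorded toggles and a high level, which a host-side unit test can assert without any hardware attached.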

Cross-Platform Firmware Patterns

The key to successful cross-platform firmware is identifying exactly which lines of code change between targets and isolating them behind a stable interface. In practice, the delta between MCU families affects only four areas: the startup file and linker script (handled by CMSIS-Pack); the clock initialisation sequence; the peripheral register initialisation; and the pin-to-peripheral mapping. Everything above the BSP layer — RTOS task logic, communication protocols, application state machines — is genuinely portable.

Abstract GPIO Toggle on Two MCU Families

/**
 * Same application code running LED blink on STM32F4 AND NXP LPC55S69
 * via the abstract BSP interface. The only file that changes between
 * targets is the BSP implementation (bsp_led_stm32f4.c or
 * bsp_led_lpc55s69.c), selected per board at build time.
 *
 * BSP interface header: bsp/bsp_led.h  (shared, board-independent)
 */

/* ── bsp/bsp_led.h ──────────────────────────────────────────── */
#ifndef BSP_LED_H
#define BSP_LED_H
#include <stdbool.h>

typedef enum { LED_GREEN = 0, LED_RED, LED_BLUE, BSP_LED_COUNT } bsp_led_id_t;

void BSP_LED_Init(void);
void BSP_LED_Set(bsp_led_id_t led, bool on);
void BSP_LED_Toggle(bsp_led_id_t led);
#endif

/* ── app/main.c — IDENTICAL for both targets ────────────────── */
#include "bsp_led.h"
#include "cmsis_os2.h"

void app_init(void) { BSP_LED_Init(); }

void app_led_task(void *arg) {
    (void)arg;
    for (;;) {
        BSP_LED_Toggle(LED_GREEN);
        osDelay(500U);
        BSP_LED_Toggle(LED_RED);
        osDelay(250U);
    }
}

/* ── bsp/stm32f4/bsp_led_stm32f4.c — STM32F4 implementation ─── */
#ifdef BOARD_STM32F407_DISCO
#include "bsp_led.h"
#include "stm32f407xx.h"

/* STM32F407 Discovery user LEDs: PD12 = Green, PD14 = Red, PD15 = Blue (all on GPIOD) */
static const struct { GPIO_TypeDef *port; uint8_t pin; } leds[BSP_LED_COUNT] = {
    [LED_GREEN] = { GPIOD, 12 },
    [LED_RED]   = { GPIOD, 14 },
    [LED_BLUE]  = { GPIOD, 15 },
};

void BSP_LED_Init(void) {
    RCC->AHB1ENR |= RCC_AHB1ENR_GPIODEN;
    __DSB();
    for (int i = 0; i < BSP_LED_COUNT; i++) {
        leds[i].port->MODER &= ~(3UL << (leds[i].pin * 2U));
        leds[i].port->MODER |=  (1UL << (leds[i].pin * 2U));  /* Output */
    }
}
void BSP_LED_Set(bsp_led_id_t led, bool on) {
    if (on) leds[led].port->BSRR = (1UL << leds[led].pin);
    else    leds[led].port->BSRR = (1UL << (leds[led].pin + 16U));
}
void BSP_LED_Toggle(bsp_led_id_t led) {
    leds[led].port->ODR ^= (1UL << leds[led].pin);
}
#endif /* BOARD_STM32F407_DISCO */

/* ── bsp/lpc55s69/bsp_led_lpc55s69.c — NXP LPC55 implementation */
#ifdef BOARD_NXP_LPC55S69_EVK
#include "bsp_led.h"
#include "LPC55S69_cm33_core0.h"  /* NXP CMSIS device header */

/* LPC55S69-EVK: PIO1_4 = Blue, PIO1_6 = Red, PIO1_7 = Green */
static const struct { uint8_t port; uint8_t pin; } leds[BSP_LED_COUNT] = {
    [LED_GREEN] = { 1, 7 },
    [LED_RED]   = { 1, 6 },
    [LED_BLUE]  = { 1, 4 },
};

void BSP_LED_Init(void) {
    /* Enable IOCON and GPIO clocks via SYSCON — NXP-specific */
    SYSCON->AHBCLKCTRL0 |= SYSCON_AHBCLKCTRL0_IOCON_MASK
                         |  SYSCON_AHBCLKCTRL0_GPIO1_MASK;
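    for (int i = 0; i < BSP_LED_COUNT; i++) {
        /* Drive the latch high first so the active-low LED does not flash on
         * when the pin becomes an output */
        GPIO->SET[leds[i].port]  = (1UL << leds[i].pin);  /* LED off (active low) */
        GPIO->DIR[leds[i].port] |= (1UL << leds[i].pin);  /* Output */
    }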
}
void BSP_LED_Set(bsp_led_id_t led, bool on) {
    /* LPC55 EVK LEDs are active-low */
    if (on) GPIO->CLR[leds[led].port] = (1UL << leds[led].pin);
    else    GPIO->SET[leds[led].port] = (1UL << leds[led].pin);
}
void BSP_LED_Toggle(bsp_led_id_t led) {
    GPIO->NOT[leds[led].port] = (1UL << leds[led].pin);
}
#endif /* BOARD_NXP_LPC55S69_EVK */
Key Observation: The two BSP implementation files use completely different register names (GPIOx->BSRR vs GPIO->CLR[port]), different clock enable mechanisms, and even different LED polarity (active high vs active low). Yet the application code calling BSP_LED_Toggle(LED_GREEN) is identical on both platforms. This is the portable firmware goal achieved.
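
One way to keep polarity differences out of the BSP function bodies is to record them as data next to the pin mapping. The following is a minimal sketch under that assumption; bsp_led_desc_t, its active_low field, and both helper functions are hypothetical and do not appear in the listings above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { LED_GREEN = 0, LED_RED, LED_BLUE, BSP_LED_COUNT } bsp_led_id_t;

/* Hypothetical descriptor: board wiring expressed as data */
typedef struct {
    uint8_t port;
    uint8_t pin;
    bool    active_low;   /* true when driving the pin low lights the LED */
} bsp_led_desc_t;

/* Example table using the LPC55S69-EVK mapping from the listing above */
static const bsp_led_desc_t led_table[BSP_LED_COUNT] = {
    [LED_GREEN] = { 1U, 7U, true },
    [LED_RED]   = { 1U, 6U, true },
    [LED_BLUE]  = { 1U, 4U, true },
};

/* Translate the logical state ("on") into the electrical pin level */
bool bsp_led_level_for(const bsp_led_desc_t *desc, bool on) {
    return desc->active_low ? !on : on;
}

/* Convenience wrapper that looks the LED up by its logical ID */
bool bsp_led_pin_level(bsp_led_id_t led, bool on) {
    assert(led < BSP_LED_COUNT);
    return bsp_led_level_for(&led_table[led], on);
}
```

With this arrangement a board revision that swaps an active-low LED for an active-high one changes a single table entry rather than the logic in BSP_LED_Set().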

MCU Family Porting Delta

| Aspect | STM32F4 (Cortex-M4) | NXP LPC55S69 (Cortex-M33) | Nordic nRF52840 (Cortex-M4) | Renesas RA4M1 (Cortex-M4) |
|---|---|---|---|---|
| Startup file | startup_stm32f407xx.s, from Keil DFP | startup_LPC55S69.s, from NXP DFP | gcc_startup_nrf52840.S, from Nordic SDK | startup_ra4m1.s, from Renesas FSP |
| Linker script | STM32F407: 1 MB flash, 128 KB SRAM + 64 KB CCM (192 KB total) | LPC55: 640 KB flash, 320 KB SRAM (TrustZone split) | nRF52840: 1 MB flash, 256 KB RAM; a SoftDevice, if used, reserves the start of flash and RAM | RA4M1: 256 KB flash, 32 KB SRAM; FSP-specific regions |
| Clock init | SystemInit() + RCC PLL configuration sequence | SYSCON PLL setup; optional FRO/FRO_HF selection | nRF clock controller; SoftDevice manages HFCLK if present | CGC module; FSP R_CGC_Open() or direct register write |
| Peripheral base addresses | AHB1/AHB2/APB1/APB2 bus topology; GPIOA at 0x40020000 | AHB bus; GPIO base at 0x4008C000; port-indexed arrays | NRF_P0 at 0x50000000; NRF_P1 at 0x50000300 | IOPORT module; register structs differ from STM32 conventions |
| GPIO model | MODER/OTYPER/OSPEEDR/PUPDR: 4 separate configuration registers per port | IOCON pin mux + GPIO DIR/SET/CLR/NOT arrays | PIN_CNF[] per-pin configuration, OUT/IN registers | PmnPFS register per pin; dedicated PSEL bits for alternate function |
| NVIC priority bits | 4 bits (__NVIC_PRIO_BITS = 4) | 3 bits (__NVIC_PRIO_BITS = 3); confirm in the device header | 3 bits (__NVIC_PRIO_BITS = 3) | 4 bits (__NVIC_PRIO_BITS = 4) |
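
The NVIC row matters in practice: a priority value that is valid on a 4-bit device may not exist on a 3-bit one. The sketch below shows one possible policy, clamping a logical priority to the implemented range, plus the shift that places the value in the upper bits of the 8-bit priority field. Both function names are hypothetical, and prio_bits stands in for __NVIC_PRIO_BITS from the CMSIS device header so the example can run on a host.

```c
#include <stdint.h>

/* Clamp a logical priority (0 = most urgent) to the levels a device implements.
 * prio_bits corresponds to __NVIC_PRIO_BITS from the CMSIS device header. */
uint32_t bsp_irq_clamp_priority(uint32_t logical, uint32_t prio_bits) {
    uint32_t lowest = ((uint32_t)1U << prio_bits) - 1U;  /* least urgent level */
    return (logical > lowest) ? lowest : logical;
}

/* Value as stored in the 8-bit NVIC priority field: the implemented bits are
 * the most significant ones, so the priority is shifted left. */
uint8_t bsp_irq_register_value(uint32_t priority, uint32_t prio_bits) {
    return (uint8_t)((priority << (8U - prio_bits)) & 0xFFU);
}
```

Clamping is only one option: it collapses distinct low-urgency levels on devices with fewer bits, so a project that depends on many priority levels may prefer to define its levels against the smallest target instead.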

Conditional Compilation

When the BSP layer cannot fully isolate a platform dependency — for example, when a performance-critical path needs direct register access on each platform — conditional compilation with preprocessor defines provides a clean escape hatch. The key discipline is: keep conditionals in the BSP layer, never in the application layer.

#ifdef Board Patterns

/**
 * BSP clock initialisation — conditional compilation pattern.
 * Board defines are passed via CMake target_compile_definitions.
 * Never use these #ifdefs in application code.
 *
 * file: bsp/bsp_clocks.c
 */

#include "bsp_clocks.h"

#if defined(BOARD_STM32F407_DISCO)
    #include "stm32f407xx.h"
    #include "system_stm32f4xx.h"

    void BSP_Clock_Init(void) {
        /* STM32F407 Discovery: configure PLL for 168 MHz from 8 MHz HSE */
        /* Use CMSIS SystemInit() as starting point, then fine-tune */
        SystemInit();
        /* Enable HSE */
        RCC->CR |= RCC_CR_HSEON;
        while (!(RCC->CR & RCC_CR_HSERDY)) {}
        /* Configure PLL: VCO = 8 MHz / 8 * 336 = 336 MHz; SYSCLK = 336 / 2 = 168 MHz */
        RCC->PLLCFGR = (8U   << RCC_PLLCFGR_PLLM_Pos)   /* /8   */
                     | (336U << RCC_PLLCFGR_PLLN_Pos)   /* x336 */
                     | (0U   << RCC_PLLCFGR_PLLP_Pos)   /* /2   */
                     | (7U   << RCC_PLLCFGR_PLLQ_Pos)   /* /7 = 48 MHz; PLLQ must be >= 2 */
                     | RCC_PLLCFGR_PLLSRC_HSE;
        RCC->CR |= RCC_CR_PLLON;
        while (!(RCC->CR & RCC_CR_PLLRDY)) {}
        /* Set flash latency for 168 MHz (5 wait states) */
        FLASH->ACR = FLASH_ACR_LATENCY_5WS | FLASH_ACR_PRFTEN | FLASH_ACR_DCEN | FLASH_ACR_ICEN;
        /* Bus prescalers before the switch: APB1 <= 42 MHz (/4), APB2 <= 84 MHz (/2) */
        RCC->CFGR = (RCC->CFGR & ~(RCC_CFGR_PPRE1 | RCC_CFGR_PPRE2))
                  | RCC_CFGR_PPRE1_DIV4 | RCC_CFGR_PPRE2_DIV2;
        /* Switch to PLL */
        RCC->CFGR = (RCC->CFGR & ~RCC_CFGR_SW) | RCC_CFGR_SW_PLL;
        while ((RCC->CFGR & RCC_CFGR_SWS_Msk) != RCC_CFGR_SWS_PLL) {}
        SystemCoreClock = 168000000U;
    }

#elif defined(BOARD_NXP_LPC55S69_EVK)
    #include "LPC55S69_cm33_core0.h"

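    void BSP_Clock_Init(void) {
        /* LPC55S69 EVK: switch the main clock to FRO_HF (96 MHz internal oscillator).
         * Not shown here: the 96 MHz FRO output must already be enabled, and flash
         * wait states and core voltage must be set for 96 MHz before the switch
         * (the vendor SDK clock setup normally takes care of this). */
        SYSCON->MAINCLKSELA = 3U;   /* FRO 96 MHz output */
        SYSCON->MAINCLKSELB = 0U;   /* MAINCLKSELA directly */
        SystemCoreClock = 96000000U;
    }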

#elif defined(BOARD_NRF52840_DK)
    #include "nrf52840.h"

    void BSP_Clock_Init(void) {
        /* nRF52840: start HFXO (external 32 MHz crystal) */
        NRF_CLOCK->EVENTS_HFCLKSTARTED = 0U;
        NRF_CLOCK->TASKS_HFCLKSTART    = 1U;
        while (!NRF_CLOCK->EVENTS_HFCLKSTARTED) {}
        SystemCoreClock = 64000000U;  /* CPU runs at 64 MHz (PLL from 32 MHz HFXO) */
    }

#else
    #error "Unknown board — define BOARD_STM32F407_DISCO, BOARD_NXP_LPC55S69_EVK, or BOARD_NRF52840_DK"
#endif
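
A small refinement of the pattern above is to reject builds where zero or several board macros are defined, and to derive per-board constants in one place. This is a sketch, not part of the original listing: BSP_BOARD_NAME, BSP_CORE_CLOCK_HZ and the two accessor functions are hypothetical names, and the board macro is defined in the file only so the example compiles on its own.

```c
/* Normally passed by the build system as a compile definition;
 * defined here only so the sketch is self-contained. */
#define BOARD_NRF52840_DK 1

/* defined() evaluates to 1 or 0 inside #if, so the sum counts selected boards */
#if (defined(BOARD_STM32F407_DISCO) + defined(BOARD_NXP_LPC55S69_EVK) + \
     defined(BOARD_NRF52840_DK)) != 1
#error "Define exactly one BOARD_* macro"
#endif

/* Clock values match the BSP_Clock_Init() variants shown above */
#if defined(BOARD_STM32F407_DISCO)
#define BSP_BOARD_NAME    "stm32f407_disco"
#define BSP_CORE_CLOCK_HZ 168000000UL
#elif defined(BOARD_NXP_LPC55S69_EVK)
#define BSP_BOARD_NAME    "lpc55s69_evk"
#define BSP_CORE_CLOCK_HZ 96000000UL
#else
#define BSP_BOARD_NAME    "nrf52840_dk"
#define BSP_CORE_CLOCK_HZ 64000000UL
#endif

/* Accessors keep the macros private to the BSP layer */
const char *BSP_Board_Name(void) { return BSP_BOARD_NAME; }
unsigned long BSP_Board_CoreClockHz(void) { return BSP_CORE_CLOCK_HZ; }
```

The application can then log BSP_Board_Name() at boot without ever seeing a BOARD_* macro, which keeps the conditional compilation confined to the BSP.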

CMake Board Selection via -DBOARD=

# CMakeLists.txt — board-aware firmware build
cmake_minimum_required(VERSION 3.21)
project(portable_firmware C ASM)

# ── Board selection ─────────────────────────────────────────────────────────
# Usage: cmake -DBOARD=stm32f407_disco ..
#        cmake -DBOARD=lpc55s69_evk ..
#        cmake -DBOARD=nrf52840_dk ..

if(NOT DEFINED BOARD)
    message(FATAL_ERROR "BOARD is not set. Use -DBOARD=<name>, e.g. stm32f407_disco, lpc55s69_evk, or nrf52840_dk")
endif()

# ── Per-board configuration ─────────────────────────────────────────────────
if(BOARD STREQUAL "stm32f407_disco")
    set(CPU_FLAGS "-mcpu=cortex-m4 -mthumb -mfpu=fpv4-sp-d16 -mfloat-abi=hard")
    set(LINKER_SCRIPT "${CMAKE_SOURCE_DIR}/bsp/stm32f4/STM32F407VGTx_FLASH.ld")
    set(BSP_SOURCES
        bsp/stm32f4/bsp_clocks.c
        bsp/stm32f4/bsp_led_stm32f4.c
        bsp/stm32f4/bsp_uart_stm32f4.c
    )
    set(BSP_DEFINES BOARD_STM32F407_DISCO STM32F407xx)
    set(BSP_INCLUDES bsp/stm32f4 ${CMSIS_PACK_ROOT}/Keil/STM32F4xx_DFP/2.17.1/Device/Include)

elseif(BOARD STREQUAL "lpc55s69_evk")
    set(CPU_FLAGS "-mcpu=cortex-m33 -mthumb -mfpu=fpv5-sp-d16 -mfloat-abi=hard")
    set(LINKER_SCRIPT "${CMAKE_SOURCE_DIR}/bsp/lpc55s69/LPC55S69_flash.ld")
    set(BSP_SOURCES
        bsp/lpc55s69/bsp_clocks.c
        bsp/lpc55s69/bsp_led_lpc55s69.c
        bsp/lpc55s69/bsp_uart_lpc55s69.c
    )
    set(BSP_DEFINES BOARD_NXP_LPC55S69_EVK CPU_LPC55S69JBD100)
    set(BSP_INCLUDES bsp/lpc55s69 ${CMSIS_PACK_ROOT}/NXP/LPC55S6x_DFP/18.0.0/device/include)

elseif(BOARD STREQUAL "nrf52840_dk")
    set(CPU_FLAGS "-mcpu=cortex-m4 -mthumb -mfpu=fpv4-sp-d16 -mfloat-abi=hard")
    set(LINKER_SCRIPT "${CMAKE_SOURCE_DIR}/bsp/nrf52840/nrf52840_xxaa.ld")
    set(BSP_SOURCES
        bsp/nrf52840/bsp_clocks.c
        bsp/nrf52840/bsp_led_nrf52840.c
        bsp/nrf52840/bsp_uart_nrf52840.c
    )
    set(BSP_DEFINES BOARD_NRF52840_DK NRF52840_XXAA)
    set(BSP_INCLUDES bsp/nrf52840 ${CMSIS_PACK_ROOT}/NordicSemiconductor/nRF_DeviceFamilyPack/8.0.0/Device/Include)
else()
    message(FATAL_ERROR "Unknown BOARD: ${BOARD}")
endif()

# ── Common sources (application + CMSIS) ────────────────────────────────────
add_executable(firmware.elf
    app/main.c
    app/app_tasks.c
    ${BSP_SOURCES}
)

target_include_directories(firmware.elf PRIVATE
    app/
    bsp/
    ${BSP_INCLUDES}
    ${CMSIS_PACK_ROOT}/ARM/CMSIS/6.1.0/CMSIS/Core/Include
)

target_compile_definitions(firmware.elf PRIVATE ${BSP_DEFINES})

separate_arguments(CPU_FLAGS_LIST NATIVE_COMMAND "${CPU_FLAGS}")
target_compile_options(firmware.elf PRIVATE
    ${CPU_FLAGS_LIST}
    -O2 -Wall -Wextra -ffunction-sections -fdata-sections
)

target_link_options(firmware.elf PRIVATE
    ${CPU_FLAGS_LIST}
    -T${LINKER_SCRIPT}
    -Wl,--gc-sections -Wl,-Map=firmware.map
)

# Build example:
# mkdir build-stm32 && cd build-stm32
# cmake -G Ninja -DCMAKE_TOOLCHAIN_FILE=../arm-none-eabi.cmake -DBOARD=stm32f407_disco ..
# ninja

Reusable BSP Design

A well-designed BSP is more than a collection of platform-specific .c files. It is a structured library with a versioned public API, a clear file organisation, and a documented porting guide. When designed correctly, adding a new board variant requires only adding a new BSP directory and selecting it in CMake — the application and middleware layers remain untouched.
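
What "versioned public API" could look like in C is sketched below. The macro names, version numbers, and bsp_api_is_compatible() are assumptions made for illustration; the article does not prescribe a scheme. The idea is that a driver states the BSP version it was written against and the build fails early if the BSP is incompatible. _Static_assert requires C11.

```c
#include <stdbool.h>
#include <stdint.h>

/* Would live in a shared BSP header (hypothetical values) */
#define BSP_API_VERSION_MAJOR 2U   /* bumped on breaking interface changes */
#define BSP_API_VERSION_MINOR 1U   /* bumped on backwards-compatible additions */

/* Would live in a driver that consumes the BSP (hypothetical requirement) */
#define DRIVER_REQUIRES_BSP_MAJOR 2U
#define DRIVER_REQUIRES_BSP_MINOR 0U

/* Compile-time gate: same major version, at least the required minor version */
_Static_assert(BSP_API_VERSION_MAJOR == DRIVER_REQUIRES_BSP_MAJOR,
               "BSP major version mismatch: interface has breaking changes");
_Static_assert(BSP_API_VERSION_MINOR >= DRIVER_REQUIRES_BSP_MINOR,
               "BSP is too old for this driver");

/* The same rule as a runtime check, e.g. for a diagnostic command */
bool bsp_api_is_compatible(uint32_t need_major, uint32_t need_minor) {
    return (need_major == BSP_API_VERSION_MAJOR) &&
           (need_minor <= BSP_API_VERSION_MINOR);
}
```

A new board port then has a concrete target: implement the headers at a given major version, and every driver written against that major version should build unchanged.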

BSP Directory Structure

firmware/
├── app/                          # Application layer — BOARD-INDEPENDENT
│   ├── main.c                    # Entry point, RTOS init
│   ├── app_sensor.c              # Sensor acquisition logic
│   └── app_comms.c               # Communication protocol handler
├── bsp/
│   ├── bsp_gpio.h                # PUBLIC: Abstract GPIO interface
│   ├── bsp_uart.h                # PUBLIC: Abstract UART interface
│   ├── bsp_i2c.h                 # PUBLIC: Abstract I2C interface
│   ├── bsp_led.h                 # PUBLIC: Abstract LED interface
│   ├── stm32f4/                  # STM32F4 BSP implementation
│   │   ├── bsp_gpio_stm32f4.c
│   │   ├── bsp_uart_stm32f4.c
│   │   ├── bsp_clocks.c
│   │   ├── STM32F407VGTx_FLASH.ld
│   │   └── bsp_board_config.h    # Pin mappings for specific board revision
│   ├── lpc55s69/                 # NXP LPC55S69 BSP implementation
│   │   ├── bsp_gpio_lpc55s69.c
│   │   ├── bsp_uart_lpc55s69.c
│   │   ├── bsp_clocks.c
│   │   └── LPC55S69_flash.ld
│   └── nrf52840/                 # Nordic nRF52840 BSP implementation
│       ├── bsp_gpio_nrf52840.c
│       └── nrf52840_xxaa.ld
├── drivers/                      # Reusable device drivers — BOARD-INDEPENDENT
│   ├── bme280/                   # BME280 temperature/humidity sensor
│   │   ├── bme280.h
│   │   └── bme280.c              # Uses bsp_i2c.h — works on any platform
│   └── ssd1306/                  # OLED display driver
│       ├── ssd1306.h
│       └── ssd1306.c             # Uses bsp_i2c.h
└── CMakeLists.txt                # Board selection via -DBOARD=

Sensor Driver Portability via BSP I2C Abstraction

The real payoff of the BSP approach is that peripheral device drivers (sensor ICs, display controllers, wireless modules) become completely board-independent. A BME280 temperature sensor driver that depends on bsp_i2c.h will work on any board that provides a conforming I2C BSP implementation — whether that implementation uses CMSIS-Driver underneath, or a vendor HAL, or direct register access.

/**
 * Portable BME280 driver — uses bsp_i2c.h, not any vendor API.
 * file: drivers/bme280/bme280.c
 */
#include "bme280.h"
#include "bsp_i2c.h"   /* Abstract I2C: BSP_I2C_Write(), BSP_I2C_Read() */

#define BME280_ADDR     0x76U  /* Default I2C address (SDO = GND) */
#define BME280_REG_ID   0xD0U
#define BME280_REG_CTRL 0xF4U

static void bme280_write_reg(uint8_t reg, uint8_t value) {
    uint8_t buf[2] = { reg, value };
    BSP_I2C_Write(BSP_I2C_BUS0, BME280_ADDR, buf, 2U);
}

static uint8_t bme280_read_reg(uint8_t reg) {
    uint8_t val = 0U;
    BSP_I2C_Write(BSP_I2C_BUS0, BME280_ADDR, &reg, 1U);
    BSP_I2C_Read(BSP_I2C_BUS0, BME280_ADDR, &val, 1U);
    return val;
}

bool BME280_Init(void) {
    uint8_t chip_id = bme280_read_reg(BME280_REG_ID);
    if (chip_id != 0x60U) return false;  /* Not a BME280 */
    /* ctrl_meas = 0x27: osrs_t = x1, osrs_p = x1, mode = 0b11 (normal mode).
     * Forced mode would be 0x25 (mode = 0b01). */
    bme280_write_reg(BME280_REG_CTRL, 0x27U);
    return true;
}
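
The driver above relies on bsp_i2c.h, which the article does not show. The sketch below assumes one plausible shape for BSP_I2C_Write() and BSP_I2C_Read() (a bool return and a size_t length are assumptions) and backs them with a fake register-file device, so the driver logic can be exercised on a host. The fake models only single-register writes and sequential reads; it is not a full BME280 model, and fake_bme280_reset() is a hypothetical test helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bsp_i2c.h types and signatures (not confirmed by the article) */
typedef enum { BSP_I2C_BUS0 = 0 } bsp_i2c_bus_t;

#define FAKE_DEV_ADDR 0x76U   /* same address the driver uses */

static uint8_t fake_regs[256];   /* register file of the fake device */
static uint8_t fake_reg_ptr;     /* current register pointer */

void fake_bme280_reset(void) {
    for (size_t i = 0U; i < sizeof fake_regs; i++) {
        fake_regs[i] = 0U;
    }
    fake_regs[0xD0] = 0x60U;     /* chip ID register reads back 0x60 */
    fake_reg_ptr = 0U;
}

/* First byte selects the register; an optional second byte is stored there */
bool BSP_I2C_Write(bsp_i2c_bus_t bus, uint8_t addr, const uint8_t *data, size_t len) {
    if (bus != BSP_I2C_BUS0 || addr != FAKE_DEV_ADDR || len == 0U) {
        return false;            /* behaves like a NACK */
    }
    fake_reg_ptr = data[0];
    if (len >= 2U) {
        fake_regs[fake_reg_ptr] = data[1];
    }
    return true;
}

/* Reads start at the register pointer and auto-increment */
bool BSP_I2C_Read(bsp_i2c_bus_t bus, uint8_t addr, uint8_t *data, size_t len) {
    if (bus != BSP_I2C_BUS0 || addr != FAKE_DEV_ADDR) {
        return false;
    }
    for (size_t i = 0U; i < len; i++) {
        data[i] = fake_regs[fake_reg_ptr++];
    }
    return true;
}
```

Linking bme280.c against this fake instead of a hardware backend lets a host test confirm that BME280_Init() reads register 0xD0 and writes 0x27 to register 0xF4, without changing a line of the driver.
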
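Common Pitfall: Portability breaks down when timing-sensitive code uses SystemCoreClock or cycle counts without going through the BSP. Always provide a BSP_Delay_us() function that uses DWT or SysTick correctly for each platform's clock frequency (the DWT cycle counter is not available on Cortex-M0/M0+, so those parts need a SysTick or timer-based implementation). Direct for-loop delays are the enemy of portability: their duration changes with clock frequency, compiler, and optimisation level.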
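
The arithmetic behind such a BSP_Delay_us() can be sketched and tested on a host. The two helpers below are hypothetical names; the target-side loop is shown only in a comment and assumes the DWT cycle counter has already been enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert microseconds to core clock cycles. The 64-bit intermediate avoids
 * overflow in the multiplication; the result is truncated to 32 bits, which
 * limits a single delay to about 25 s at 168 MHz. */
uint32_t bsp_us_to_cycles(uint32_t us, uint32_t core_hz) {
    return (uint32_t)(((uint64_t)us * core_hz) / 1000000ULL);
}

/* Wrap-safe comparison: unsigned subtraction yields the correct elapsed
 * count even when the free-running counter rolls over between samples. */
bool bsp_cycles_elapsed(uint32_t start, uint32_t now, uint32_t wait_cycles) {
    return (uint32_t)(now - start) >= wait_cycles;
}

/* On a target with DWT, the delay loop would look roughly like this:
 *
 *   uint32_t start = DWT->CYCCNT;
 *   uint32_t wait  = bsp_us_to_cycles(us, SystemCoreClock);
 *   while (!bsp_cycles_elapsed(start, DWT->CYCCNT, wait)) { }
 */
```

Because the conversion takes the clock frequency as a parameter, the same helper serves a 168 MHz STM32F4 and a 64 MHz nRF52840; only the counter source behind it is platform-specific.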

Exercises

Exercise 1 Beginner

Abstract GPIO Toggle and Implement for Two Vendors

Define a hal_gpio_t interface struct (as shown in this article) with init, set, toggle, and read function pointers. Implement the interface for two target MCUs you have access to, for example STM32 and nRF52 (or simulate the second target with a mock that logs calls via printf). Write an application-layer blink loop that calls only gpio->toggle() and osDelay(). Verify: (a) switching between the two implementations requires no change to the application file, (b) both compile with arm-none-eabi-gcc -Wall -Wextra without warnings, (c) sizeof(hal_gpio_t) is exactly four times the size of a function pointer.

Topics: HAL Interface, Function Pointers, Portability

Exercise 2 Intermediate

CMake Project Selecting the Correct BSP via -DBOARD=

Create a CMake project with the structure from this article: an app/ directory containing a single main.c that calls only BSP functions, a bsp/ directory with board-specific subdirectories, and a CMakeLists.txt that selects the correct BSP sources, defines, and linker script based on a -DBOARD= variable. Implement at minimum two boards. Demonstrate that: (a) cmake -DBOARD=board_a .. followed by ninja produces a valid ELF for board A, (b) cmake -DBOARD=board_b .. produces a valid ELF for board B, (c) cmake without -DBOARD produces a clear FATAL_ERROR message. Submit your CMakeLists.txt and the shared app/main.c file as evidence of portability.

Topics: CMake, Board Selection, BSP Structure

Exercise 3 Advanced

Portable Sensor Driver on Two I2C Implementations

Implement the BSP_I2C_Write() and BSP_I2C_Read() functions from bsp_i2c.h for two different I2C backends: (a) using CMSIS-Driver I2C (ARM_DRIVER_I2C interface from Driver_I2C.h), (b) using a vendor HAL I2C API (e.g., STM32 HAL_I2C_Master_Transmit or nRF nrfx_twim_xfer). Write a BME280 or SHT31 temperature sensor driver that calls only bsp_i2c.h functions. Prove portability by: reading the sensor chip ID register successfully on both backends, confirming the same sensor driver source file is used unchanged for both, and documenting which files changed between backends and which remained constant.

Topics: CMSIS-Driver I2C, Sensor Drivers, BSP Abstraction

BSP Design Planner

Use this tool to document your portable firmware architecture — the primary and secondary MCU targets, abstracted peripherals, HAL layers, CMake board defines, and porting notes. Generate a Word, Excel, PDF, or PPTX document for architecture reviews, team knowledge transfer, or project proposals.

Conclusion & Next Steps

Portable firmware is not a luxury — it is a professional engineering standard that pays dividends throughout a product's lifetime. The key architectural principles from this article:

  • The three-layer model (Application / BSP / Platform) creates a clean boundary: everything above BSP is board-agnostic, everything below is intentionally platform-specific. Never let silicon-specific code leak upward across this boundary.
  • Function-pointer HAL interfaces in C achieve runtime portability with minimal overhead — the same mechanism CMSIS-Driver uses internally. The vtable struct pattern is the canonical C approach to interface-based design.
  • CMake board variables (-DBOARD=) make platform selection explicit, reproducible, and CI/CD-friendly. Every team member and every pipeline build starts from the same clean state by specifying a single variable.
  • The porting delta between Cortex-M MCU families is smaller than it appears: CMSIS provides a stable processor-level API, and the actual platform-specific code is confined to clock initialisation, GPIO model differences, and linker script memory regions.
  • Sensor and middleware drivers written against the BSP interface are the ultimate test of portability: a BME280 driver that works on STM32, NXP, Nordic, and Renesas without modification demonstrates that the BSP abstraction is correctly designed.

Next in the Series

In Part 11: Interrupts, Concurrency & Real-Time Constraints, we go deep on the dynamics that make embedded systems hard: interrupt latency budgets, priority inversion, critical sections, lock-free data structures for ISR-to-thread communication, and the RTOS timing guarantees you can and cannot rely on. The debugging skills from Part 9 and the portable architecture from Part 10 come together here to build reliable concurrent systems.
