Part 9: Advanced Topics

July 14, 2025 Wasil Zafar 50 min read

Sensor fusion algorithms, Kalman filter implementation, TinyML on microcontrollers, ultra-low power strategies, and building fault-tolerant embedded systems.

Table of Contents

  1. Sensor Fusion
  2. TinyML
  3. Power Optimization
  4. Fault Tolerance
  5. Conclusion & Next Steps

Sensor Fusion

Sensor fusion combines data from multiple sensors to produce a more accurate, reliable, or complete measurement than any single sensor can provide. The classic example is IMU fusion: accelerometers give an accurate tilt reference (pitch and roll from gravity) under static conditions but are corrupted by vibration and linear acceleration, while gyroscopes give a smooth angular rate whose integrated angle drifts over time.

Sensor Fusion Concept
flowchart LR
    ACC["Accelerometer<br/>Accurate but noisy"] --> FUSE
    GYRO["Gyroscope<br/>Smooth but drifts"] --> FUSE
    MAG["Magnetometer<br/>Heading reference"] --> FUSE
    FUSE["Fusion<br/>Algorithm"] --> OUT["Fused Estimate<br/>Accurate + Stable"]
    style FUSE fill:#3B9797,stroke:#3B9797,color:#fff
    style OUT fill:#132440,stroke:#132440,color:#fff
    style ACC fill:#e8f4f4,stroke:#3B9797,color:#132440
    style GYRO fill:#f0f4f8,stroke:#16476A,color:#132440
    style MAG fill:#e8f4f4,stroke:#3B9797,color:#132440

Complementary Filter

The simplest fusion algorithm. It combines accelerometer (low-pass) and gyroscope (high-pass) data with a tuning parameter $\alpha$ (typically 0.96–0.98):

$$\theta_{fused} = \alpha \cdot (\theta_{prev} + \omega_{gyro} \cdot \Delta t) + (1 - \alpha) \cdot \theta_{accel}$$

// Complementary filter for pitch/roll from IMU
#include <math.h>

#define ALPHA 0.98f
#define RAD_TO_DEG 57.2957795f

typedef struct {
    float pitch;   // degrees
    float roll;    // degrees
} Orientation;

static Orientation orient = {0, 0};

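// ax, ay, az: accelerometer readings (any consistent unit; only ratios matter)
// gx, gy: gyroscope rates in deg/s (gz is unused here); dt: time step in seconds
Orientation complementary_filter(float ax, float ay, float az,
                                  float gx, float gy, float gz,
                                  float dt) {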
    // Gyroscope integration (high-pass: captures fast changes)
    orient.pitch += gx * dt;
    orient.roll  += gy * dt;

    // Accelerometer angle (low-pass: captures gravity direction)
    float accel_pitch = atan2f(ay, sqrtf(ax * ax + az * az)) * RAD_TO_DEG;
    float accel_roll  = atan2f(-ax, az) * RAD_TO_DEG;

    // Fuse: trust gyro for fast changes, accel for steady-state
    orient.pitch = ALPHA * orient.pitch + (1.0f - ALPHA) * accel_pitch;
    orient.roll  = ALPHA * orient.roll  + (1.0f - ALPHA) * accel_roll;

    return orient;
}
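
A common way to choose $\alpha$ is to start from the time constant $\tau$ at which the filter hands over from gyroscope to accelerometer. For a fixed sample period $\Delta t$, the usual first-order relation is $\alpha = \tau / (\tau + \Delta t)$. The sketch below assumes a constant loop rate; the helper names are illustrative and not part of the filter code above.

```c
// Illustrative helpers (hypothetical names): relate the complementary
// filter coefficient to its time constant for a fixed sample period.

// Coefficient for a desired time constant tau_s and sample period dt_s (seconds)
float alpha_from_time_constant(float tau_s, float dt_s) {
    return tau_s / (tau_s + dt_s);
}

// Time constant (seconds) implied by a given coefficient and sample period
float time_constant_from_alpha(float alpha, float dt_s) {
    return alpha * dt_s / (1.0f - alpha);
}
```

With ALPHA = 0.98 and a 100 Hz loop (dt = 0.01 s), the implied time constant is about 0.49 s: changes faster than that are taken mostly from the gyroscope, slower ones from the accelerometer.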

Kalman Filter

The Kalman filter is the optimal estimator for linear systems with Gaussian noise. It operates in two phases: predict (project state forward using the model) and update (correct with new measurement).

$$\hat{x}_{k|k-1} = F \hat{x}_{k-1|k-1} + B u_k \quad \text{(Predict)}$$

$$P_{k|k-1} = F P_{k-1|k-1} F^T + Q$$

$$K_k = P_{k|k-1} H^T (H P_{k|k-1} H^T + R)^{-1} \quad \text{(Kalman Gain)}$$

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k (z_k - H \hat{x}_{k|k-1}) \quad \text{(Update)}$$

$$P_{k|k} = (I - K_k H) P_{k|k-1}$$

// 1D Kalman filter for sensor smoothing
// Scalar special case of the equations above: F = 1, H = 1, no control input
typedef struct {
    float x;     // State estimate
    float p;     // Estimate covariance
    float q;     // Process noise
    float r;     // Measurement noise
    float k;     // Kalman gain
} Kalman1D;

void kalman_init(Kalman1D *kf, float initial, float p, float q, float r) {
    kf->x = initial;
    kf->p = p;
    kf->q = q;
    kf->r = r;
    kf->k = 0.0f;  // Computed on the first update
}

float kalman_update(Kalman1D *kf, float measurement) {
    // Predict
    kf->p += kf->q;

    // Update
    kf->k = kf->p / (kf->p + kf->r);
    kf->x = kf->x + kf->k * (measurement - kf->x);
    kf->p = (1.0f - kf->k) * kf->p;

    return kf->x;
}

Madgwick / Mahony Filters

IMU Orientation Filters:
  • Madgwick: Gradient-descent optimization of a quaternion estimate. Single tuning parameter (beta). Includes magnetic distortion compensation. Roughly 40–100 µs per update on a Cortex-M4
  • Mahony: Complementary filter with PI correction. Two tuning parameters (Kp, Ki). Lighter than Madgwick. Popular in flight controllers
  • Extended Kalman Filter (EKF): Linearizes the nonlinear system model around the current estimate. Typically more accurate than the two filters above, but considerably heavier to compute. Used in high-end INS/GPS fusion
  • Unscented Kalman Filter (UKF): Propagates sigma points through the nonlinear model instead of linearizing it. Better suited to highly nonlinear systems

TinyML

TensorFlow Lite for Microcontrollers

TinyML brings machine learning inference to microcontrollers with as little as 16 KB of RAM. Common applications include keyword spotting, gesture recognition, anomaly detection, and predictive maintenance — all running locally without cloud connectivity.

TinyML Platform Comparison

| Framework | Min RAM | MCU Support | Features |
|---|---|---|---|
| TF Lite Micro | 16 KB | Cortex-M0+ to M7 | Quantized models, CMSIS-NN acceleration |
| Edge Impulse | Varies | Arduino, STM32, Nordic | End-to-end pipeline, data collection |
| STM32Cube.AI | Varies | STM32 only | Direct TF/ONNX import, optimized for STM32 |
| microTVM (Apache TVM) | Varies | Multiple | Compiler-based optimization |

ML Workflow for Embedded

TinyML Development Pipeline:
  1. Collect: Gather sensor data from target hardware (accelerometer, microphone, etc.)
  2. Train: Train model in Python (TensorFlow/Keras) on PC or cloud
  3. Quantize: Convert float32 weights to int8 (4x size reduction, 2-4x speed improvement)
  4. Convert: Export as .tflite FlatBuffer model file
  5. Deploy: Include as C array in firmware, run inference via TF Lite Micro interpreter
  6. Validate: Compare on-device accuracy vs PC baseline

TinyML Development Pipeline
flowchart LR
    subgraph PC["PC / Cloud"]
        A["Collect<br/>Training Data"] --> B["Train<br/>Model"]
        B --> C["Quantize<br/>INT8/Float16"]
        C --> D["Convert<br/>TF Lite Micro"]
    end
    subgraph MCU["Microcontroller"]
        E["Deploy<br/>C Array in Flash"]
        F["Validate<br/>On-device Accuracy"]
    end
    D --> E --> F
    F -.->|"Accuracy OK?"| G{"Pass?"}
    G -->|"Yes"| H["Ship"]
    G -->|"No"| A
    style A fill:#3B9797,stroke:#3B9797,color:#fff
    style H fill:#132440,stroke:#132440,color:#fff
    style G fill:#fff5f5,stroke:#BF092F,color:#132440

Power Optimization

Sleep Modes

STM32 Low-Power Modes

| Mode | Current | Wake Sources | Resume Time |
|---|---|---|---|
| Run | 10–100 mA | N/A | N/A |
| Sleep | 1–10 mA | Any interrupt | <1 µs |
| Stop | 2–100 µA | EXTI, RTC, LPUART | 5–50 µs |
| Standby | 0.3–3 µA | WKUP pin, RTC | 50 µs (reset) |
| Shutdown | 20–100 nA | WKUP pin only | Full reboot |

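// STM32L4 Stop 2 mode with periodic RTC wakeup (register-level sketch)
// Assumes the RTC clock (LSE or LSI) is already running, and that EXTI line 20
// and the RTC_WKUP interrupt are enabled so the wakeup event can exit Stop 2.
#include "stm32l4xx.h"

void enter_stop_mode(uint32_t seconds) {
    // Unlock RTC write protection
    RTC->WPR = 0xCA;
    RTC->WPR = 0x53;

    RTC->CR &= ~RTC_CR_WUTE;               // Disable wakeup timer
    while (!(RTC->ISR & RTC_ISR_WUTWF));   // Wait until WUTR is writable

    RTC->WUTR = seconds - 1;               // Wakeup after N seconds (N >= 1)
    RTC->CR = (RTC->CR & ~RTC_CR_WUCKSEL) | RTC_CR_WUCKSEL_2;  // ck_spre, 1 Hz
    RTC->ISR &= ~RTC_ISR_WUTF;             // Clear any stale wakeup flag
    RTC->CR |= RTC_CR_WUTIE | RTC_CR_WUTE; // Enable wakeup interrupt and timer
    RTC->WPR = 0xFF;                       // Re-lock RTC registers

    // Select Stop 2 as the deep-sleep mode
    SCB->SCR |= SCB_SCR_SLEEPDEEP_Msk;
    PWR->CR1 = (PWR->CR1 & ~PWR_CR1_LPMS) | PWR_CR1_LPMS_STOP2;

    __WFI();  // Wait for interrupt: enters Stop 2

    // Execution resumes here on the wakeup clock; restore the full clock tree
    SCB->SCR &= ~SCB_SCR_SLEEPDEEP_Msk;
    SystemClock_Config();  // Application-provided clock setup
}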

// Usage: duty-cycle sensor reading
int main(void) {
    SystemInit();
    sensor_init();

    while (1) {
        float temp = read_temperature();
        transmit_data(temp);

        enter_stop_mode(5);  // Sleep 5 seconds at about 2 µA
    }
}

Power Budgeting

Power Budget Example (Battery-Powered Sensor Node):
  • MCU (STM32L4, Stop2): 2 µA × 99.8% ≈ 2.0 µA average
  • MCU (Run, 10ms every 5s): 5 mA × 0.2% = 10 µA average
  • Sensor (BME280 forced mode): 0.1 µA sleep + 350 µA × 0.01% = 0.14 µA average
  • Radio (LoRa TX, 100ms every 60s): 120 mA × 0.17% = 200 µA average
  • Total: ~212 µA average
  • Battery life (2000 mAh CR123A): 2000 / 0.212 = ~9,400 hours = ~13 months
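
The budget above is a sum of duty-weighted currents, which is easy to keep in code so it can be rerun when a duty cycle changes. This is a minimal sketch: the struct and function names are illustrative, and the radio's idle current is assumed to be zero because the budget does not list one.

```c
#include <stddef.h>

// One duty-cycled load: current while active, current while idle (both in µA),
// and the fraction of time spent active (0.0 to 1.0).
typedef struct {
    const char *name;
    double active_ua;
    double idle_ua;
    double duty;
} Load;

double load_average_ua(const Load *l) {
    return l->active_ua * l->duty + l->idle_ua * (1.0 - l->duty);
}

double budget_average_ua(const Load *loads, size_t count) {
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += load_average_ua(&loads[i]);
    }
    return total;
}

// Capacity in mAh, average current in µA, result in hours.
double battery_life_hours(double capacity_mah, double average_ua) {
    return capacity_mah * 1000.0 / average_ua;
}

// The sensor node from the budget above.
double example_node_average_ua(void) {
    const Load loads[] = {
        { "MCU",      5000.0, 2.0, 0.010 / 5.0  },  // 10 ms run every 5 s, Stop2 otherwise
        { "BME280",    350.0, 0.1, 0.0001       },  // 0.01% measurement duty
        { "LoRa TX", 120000.0, 0.0, 0.100 / 60.0 }, // 100 ms every 60 s; idle assumed 0
    };
    return budget_average_ua(loads, sizeof loads / sizeof loads[0]);
}
```

Calling battery_life_hours(2000.0, example_node_average_ua()) reproduces the figures in the list: about 212 µA average and roughly 9,400 hours. The radio term dominates, so stretching the transmit interval is the most effective change.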

Fault Tolerance

Watchdog Timers

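// Independent Watchdog (IWDG) on STM32
#include "stm32f4xx.h"

void iwdg_init(uint32_t timeout_ms) {
    uint32_t reload = (timeout_ms * 500U) / 1000U;  // Ticks at ~500 Hz
    if (reload > 0x0FFFU) reload = 0x0FFFU;         // RLR is 12 bits (max ~8.2 s)

    IWDG->KR  = 0xCCCC;        // Start watchdog (also forces the LSI on)
    IWDG->KR  = 0x5555;        // Enable write access to PR and RLR
    IWDG->PR  = 4;             // Prescaler /64: ~32 kHz LSI gives ~500 Hz
    IWDG->RLR = reload;        // Reload value
    while (IWDG->SR != 0) { }  // Wait for PR/RLR updates to complete
    IWDG->KR  = 0xAAAA;        // Load the counter with the new reload value
}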

void iwdg_refresh(void) {
    IWDG->KR = 0xAAAA;  // Reset countdown
}

// Main loop must call iwdg_refresh() within timeout period
// If software hangs, watchdog resets the MCU
int main(void) {
    SystemInit();
    iwdg_init(2000);  // 2-second timeout

    while (1) {
        read_sensors();
        run_control();
        drive_actuators();

        iwdg_refresh();  // Feed the watchdog
    }
}
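
The fixed /64 prescaler above caps the timeout at about 8 seconds with a nominal 32 kHz LSI. A more general approach is to compute the prescaler and reload pair from the requested timeout. The sketch below assumes the usual STM32 timeout model, t = 4 × 2^PR × (RLR + 1) / f_LSI, with a 12-bit reload register. The function name is hypothetical, and the formula should be checked against the reference manual for the specific part.

```c
#include <stdbool.h>
#include <stdint.h>

// Pick the smallest IWDG prescaler whose 12-bit reload can cover timeout_ms.
// Timeout model: t = (4 << pr) * (rlr + 1) / lsi_hz, with pr in 0..6.
// Returns false if the timeout is longer than the hardware can count.
bool iwdg_compute(uint32_t timeout_ms, uint32_t lsi_hz,
                  uint8_t *pr_out, uint16_t *rlr_out) {
    for (uint8_t pr = 0; pr <= 6; pr++) {
        uint32_t divider = 4U << pr;   // 4, 8, ..., 256
        uint64_t ticks = ((uint64_t)timeout_ms * lsi_hz) / (1000ULL * divider);
        if (ticks == 0) {
            ticks = 1;                 // count at least one tick
        }
        if (ticks <= 4096) {           // rlr = ticks - 1 must fit in 12 bits
            *pr_out  = pr;
            *rlr_out = (uint16_t)(ticks - 1);
            return true;
        }
    }
    return false;
}
```

For a 2-second timeout at 32 kHz this selects PR = 2 (divide by 16) and RLR = 3999. Because the LSI frequency varies noticeably between parts and over temperature, it is worth leaving margin between the loop period and the computed timeout.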

Sensor Redundancy

Redundancy Strategies:
  • Dual Modular Redundancy (DMR): Two identical sensors. Detect disagreement but cannot determine which is faulty
  • Triple Modular Redundancy (TMR): Three sensors with majority voting. Tolerates one sensor failure
  • Dissimilar Redundancy: Different sensor technologies measuring the same quantity (e.g., barometer + GPS for altitude). Protects against common-mode failures that would affect identical sensors in the same way
  • Analytical Redundancy: Use mathematical models to estimate expected sensor values. Detect faults by comparing model predictions vs actual readings
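
For analog readings, majority voting in a TMR arrangement is often implemented as a median-of-three voter with a tolerance band. The sketch below is one such voter. The struct layout, the function names, and the choice of the median as the output are assumptions for illustration.

```c
#include <math.h>
#include <stdbool.h>

typedef struct {
    float value;    // voted output (median of the three readings)
    bool  valid;    // false when both other readings disagree with the median
    int   suspect;  // index 0..2 of a single outlier, or -1 if none
} VoteResult;

static float median3(float a, float b, float c) {
    if (a > b) { float t = a; a = b; b = t; }
    if (b > c) { float t = b; b = c; c = t; }
    if (a > b) { float t = a; a = b; b = t; }
    return b;
}

// Vote across three redundant readings; tolerance is the largest
// difference from the median that still counts as agreement.
VoteResult tmr_vote(const float readings[3], float tolerance) {
    VoteResult r = { median3(readings[0], readings[1], readings[2]), true, -1 };
    int outliers = 0;
    for (int i = 0; i < 3; i++) {
        if (fabsf(readings[i] - r.value) > tolerance) {
            r.suspect = i;
            outliers++;
        }
    }
    if (outliers > 1) {   // no two sensors agree: no majority
        r.valid = false;
        r.suspect = -1;
    }
    return r;
}
```

With readings of 10.0, 10.1, and 50.0 and a tolerance of 0.5, the voter returns 10.1 and flags index 2 as the suspect channel. If all three readings diverge, valid is false and the caller should fall back to a safe state.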

Conclusion & Next Steps

Advanced embedded systems go beyond basic sensor reading and actuator driving. Sensor fusion extracts maximum accuracy from imperfect sensors, TinyML enables on-device intelligence, power optimization extends battery life from days to years, and fault tolerance ensures systems operate safely under adverse conditions.

Key Takeaways:
  • Complementary filters are simple and effective for IMU fusion (2 lines of code)
  • Kalman filters provide optimal estimation for linear Gaussian systems
  • TinyML enables ML inference on MCUs with as little as 16 KB RAM
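  • Stop mode + duty-cycling cuts MCU sleep current to a few µA; average current is then set by the active loads (about 212 µA in the example budget, dominated by the radio)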
  • Watchdog timers are essential for production embedded systems reliability

In Part 10, we cover System Design & Architecture — PCB design for sensor systems, software architecture patterns, testing methodologies, and debugging tools.