Sensor Fusion
Sensor fusion combines data from multiple sensors to produce a more accurate, reliable, or complete measurement than any single sensor can provide. The classic example is IMU fusion: accelerometers provide accurate orientation under static conditions but are noisy with vibration, while gyroscopes provide smooth angular rate but drift over time.
flowchart LR
ACC["Accelerometer<br/>Accurate but noisy"] --> FUSE
GYRO["Gyroscope<br/>Smooth but drifts"] --> FUSE
MAG["Magnetometer<br/>Heading reference"] --> FUSE
FUSE["Fusion<br/>Algorithm"] --> OUT["Fused Estimate<br/>Accurate + Stable"]
style FUSE fill:#3B9797,stroke:#3B9797,color:#fff
style OUT fill:#132440,stroke:#132440,color:#fff
style ACC fill:#e8f4f4,stroke:#3B9797,color:#132440
style GYRO fill:#f0f4f8,stroke:#16476A,color:#132440
style MAG fill:#e8f4f4,stroke:#3B9797,color:#132440
Complementary Filter
The simplest fusion algorithm. It low-pass filters the accelerometer angle and high-pass filters the integrated gyroscope angle, then blends the two with a tuning parameter $\alpha$ (typically 0.96–0.98):
$$\theta_{fused} = \alpha \cdot (\theta_{prev} + \omega_{gyro} \cdot \Delta t) + (1 - \alpha) \cdot \theta_{accel}$$
// Complementary filter for pitch/roll from IMU
#include <math.h>

#define ALPHA 0.98f
#define RAD_TO_DEG 57.2957795f

typedef struct {
    float pitch; // degrees
    float roll;  // degrees
} Orientation;

static Orientation orient = {0.0f, 0.0f};

// ax, ay, az: acceleration (any consistent unit)
// gx, gy, gz: angular rate in deg/s; dt: seconds since the previous call
Orientation complementary_filter(float ax, float ay, float az,
                                 float gx, float gy, float gz,
                                 float dt) {
    (void)gz; // Yaw rate is unused: gravity provides no heading reference

    // Gyroscope integration (high-pass: captures fast changes)
    orient.pitch += gx * dt;
    orient.roll += gy * dt;

    // Accelerometer angle (low-pass: captures gravity direction)
    float accel_pitch = atan2f(ay, sqrtf(ax * ax + az * az)) * RAD_TO_DEG;
    float accel_roll = atan2f(-ax, az) * RAD_TO_DEG;

    // Fuse: trust gyro for fast changes, accel for steady-state
    orient.pitch = ALPHA * orient.pitch + (1.0f - ALPHA) * accel_pitch;
    orient.roll = ALPHA * orient.roll + (1.0f - ALPHA) * accel_roll;
    return orient;
}
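One way to choose $\alpha$, offered here as a rule of thumb rather than something the filter requires, is to relate it to a crossover time constant $\tau$: for a fixed sample period $\Delta t$, $\alpha = \tau / (\tau + \Delta t)$. The helper names below are hypothetical and only illustrate the arithmetic.

```c
// Sketch: relate the complementary-filter weight to a time constant.
// alpha_from_tau() and tau_from_alpha() are illustrative names, not library calls.

// alpha = tau / (tau + dt), with tau and dt in seconds
float alpha_from_tau(float tau_s, float dt_s) {
    return tau_s / (tau_s + dt_s);
}

// Inverse relation: tau = alpha * dt / (1 - alpha)
float tau_from_alpha(float alpha, float dt_s) {
    return alpha * dt_s / (1.0f - alpha);
}
```

With $\alpha = 0.98$ and a 100 Hz loop ($\Delta t = 0.01$ s), the time constant works out to about 0.49 s: motion faster than that is taken mostly from the gyroscope, slower trends mostly from the accelerometer.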
Kalman Filter
The Kalman filter is the optimal estimator for linear systems with Gaussian noise. It operates in two phases: predict (project state forward using the model) and update (correct with new measurement).
$$\hat{x}_{k|k-1} = F \hat{x}_{k-1|k-1} + B u_k \quad \text{(Predict)}$$
$$P_{k|k-1} = F P_{k-1|k-1} F^T + Q$$
$$K_k = P_{k|k-1} H^T (H P_{k|k-1} H^T + R)^{-1} \quad \text{(Kalman Gain)}$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k (z_k - H \hat{x}_{k|k-1}) \quad \text{(Update)}$$
$$P_{k|k} = (I - K_k H) P_{k|k-1}$$
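For a single scalar state that is assumed constant between samples ($F = 1$, $B = 0$, $H = 1$, with scalar variances $Q = q$ and $R = r$), the matrix equations collapse to four scalar steps. This is a worked specialization under those assumptions, not a general result:

```latex
\begin{aligned}
p_{k|k-1} &= p_{k-1|k-1} + q \\
K_k &= \frac{p_{k|k-1}}{p_{k|k-1} + r} \\
\hat{x}_{k|k} &= \hat{x}_{k-1|k-1} + K_k \,(z_k - \hat{x}_{k-1|k-1}) \\
p_{k|k} &= (1 - K_k)\, p_{k|k-1}
\end{aligned}
```

Because $F = 1$ and $B = 0$, the predicted state equals the previous estimate, so only the covariance changes in the predict step. A larger $r$ relative to $q$ pushes $K_k$ toward 0 (heavier smoothing); a smaller $r$ pushes it toward 1 (faster tracking).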
// 1D Kalman filter for sensor smoothing
typedef struct {
    float x; // State estimate
    float p; // Estimate covariance
    float q; // Process noise
    float r; // Measurement noise
    float k; // Kalman gain
} Kalman1D;

void kalman_init(Kalman1D *kf, float initial, float p, float q, float r) {
    kf->x = initial;
    kf->p = p;
    kf->q = q;
    kf->r = r;
    kf->k = 0.0f; // Gain is computed on the first update
}

float kalman_update(Kalman1D *kf, float measurement) {
    // Predict: state is modeled as constant, so only the covariance grows
    kf->p += kf->q;

    // Update: blend prediction and measurement by the Kalman gain
    kf->k = kf->p / (kf->p + kf->r);
    kf->x = kf->x + kf->k * (measurement - kf->x);
    kf->p = (1.0f - kf->k) * kf->p;
    return kf->x;
}
Madgwick / Mahony Filters
- Madgwick: Gradient descent optimization. Single tuning parameter (beta). Compensates for magnetic distortion. Roughly 40–100 µs per update on a Cortex-M4
- Mahony: Complementary filter with PI correction. Two tuning parameters (Kp, Ki). Lighter than Madgwick. Popular in flight controllers
- Extended Kalman Filter (EKF): Linearizes a nonlinear system around the current estimate. Typically more accurate than the fixed-gain filters above, but computationally heavy. Used in high-end INS/GPS fusion
- Unscented Kalman Filter (UKF): Uses sigma points instead of linearization. Better for highly nonlinear systems
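To make "complementary filter with PI correction" concrete, here is a single-axis sketch of the idea behind Mahony's approach. It is a deliberately simplified illustration, not the full quaternion-based filter; the `PiFilter` type and its function names are hypothetical. The proportional term pulls the estimate toward the accelerometer angle, while the integral term gradually learns the gyroscope bias.

```c
// Single-axis sketch of a complementary filter with PI correction.
// Simplified illustration only; the real Mahony filter works on quaternions.
typedef struct {
    float angle;    // Fused angle estimate, degrees
    float integral; // Integral term; settles at minus the gyro bias, deg/s
    float kp;       // Proportional gain, 1/s
    float ki;       // Integral gain, 1/s^2
} PiFilter;

void pi_filter_init(PiFilter *f, float kp, float ki) {
    f->angle = 0.0f;
    f->integral = 0.0f;
    f->kp = kp;
    f->ki = ki;
}

// gyro_rate in deg/s, accel_angle in degrees, dt in seconds
float pi_filter_update(PiFilter *f, float gyro_rate, float accel_angle, float dt) {
    float error = accel_angle - f->angle;  // Disagreement with the gravity reference
    f->integral += f->ki * error * dt;     // Slowly accumulates a bias correction
    float corrected = gyro_rate + f->kp * error + f->integral;
    f->angle += corrected * dt;            // Integrate the corrected rate
    return f->angle;
}
```

With a stationary sensor and a constant gyro bias, the integral term converges to the negative of the bias and the angle estimate returns to the accelerometer value, which a plain complementary filter with fixed $\alpha$ cannot do exactly.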
TinyML
TensorFlow Lite for Microcontrollers
TinyML brings machine learning inference to microcontrollers with as little as 16 KB of RAM. Common applications include keyword spotting, gesture recognition, anomaly detection, and predictive maintenance — all running locally without cloud connectivity.
TinyML Platform Comparison
| Framework | Min RAM | MCU Support | Features |
|---|---|---|---|
| TF Lite Micro | 16 KB | Cortex-M0+ to M7 | Quantized models, CMSIS-NN acceleration |
| Edge Impulse | Varies | Arduino, STM32, Nordic | End-to-end pipeline, data collection |
| STM32Cube.AI | Varies | STM32 only | Direct TF/ONNX import, optimized for STM32 |
| microTVM (Apache TVM) | Varies | Multiple | Compiler-based optimization |
ML Workflow for Embedded
- Collect: Gather sensor data from target hardware (accelerometer, microphone, etc.)
- Train: Train model in Python (TensorFlow/Keras) on PC or cloud
- Quantize: Convert float32 weights to int8 (4x size reduction, 2-4x speed improvement)
- Convert: Export as .tflite FlatBuffer model file
- Deploy: Include as C array in firmware, run inference via TF Lite Micro interpreter
- Validate: Compare on-device accuracy vs PC baseline
flowchart LR
subgraph PC["PC / Cloud"]
A["Collect<br/>Training Data"] --> B["Train<br/>Model"]
B --> C["Quantize<br/>INT8/Float16"]
C --> D["Convert<br/>TF Lite Micro"]
end
subgraph MCU["Microcontroller"]
E["Deploy<br/>C Array in Flash"]
F["Validate<br/>On-device Accuracy"]
end
D --> E --> F
F -.->|"Accuracy OK?"| G{"Pass?"}
G -->|"Yes"| H["Ship"]
G -->|"No"| A
style A fill:#3B9797,stroke:#3B9797,color:#fff
style H fill:#132440,stroke:#132440,color:#fff
style G fill:#fff5f5,stroke:#BF092F,color:#132440
Power Optimization
Sleep Modes
STM32 Low-Power Modes
| Mode | Current | Wake Sources | Resume Time |
|---|---|---|---|
| Run | 10–100 mA | N/A | N/A |
| Sleep | 1–10 mA | Any interrupt | <1 µs |
| Stop | 2–100 µA | EXTI, RTC, LPUART | 5–50 µs |
| Standby | 0.3–3 µA | WKUP pin, RTC | 50 µs (reset) |
| Shutdown | 20–100 nA | WKUP pin only | Full reboot |
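// STM32 Stop Mode with RTC wakeup every 5 seconds
// Assumes the RTC clock source is already running, and that the RTC wakeup
// EXTI line and RTC_WKUP interrupt are enabled elsewhere so the event can
// bring the core out of Stop mode
#include "stm32l4xx.h"

void enter_stop_mode(uint32_t seconds) {
    // Configure RTC wakeup timer
    RTC->WPR = 0xCA;
    RTC->WPR = 0x53;                       // Unlock RTC write protection
    RTC->CR &= ~RTC_CR_WUTE;               // Disable wakeup timer
    while (!(RTC->ISR & RTC_ISR_WUTWF));   // Wait until the timer is writable
    RTC->WUTR = seconds - 1;               // Wakeup after N seconds (N >= 1)
    RTC->CR = (RTC->CR & ~RTC_CR_WUCKSEL) | (4U << 0); // WUCKSEL = 100: 1 Hz clock
    RTC->ISR &= ~RTC_ISR_WUTF;             // Clear any stale wakeup flag
    RTC->CR |= RTC_CR_WUTIE | RTC_CR_WUTE; // Enable wakeup interrupt and timer
    RTC->WPR = 0xFF;                       // Re-lock RTC write protection

    // Configure STOP2 mode
    SCB->SCR |= SCB_SCR_SLEEPDEEP_Msk;
    PWR->CR1 = (PWR->CR1 & ~PWR_CR1_LPMS) | PWR_CR1_LPMS_STOP2;
    __WFI(); // Wait for interrupt: enters STOP2

    // Wakes up here: leave deep sleep and reconfigure clocks
    SCB->SCR &= ~SCB_SCR_SLEEPDEEP_Msk;
    SystemClock_Config();
}

// Usage: duty-cycle sensor reading
int main(void) {
    SystemInit();
    sensor_init();
    while (1) {
        float temp = read_temperature();
        transmit_data(temp);
        enter_stop_mode(5); // Sleep 5 seconds at about 2 µA
    }
}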
Power Budgeting
- MCU (STM32L4, Stop2): 2 µA × 99% = 1.98 µA average
- MCU (Run, 10ms every 5s): 5 mA × 0.2% = 10 µA average
- Sensor (BME280 forced mode): 0.1 µA sleep + 350 µA × 0.01% = 0.14 µA average
- Radio (LoRa TX, 100ms every 60s): 120 mA × 0.17% = 200 µA average
- Total: ~212 µA average
- Battery life (2000 mAh CR123A): 2000 mAh / 0.212 mA = ~9,400 hours = ~13 months
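The same budget can be kept as data and recomputed whenever a duty cycle changes. The sketch below is one way to do that; the `Load` struct and function names are hypothetical, and the figures are simply those from the list above. It assumes an ideal battery, ignoring self-discharge and regulator losses.

```c
#include <stddef.h>

// One row of the budget: a current draw and the fraction of time it applies.
typedef struct {
    const char *name;
    double current_uA; // Current while in this state, in microamps
    double duty;       // Fraction of time in this state, 0.0 to 1.0
} Load;

// Average current in microamps: sum of current * duty over all rows
double average_current_uA(const Load *loads, size_t count) {
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += loads[i].current_uA * loads[i].duty;
    }
    return total;
}

// Ideal battery life in hours for a capacity in mAh and a load in microamps
double battery_life_hours(double capacity_mAh, double avg_uA) {
    return capacity_mAh * 1000.0 / avg_uA;
}

// The budget from the list above, entered as data
static const Load budget[] = {
    {"MCU Stop2",        2.0,      0.99},
    {"MCU Run",          5000.0,   0.002},
    {"BME280 sleep",     0.1,      1.0},
    {"BME280 measuring", 350.0,    0.0001},
    {"LoRa TX",          120000.0, 0.1 / 60.0},
};
```

Calling `average_current_uA(budget, 5)` gives about 212 µA, and `battery_life_hours(2000.0, ...)` on that result gives roughly 9,400 hours, matching the hand calculation. The radio row accounts for about 94% of the total, so transmitting less often is the most effective lever.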
Fault Tolerance
Watchdog Timers
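// Independent Watchdog (IWDG) on STM32
#include "stm32f4xx.h"

void iwdg_init(uint32_t timeout_ms) {
    // At ~500 Hz, one tick is 2 ms; RLR is 12 bits, so cap at 0xFFF (~8.2 s)
    uint32_t reload = timeout_ms / 2U;
    if (reload > 0x0FFFU) {
        reload = 0x0FFFU;
    }
    IWDG->KR = 0xCCCC;        // Start watchdog (also enables the LSI)
    IWDG->KR = 0x5555;        // Enable register access
    IWDG->PR = 4;             // Prescaler /64 -> ~500 Hz from the ~32 kHz LSI
    IWDG->RLR = reload;       // Reload value
    while (IWDG->SR != 0) { } // Wait for prescaler/reload updates to finish
    IWDG->KR = 0xAAAA;        // Load the new reload value into the counter
}

void iwdg_refresh(void) {
    IWDG->KR = 0xAAAA; // Reset countdown
}

// Main loop must call iwdg_refresh() within timeout period
// If software hangs, watchdog resets the MCU
int main(void) {
    SystemInit();
    iwdg_init(2000); // 2-second timeout
    while (1) {
        read_sensors();
        run_control();
        drive_actuators();
        iwdg_refresh(); // Feed the watchdog
    }
}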
Sensor Redundancy
- Dual Modular Redundancy (DMR): Two identical sensors. Detect disagreement but cannot determine which is faulty
- Triple Modular Redundancy (TMR): Three sensors with majority voting. Tolerates one sensor failure
- Dissimilar Redundancy: Different sensor technologies measuring the same quantity (e.g., accelerometer + gyro for orientation). Protects against common-mode failures that would affect identical sensors in the same way
- Analytical Redundancy: Use mathematical models to estimate expected sensor values. Detect faults by comparing model predictions vs actual readings
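For analog readings, TMR majority voting is often implemented as a median-of-three, since a single outlier can never be the median. The sketch below shows that voter alongside a DMR disagreement check. The function names and the `tolerance` threshold are illustrative assumptions; a real system would also log which sensor disagreed.

```c
#include <math.h>
#include <stdbool.h>

// Median of three readings: a TMR vote for analog sensors.
// One faulty reading is masked because it cannot be the middle value.
float tmr_vote(float a, float b, float c) {
    if (a > b) { float t = a; a = b; b = t; } // Now a <= b
    if (b > c) { float t = b; b = c; c = t; } // Now c is the largest
    if (a > b) { float t = a; a = b; b = t; } // Now a <= b <= c
    return b;
}

// DMR check: two sensors can flag a disagreement but not identify the culprit.
// 'tolerance' is an application-specific threshold (an assumption here).
bool dmr_disagree(float a, float b, float tolerance) {
    return fabsf(a - b) > tolerance;
}
```

For example, `tmr_vote(25.1f, 99.0f, 25.0f)` returns 25.1, discarding the stuck-high reading, whereas `dmr_disagree(25.0f, 30.0f, 0.5f)` can only report that the pair no longer agrees.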
Conclusion & Next Steps
Advanced embedded systems go beyond basic sensor reading and actuator driving. Sensor fusion extracts maximum accuracy from imperfect sensors, TinyML enables on-device intelligence, power optimization extends battery life from days to years, and fault tolerance ensures systems operate safely under adverse conditions.
- Complementary filters are simple and effective for IMU fusion (the fusion step itself is two lines of code)
- Kalman filters provide optimal estimation for linear Gaussian systems
- TinyML enables ML inference on MCUs with as little as 16 KB RAM
- Stop mode + duty-cycling cuts the MCU's average current to roughly 12 µA in the example budget, leaving the radio as the dominant consumer
- Watchdog timers are essential for production embedded systems reliability
In Part 10, we cover System Design & Architecture — PCB design for sensor systems, software architecture patterns, testing methodologies, and debugging tools.