System Overview
| Parameter | Specification | Component |
|---|---|---|
| MCU | ARM Cortex-M7 @ 480 MHz | STM32H743 |
| Camera | 5MP, QVGA for inference | OV5640 (DCMI) |
| RAM | 1MB internal + 8MB SDRAM | IS42S16400J |
| Flash | 2MB internal + 64MB QSPI | W25Q512 |
| Display | 2.4" TFT (320×240) | ILI9341 (SPI) |
| Connectivity | WiFi + BLE | ESP32-C3 (UART) |
| Model | MobileNet v2 (96×96 INT8) | TFLite Micro |
| Inference | <100 ms per frame | CMSIS-NN accelerated |
```mermaid
flowchart LR
    A["OV5640<br/>Camera"] -->|DCMI| B["DMA<br/>Transfer"]
    B --> C["Frame Buffer<br/>SDRAM"]
    C --> D["Resize &<br/>Preprocess"]
    D --> E["TFLite Micro<br/>Inference"]
    E --> F["Post-process<br/>NMS"]
    F --> G["Display<br/>ILI9341"]
    F --> H["WiFi/BLE<br/>ESP32-C3"]
    style E fill:#3B9797,color:#fff
    style A fill:#16476A,color:#fff
```
Image Pipeline
```c
/* DCMI configuration for OV5640 → DMA → frame buffer.
   Captures QVGA (320x240) RGB565 frames. */
#include <stdint.h>

#define FRAME_WIDTH   320
#define FRAME_HEIGHT  240
#define BYTES_PER_PX  2   /* RGB565 */
#define FRAME_SIZE    (FRAME_WIDTH * FRAME_HEIGHT * BYTES_PER_PX)

/* Frame buffers in external SDRAM (double-buffered). The .sdram section
   is assumed to be mapped at the SDRAM base (0xC0000000) in the linker
   script. */
__attribute__((section(".sdram"))) uint16_t frame_buf_0[FRAME_WIDTH * FRAME_HEIGHT]; /* 150 KB */
__attribute__((section(".sdram"))) uint16_t frame_buf_1[FRAME_WIDTH * FRAME_HEIGHT]; /* 150 KB */

/* Inference buffer: 96x96 grayscale (INT8) */
int8_t inference_buf[96 * 96]; /* 9 KB in DTCM */

/* Downsample (nearest-neighbor) and convert RGB565 → grayscale INT8 */
void preprocess_frame(const uint16_t *src, int8_t *dst,
                      int src_w, int src_h,
                      int dst_w, int dst_h) {
    float x_ratio = (float)src_w / dst_w;
    float y_ratio = (float)src_h / dst_h;

    for (int y = 0; y < dst_h; y++) {
        for (int x = 0; x < dst_w; x++) {
            int sx = (int)(x * x_ratio);
            int sy = (int)(y * y_ratio);
            uint16_t pixel = src[sy * src_w + sx];

            /* RGB565 → grayscale: 0.299R + 0.587G + 0.114B,
               approximated as (77R + 150G + 29B) / 256 */
            uint8_t r = (pixel >> 11) & 0x1F;
            uint8_t g = (pixel >> 5) & 0x3F;
            uint8_t b = pixel & 0x1F;

            /* Scale each channel to 8-bit, then compute luminance */
            uint8_t gray = (uint8_t)(
                (r * 255 / 31) * 77 / 256 +
                (g * 255 / 63) * 150 / 256 +
                (b * 255 / 31) * 29 / 256
            );

            /* Quantize to INT8: [-128, 127] */
            dst[y * dst_w + x] = (int8_t)(gray - 128);
        }
    }
}
```
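Capture, preprocessing, and inference can be tied together with a simple double-buffered loop. The sketch below assumes the STM32 HAL DCMI driver; the `hdcmi` handle (normally generated by CubeMX) and the `run_inference()` hook are placeholder names, and the explicit cache invalidate is needed because the DMA writes SDRAM behind the Cortex-M7 data cache.

```c
/* Capture loop sketch (assumes STM32 HAL; hdcmi and run_inference()
   are placeholders). Uses the buffers and macros defined above. */
#include "stm32h7xx_hal.h"

extern DCMI_HandleTypeDef hdcmi;                /* CubeMX-generated handle */
extern void run_inference(const int8_t *input); /* hypothetical next stage */

static volatile uint8_t frame_ready = 0;

/* HAL weak callback: fires once per completed frame */
void HAL_DCMI_FrameEventCallback(DCMI_HandleTypeDef *h) {
    (void)h;
    frame_ready = 1;
}

void camera_loop(void) {
    uint16_t *buf = frame_buf_0;
    HAL_DCMI_Start_DMA(&hdcmi, DCMI_MODE_SNAPSHOT,
                       (uint32_t)buf, FRAME_SIZE / 4); /* length in 32-bit words */
    for (;;) {
        if (!frame_ready)
            continue;
        frame_ready = 0;

        /* Swap buffers and restart capture so frame N+1 is acquired
           while frame N is preprocessed and classified */
        uint16_t *done = buf;
        buf = (buf == frame_buf_0) ? frame_buf_1 : frame_buf_0;
        HAL_DCMI_Start_DMA(&hdcmi, DCMI_MODE_SNAPSHOT,
                           (uint32_t)buf, FRAME_SIZE / 4);

        /* DMA bypasses the D-cache: invalidate before the CPU reads SDRAM */
        SCB_InvalidateDCache_by_Addr((uint32_t *)done, FRAME_SIZE);
        preprocess_frame(done, inference_buf,
                         FRAME_WIDTH, FRAME_HEIGHT, 96, 96);
        run_inference(inference_buf);
    }
}
```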
ML Model Deployment
```python
# Deployment planning for INT8 quantization on the STM32H7 (CMSIS-NN).
# Simulated model metrics and memory budget; the actual Keras → TFLite
# conversion is sketched after this block.

# Simulated model metrics for deployment planning
model_params = {
    "architecture": "MobileNet V2 (alpha=0.35)",
    "input_shape": (96, 96, 1),
    "num_classes": 10,
    "float32_size_kb": 1420,
    "int8_size_kb": 380,
    "float32_latency_ms": 850,
    "int8_latency_ms": 92,
    "accuracy_float32": 0.945,
    "accuracy_int8": 0.938,
}

# Memory layout for the STM32H743
memory_map = {
    "ITCM (instructions)": (64, "Inference engine code"),
    "DTCM (fast data)": (128, "Model weights (hot layers)"),
    "AXI SRAM (D1)": (512, "Tensor arena"),
    "D2 SRAM (SRAM1-3)": (288, "Scratch buffers"),
    "SDRAM": (8192, "Frame buffers + display"),
    "QSPI Flash": (65536, "Full model + assets"),
}

print("Edge AI Camera — Model Deployment Plan")
print("=" * 55)
print(f"Model: {model_params['architecture']}")
print(f"Input: {model_params['input_shape']}")
print(f"Classes: {model_params['num_classes']}")

print("\nQuantization Comparison:")
print(f"  Float32: {model_params['float32_size_kb']} KB, "
      f"{model_params['float32_latency_ms']} ms, "
      f"{model_params['accuracy_float32']:.1%} accuracy")
print(f"  INT8:    {model_params['int8_size_kb']} KB, "
      f"{model_params['int8_latency_ms']} ms, "
      f"{model_params['accuracy_int8']:.1%} accuracy")
print(f"  Speedup: {model_params['float32_latency_ms'] / model_params['int8_latency_ms']:.1f}x")
print(f"  Size reduction: {1 - model_params['int8_size_kb'] / model_params['float32_size_kb']:.0%}")

print("\nMemory Map:")
total_used = 0
for region, (size_kb, usage) in memory_map.items():
    print(f"  {region:<25} {size_kb:>6} KB — {usage}")
    total_used += size_kb
print(f"  {'Total':<25} {total_used:>6} KB")
```
Memory Architecture
```mermaid
flowchart TD
    A["QSPI Flash<br/>64MB — Model Storage"] -->|XIP or Copy| B["DTCM<br/>128KB — Hot Weights"]
    A -->|DMA| C["AXI SRAM<br/>512KB — Tensor Arena"]
    D["DCMI + DMA"] --> E["SDRAM<br/>8MB — Frame Buffers"]
    E -->|Preprocess| C
    C --> F["CMSIS-NN<br/>Inference Engine"]
    F --> G["Results<br/>Bounding Boxes"]
    B --> F
    style F fill:#3B9797,color:#fff
    style A fill:#132440,color:#fff
```
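On the firmware side, the tensor arena and model are wired into the TFLite Micro interpreter roughly as in the C++ sketch below. The assumptions are stated in the comments: `g_model_data`, the `.sram_d1` linker section, and the seven-op resolver are illustrative, and the `MicroInterpreter` constructor signature varies across TFLM releases.

```cpp
// TFLite Micro setup sketch. g_model_data (the INT8 model embedded as a
// C array, e.g. via `xxd -i`) and the .sram_d1 linker section are
// assumptions. CMSIS-NN kernels are selected at build time
// (OPTIMIZED_KERNEL_DIR=cmsis_nn).
#include <cstdint>
#include <cstring>
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

extern const unsigned char g_model_data[];

// Tensor arena in the 512 KB AXI SRAM; size it empirically with
// interpreter.arena_used_bytes() plus headroom.
constexpr size_t kArenaSize = 400 * 1024;
__attribute__((section(".sram_d1"), aligned(16)))
static uint8_t tensor_arena[kArenaSize];

// Register only the ops MobileNet V2 actually needs
static tflite::MicroMutableOpResolver<7> resolver;
static tflite::MicroInterpreter *interpreter = nullptr;

bool model_init(void) {
  resolver.AddConv2D();
  resolver.AddDepthwiseConv2D();
  resolver.AddAdd();            // residual connections
  resolver.AddPad();
  resolver.AddMean();           // global average pooling
  resolver.AddFullyConnected();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter interp(
      tflite::GetModel(g_model_data), resolver, tensor_arena, kArenaSize);
  interpreter = &interp;
  return interpreter->AllocateTensors() == kTfLiteOk;
}

// Classify one 96x96 INT8 frame; returns the best class index or -1
int run_inference_class(const int8_t *input_96x96) {
  TfLiteTensor *input = interpreter->input(0);
  std::memcpy(input->data.int8, input_96x96, 96 * 96);
  if (interpreter->Invoke() != kTfLiteOk)
    return -1;

  TfLiteTensor *output = interpreter->output(0);
  int best = 0;
  for (int i = 1; i < 10; ++i)
    if (output->data.int8[i] > output->data.int8[best])
      best = i;
  return best;
}
```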
Conclusion
The edge AI camera capstone demonstrates how to build a complete vision pipeline on a Cortex-M7 MCU without a cloud connection: DCMI capture, preprocessing, INT8 inference with CMSIS-NN, and result display. The quantized MobileNet V2 model runs in under 100 ms per frame at 93.8% accuracy.
Next Capstone
In Capstone 4: Home Automation Hub, we’ll design a multi-protocol gateway combining WiFi, Zigbee, and Bluetooth Mesh into a unified smart home controller.