
Part 9: Board Bring-Up & Debugging

April 17, 2026 Wasil Zafar 45 min read

The moment of truth — power on, smoke test, verify every rail, flash firmware, and systematically debug your freshly assembled board.

Table of Contents

  1. Bring-Up Strategy
  2. Smoke Test
  3. Power Rail Verification
  4. Firmware Flashing
  5. Debug Instruments
  6. Bring-Up Log Tool
  7. Exercises
  8. Conclusion & Next Steps

Bring-Up Strategy

Analogy: Board bring-up is like a pilot’s pre-flight checklist. A pilot doesn’t just turn the key and take off. They systematically check fuel levels (power rails), test control surfaces (peripherals), verify instruments (debug tools), and do a low-speed taxi (current-limited power-on) before committing to full takeoff. Skip a step, and you risk a catastrophic failure. The same discipline applies to your PCB: visual inspection → continuity → current-limited power → rail checks → firmware → peripherals → full test.

Board bring-up is the systematic process of verifying a newly assembled PCB. Never power on a new board without a plan — a single short circuit can destroy expensive components. Follow a structured checklist from visual inspection to full functional test.

A Brief History of Hardware Debugging

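1947 Operators on Grace Hopper’s team find a literal moth trapped in relay #70 of the Harvard Mark II computer and tape it into the logbook. The story popularised the term “debugging”, although “bug” was already engineering slang for a fault (the moth is preserved at the Smithsonian)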
1980 HP 54100A, one of the first digital storage oscilloscopes, enables engineers to capture single-shot events — revolutionary for intermittent hardware bugs
2005 OpenOCD (Open On-Chip Debugger) released as open-source, democratising JTAG/SWD debugging that previously required $5,000+ commercial tools
2008 Saleae Logic launches, bringing protocol-aware logic analysis to hobbyists at about $150, previously a $2,000+ capability
Board Bring-Up Sequence

flowchart TD
    A["Visual Inspection"] --> B["Continuity Check"]
    B --> C["Current-Limited<br/>Power-On"]
    C --> D{"Smoke?<br/>Overcurrent?"}
    D -->|Yes| E["Power OFF<br/>Investigate"]
    D -->|No| F["Measure Power Rails"]
    F --> G{"All Rails<br/>Within Spec?"}
    G -->|No| E
    G -->|Yes| H["Flash Firmware<br/>via SWD"]
    H --> I["Verify Clock &<br/>Oscillator"]
    I --> J["Test Peripherals<br/>(UART, I2C, SPI)"]
    J --> K["Full Functional<br/>Test"]

Essential Equipment

Equipment      | Budget Pick      | Pro Pick            | Purpose
---------------------------------------------------------------------------------------
Bench PSU      | Wanptek 30V/5A   | Rigol DP832         | Current-limited power
Multimeter     | UNI-T UT61E+     | Fluke 87V           | Voltage, resistance, continuity
Oscilloscope   | Rigol DS1054Z    | Siglent SDS2104X+   | Waveforms, noise, timing
Logic analyzer | Saleae Logic 8   | Saleae Logic Pro 16 | Digital protocol decode
SWD debugger   | ST-Link V2 clone | J-Link EDU          | Firmware flash & debug
Microscope     | USB microscope   | Amscope stereo      | Solder joint inspection

Smoke Test

Visual Inspection Checklist

Before applying any power, inspect every board under magnification:
  • Solder bridges between IC pins (especially QFP/QFN)
  • Missing components or tombstoned passives
  • Reversed polarity on diodes, capacitors, ICs
  • Correct component values (check markings vs BOM)
  • Clean solder joints (shiny, concave fillets)
  • No flux residue on high-impedance nodes

First Power-On Procedure

# First power-on checklist with current monitoring

# Bench power supply settings
VOLTAGE_LIMIT = 5.0    # V, match your input voltage
CURRENT_LIMIT = 0.100  # A, start very low (100 mA)

# Expected quiescent currents (no firmware running)
expected_currents = {
    "STM32F411 (sleep)": 0.002,   # ~2 mA quiescent
    "LDO (3.3V reg)":    0.005,   # ~5 mA quiescent
    "LEDs (off)":        0.000,   # 0 mA when off
}
# Overall budget; the itemised parts sum to 7 mA, the rest is headroom
TOTAL_BUDGET = 0.010              # ~10 mA quiescent

print("FIRST POWER-ON PROCEDURE")
print("=" * 50)
print(f"Step 1: Set bench PSU to {VOLTAGE_LIMIT}V, {CURRENT_LIMIT*1000:.0f}mA limit")
print("Step 2: Connect DMM in series to monitor current")
print("Step 3: Power ON and observe current immediately")
print()

for component, current in expected_currents.items():
    print(f"  {component:>25}: {current*1000:>6.1f} mA")
print(f"  {'Total budget':>25}: {TOTAL_BUDGET*1000:>6.1f} mA")

print(f"\nStep 4: If the PSU hits its {CURRENT_LIMIT*1000:.0f}mA limit → POWER OFF immediately")
print(f"Step 5: If current is stable at ~{TOTAL_BUDGET*1000:.0f}mA → proceed to rail checks")
print("Step 6: Touch IC packages; nothing should be hot")
print("\n⚠ If any component is hot, power off and inspect!")
Output
FIRST POWER-ON PROCEDURE
==================================================
Step 1: Set bench PSU to 5.0V, 100mA limit
Step 2: Connect DMM in series to monitor current
Step 3: Power ON and observe current immediately

          STM32F411 (sleep):    2.0 mA
             LDO (3.3V reg):    5.0 mA
                 LEDs (off):    0.0 mA
               Total budget:   10.0 mA

Step 4: If the PSU hits its 100mA limit → POWER OFF immediately
Step 5: If current is stable at ~10mA → proceed to rail checks
Step 6: Touch IC packages; nothing should be hot

⚠ If any component is hot, power off and inspect!
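
The go/no-go decision in Steps 4 and 5 can be sketched as a small helper. This is only an illustration: the function name `classify_first_power_on` and the 2× margin around the expected current are assumptions, not values from the procedure above.

```python
def classify_first_power_on(measured_a, expected_a=0.010, limit_a=0.100, margin=2.0):
    """Return a verdict for the first power-on current reading.

    measured_a: current read from the series DMM, in amps
    expected_a: expected quiescent current, in amps
    limit_a:    bench supply current limit, in amps
    margin:     hypothetical tolerance factor around the expected current
    """
    if measured_a >= limit_a:
        return "POWER OFF: supply is at its current limit, suspect a short"
    if measured_a > expected_a * margin:
        return "INVESTIGATE: current is higher than expected"
    if measured_a < expected_a / margin:
        return "INVESTIGATE: current is lower than expected, check that the rails are up"
    return "OK: proceed to rail checks"


# Example readings in amps (illustrative values only)
for reading in (0.100, 0.045, 0.001, 0.011):
    print(f"{reading * 1000:6.1f} mA -> {classify_first_power_on(reading)}")
```

A reading well below the expected current is flagged too, since it often means a regulator never started or a part is missing.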

Power Rail Verification

Measuring Each Rail

# Power rail verification log
rails = [
    {"name": "VBUS (USB 5V)",  "expected": 5.00, "tolerance": 0.25, "test_point": "TP1"},
    {"name": "VCC_3V3",        "expected": 3.30, "tolerance": 0.10, "test_point": "TP2"},
    {"name": "VCC_1V8 (core)", "expected": 1.80, "tolerance": 0.05, "test_point": "TP3"},
    {"name": "VREF_ADC",       "expected": 3.30, "tolerance": 0.03, "test_point": "TP4"},
    {"name": "VBAT (backup)",  "expected": 3.00, "tolerance": 0.30, "test_point": "TP5"},
]

print("Power Rail Verification")
print("=" * 75)
print(f"{'Rail':>18} | {'Expected':>9} | {'Tolerance':>10} | {'TP':>5} | {'Status':>8}")
print("-" * 75)

# Simulated measurements (replace with actual readings)
measured = [5.02, 3.28, 1.79, 3.31, 2.98]

for rail, meas in zip(rails, measured):
    exp = rail["expected"]
    tol = rail["tolerance"]
    within = abs(meas - exp) <= tol
    status = "✓ PASS" if within else "✗ FAIL"
    print(f"{rail['name']:>18} | {exp:>7.2f} V | ±{tol:>6.2f} V | {rail['test_point']:>5} | {status:>8}")
    # Always show the reading; it matters most when a rail fails
    print(f"{'':>18}   Measured: {meas:.3f} V (Δ = {abs(meas-exp)*1000:.1f} mV)")

print("\nTip: DC readings hide ripple; check each rail on a scope (AC coupling)")
Output
Power Rail Verification
===========================================================================
              Rail |  Expected |  Tolerance |    TP |   Status
---------------------------------------------------------------------------
     VBUS (USB 5V) |    5.00 V | ±  0.25 V |   TP1 |   ✓ PASS
                     Measured: 5.020 V (Δ = 20.0 mV)
           VCC_3V3 |    3.30 V | ±  0.10 V |   TP2 |   ✓ PASS
                     Measured: 3.280 V (Δ = 20.0 mV)
    VCC_1V8 (core) |    1.80 V | ±  0.05 V |   TP3 |   ✓ PASS
                     Measured: 1.790 V (Δ = 10.0 mV)
          VREF_ADC |    3.30 V | ±  0.03 V |   TP4 |   ✓ PASS
                     Measured: 3.310 V (Δ = 10.0 mV)
     VBAT (backup) |    3.00 V | ±  0.30 V |   TP5 |   ✓ PASS
                     Measured: 2.980 V (Δ = 20.0 mV)

Tip: DC readings hide ripple; check each rail on a scope (AC coupling)
Case Study
Toyota Unintended Acceleration (2009–2011) — The Debugging Investigation

Between 2009 and 2011, Toyota recalled about 9 million vehicles over unintended acceleration, which was linked in reports to 89 deaths. While Toyota blamed floor mats and sticky pedals, NASA engineers were brought in to investigate the electronic throttle control (ETC) system, in one of the most thorough embedded hardware debugging investigations in history.

The investigation process: NASA’s team spent 10 months performing: (1) board-level X-ray and visual inspection of 58 ECU boards, (2) power rail analysis under electromagnetic interference (EMI), (3) JTAG boundary scan of the Renesas V850 MCU to verify memory contents, (4) oscilloscope capture of throttle position sensor signals during simulated fault conditions, and (5) firmware source code review of 280,000 lines of C code.

What they found: While NASA couldn’t definitively prove a software bug caused the unintended acceleration, independent experts later identified: single-bit RAM corruption risk (no ECC on the V850), task stack overflow vulnerabilities, and insufficient watchdog coverage. The A/D converter reading the throttle position had no hardware redundancy — a single corrupted ADC reading could command full throttle.

Bring-up lesson: During board bring-up, always verify: (1) ADC readings under noise and EMI conditions, (2) watchdog timer fires correctly, (3) critical signals have hardware redundancy (dual ADC channels, voting logic), (4) firmware handles sensor faults gracefully. Bring-up cannot replace an architecture review, but the first board is the cheapest place to notice that a critical signal has no redundant path.

9M Vehicles Recalled · NASA Investigation · No ADC Redundancy · 10-Month Debug

Ripple & Noise Measurement

Measuring power rail ripple correctly: (1) Use a short ground lead or spring-tip ground on your scope probe — the standard ground clip adds inductance that creates false ringing. (2) Set scope to AC coupling, 20MHz bandwidth limit. (3) Measure peak-to-peak ripple, not RMS. (4) Target: <50mV p-p for 3.3V digital rails, <10mV p-p for analog/ADC reference.
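
Once a capture is exported from the scope, the comparison against those targets is simple arithmetic. The sketch below is a minimal illustration: the function names and the sample values are hypothetical, and it assumes the samples are already AC-coupled voltages in volts.

```python
# Targets from the text above, in mV peak-to-peak
RIPPLE_LIMITS_MV = {"digital_3v3": 50.0, "analog_ref": 10.0}


def ripple_pk_pk_mv(samples_v):
    """Peak-to-peak ripple in millivolts from AC-coupled voltage samples."""
    return (max(samples_v) - min(samples_v)) * 1000.0


def check_ripple(samples_v, rail_kind):
    """Return (peak-to-peak ripple in mV, True if below the target for this rail)."""
    pk_pk = ripple_pk_pk_mv(samples_v)
    return pk_pk, pk_pk < RIPPLE_LIMITS_MV[rail_kind]


# Hypothetical capture: a few AC-coupled samples in volts
capture = [0.004, -0.011, 0.009, -0.006, 0.012, -0.010]

for kind in RIPPLE_LIMITS_MV:
    pk_pk, ok = check_ripple(capture, kind)
    print(f"{kind:>12}: {pk_pk:.1f} mV p-p -> {'PASS' if ok else 'FAIL'}")
```

The same capture passes the digital target and fails the analog one, which is why the rail type has to be part of the check.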

Firmware Flashing

SWD Connection

# Verify SWD connection with OpenOCD
# Connect ST-Link V2 to board's SWD header
# SWDIO → Pin 2, SWCLK → Pin 4, GND → Pin 3, VCC → Pin 1

# Test connection (STM32F411)
openocd -f interface/stlink.cfg \
        -f target/stm32f4x.cfg \
        -c "init; targets; exit"

# Typical output (exact lines vary with adapter firmware and OpenOCD version):
# Info : STLINK V2J37S7 (API v2) VID:PID 0483:3748
# Info : Target voltage: 3.294346
# Info : stm32f4x.cpu: Cortex-M4 r0p1 processor detected
# Info : stm32f4x.cpu: target state: halted
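
If you capture OpenOCD's log to a file, the two lines that matter most for bring-up (target voltage and CPU detection) can be checked automatically. This is a rough sketch that assumes the log format shown in the example output above; the function names are hypothetical, and real logs may differ between OpenOCD versions.

```python
import re


def parse_target_voltage(openocd_log):
    """Return the target voltage (in volts) reported in the log, or None if absent."""
    match = re.search(r"Target voltage:\s*([0-9]+\.[0-9]+)", openocd_log)
    return float(match.group(1)) if match else None


def cpu_detected(openocd_log):
    """True if the log contains a 'processor detected' line."""
    return "processor detected" in openocd_log


# Sample lines copied from the example output above
sample_log = """\
Info : STLINK V2J37S7 (API v2) VID:PID 0483:3748
Info : Target voltage: 3.294346
Info : stm32f4x.cpu: Cortex-M4 r0p1 processor detected
"""

voltage = parse_target_voltage(sample_log)
rail_ok = voltage is not None and abs(voltage - 3.30) <= 0.10  # same ±0.10 V as VCC_3V3
print(f"Target voltage: {voltage} V, within tolerance: {rail_ok}")
print(f"CPU detected: {cpu_detected(sample_log)}")
```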

OpenOCD Flash & Debug

# Flash firmware via OpenOCD + GDB
# Step 1: Start OpenOCD server
openocd -f interface/stlink.cfg -f target/stm32f4x.cfg &

# Step 2: Connect with GDB
arm-none-eabi-gdb firmware.elf

# In GDB:
# (gdb) target remote :3333
# (gdb) monitor reset halt
# (gdb) load
# (gdb) monitor reset run

# Alternative: Flash directly with OpenOCD
openocd -f interface/stlink.cfg \
        -f target/stm32f4x.cfg \
        -c "program firmware.bin verify reset exit 0x08000000"

# Using STM32CubeProgrammer CLI
STM32_Programmer_CLI -c port=SWD freq=4000 \
    -w firmware.bin 0x08000000 -v -rst

Debug Instruments

Oscilloscope Techniques

Essential Measurements
Top 5 Scope Measurements for Bring-Up
  1. Power rail ripple: AC-coupled, 20MHz BW limit, short ground lead. Verify <50mV p-p.
  2. Clock signal: Check oscillator output frequency, duty cycle (target 45–55%), rise/fall times.
  3. UART waveform: Verify baud rate by measuring bit period (e.g., 115200 baud = 8.68µs/bit).
  4. I2C timing: Check SCL frequency, SDA setup/hold times, pull-up strength (rise time <300ns for 400kHz).
  5. SPI signals: Verify SCLK frequency, CPOL/CPHA mode, CS timing vs first clock edge.
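
The I2C rise-time limit in item 4 translates directly into a maximum pull-up value. The sketch below assumes a simple RC charging model with the rise time measured between 30% and 70% of VDD; the bus capacitance of 100 pF and the resistor values are illustrative assumptions, so substitute your own estimates.

```python
import math

# Time to charge from 30% to 70% of VDD, in units of R*C (about 0.8473)
RISE_FACTOR = math.log(0.7 / 0.3)


def i2c_rise_time_ns(pullup_ohm, bus_capacitance_pf):
    """Estimated SDA/SCL rise time in nanoseconds for a given pull-up and bus load."""
    return RISE_FACTOR * pullup_ohm * bus_capacitance_pf * 1e-12 * 1e9


def max_pullup_ohm(max_rise_ns, bus_capacitance_pf):
    """Largest pull-up resistance that still meets the rise-time limit."""
    return (max_rise_ns * 1e-9) / (RISE_FACTOR * bus_capacitance_pf * 1e-12)


BUS_PF = 100.0  # assumed bus capacitance, in pF
for r in (10_000, 4_700, 2_200):
    t = i2c_rise_time_ns(r, BUS_PF)
    verdict = "OK" if t < 300 else "too slow for 400kHz"
    print(f"{r:>6} ohm -> {t:6.0f} ns ({verdict})")

print(f"Max pull-up for 300 ns at {BUS_PF:.0f} pF: {max_pullup_ohm(300, BUS_PF):.0f} ohm")
```

If the measured rise time on the scope is much slower than this estimate, the bus capacitance is probably higher than assumed.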

Logic Analyzer Protocol Decode

# Logic analyzer capture analysis — UART decode example
# Using sigrok/PulseView or Saleae Logic

# UART timing calculator
baud_rates = [9600, 115200, 460800, 921600, 1000000]

print("UART Bit Timing Reference")
print("=" * 55)
print(f"{'Baud Rate':>12} | {'Bit Period':>12} | {'Byte Time':>12} | {'1KB Time':>10}")
print("-" * 55)

for baud in baud_rates:
    bit_us = 1e6 / baud               # microseconds per bit
    byte_us = 10 * bit_us             # 10 bits per byte (start + 8 data + stop)
    kb_ms = 1024 * byte_us / 1000     # ms to send 1KB
    print(f"{baud:>12,} | {bit_us:>9.2f} µs | {byte_us:>9.1f} µs | {kb_ms:>7.1f} ms")

print("\nTip: Measure bit period on scope to verify actual baud rate")
print("Tip: Use logic analyzer protocol decoder for I2C/SPI/UART")
Output
UART Bit Timing Reference
=======================================================
   Baud Rate |   Bit Period |    Byte Time |   1KB Time
-------------------------------------------------------
       9,600 |    104.17 µs |    1041.7 µs |  1066.7 ms
     115,200 |      8.68 µs |      86.8 µs |    88.9 ms
     460,800 |      2.17 µs |      21.7 µs |    22.2 ms
     921,600 |      1.09 µs |      10.9 µs |    11.1 ms
   1,000,000 |      1.00 µs |      10.0 µs |    10.2 ms

Tip: Measure bit period on scope to verify actual baud rate
Tip: Use logic analyzer protocol decoder for I2C/SPI/UART
Case Study
Mars Pathfinder Priority Inversion Bug (1997) — Debugging from 190 Million km Away

On July 4, 1997, NASA’s Mars Pathfinder lander successfully touched down on Mars. But days later, the system began randomly resetting — losing data and interrupting the Sojourner rover’s mission. The reset occurred approximately every few hours with no predictable pattern.

Remote debugging: With the lander 190 million km away, engineers had to debug entirely through telemetry logs and a replica system on Earth. Using the replica, they reproduced the fault: a classic priority inversion in VxWorks RTOS. A low-priority meteorological task held a mutex protecting the shared information bus, while a high-priority data distribution task waited for that mutex. A medium-priority communications task would then preempt the low-priority task, preventing it from releasing the mutex. The watchdog timer eventually fired, resetting the entire system.

The fix: Engineers uploaded a 6KB patch (yes, over radio to Mars) that enabled VxWorks’ priority inheritance protocol on the shared mutex. The low-priority task would temporarily inherit the high-priority task’s priority, preventing preemption. The option (SEM_INVERSION_SAFE in the VxWorks semaphore API) had existed all along; it just wasn’t enabled during initial configuration.

Bring-up lesson: During firmware bring-up, always stress-test RTOS configurations under load. Run all tasks simultaneously at high frequency and monitor for watchdog resets, stack overflows, and mutex timeouts. The Pathfinder bug could be reproduced on the ground replica once engineers knew what to look for; it had not been exercised under realistic concurrent load before launch.

190M km Debug · Priority Inversion · 6KB Patch to Mars · VxWorks RTOS

Multimeter Diagnostics

Key multimeter checks during bring-up:
  • Continuity: Verify GND connections between all ground pins. Check power rail continuity from connector to IC supply pins.
  • Resistance: With the board unpowered, measure between VCC and GND. As a rough guide, expect more than about 1kΩ; a reading of a few ohms or less indicates a short circuit. Also check pull-up/pull-down values.
  • Diode check: Verify protection diodes in correct orientation. Check ESD diode clamp voltages.
  • Voltage: DC rails with 4.5-digit accuracy. Use relative mode to measure small differences between rails.

Bring-Up Log Tool

Generate a structured bring-up checklist and test log for your board, documenting each step from visual inspection through functional testing.
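
A log like this can also be kept as a plain CSV file. The following is a minimal sketch, not the tool described above: the stage names follow the bring-up sequence from earlier in this part, while the field names, the board ID, and the example notes are hypothetical.

```python
import csv
import io
from datetime import date

# Stages taken from the bring-up sequence earlier in this part
STAGES = [
    "Visual inspection", "Continuity check", "Current-limited power-on",
    "Power rail verification", "Firmware flash", "Peripheral test",
    "Full functional test",
]
FIELDS = ["board", "date", "stage", "result", "notes"]


def new_log(board_id):
    """Create one PENDING entry per bring-up stage."""
    today = date.today().isoformat()
    return [{"board": board_id, "date": today, "stage": s,
             "result": "PENDING", "notes": ""} for s in STAGES]


def record(log, stage, result, notes=""):
    """Update the entry for a stage; raise KeyError if the stage is unknown."""
    for entry in log:
        if entry["stage"] == stage:
            entry["result"], entry["notes"] = result, notes
            return
    raise KeyError(f"unknown stage: {stage}")


def to_csv(log):
    """Serialise the log to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(log)
    return buffer.getvalue()


log = new_log("REV-A #001")  # hypothetical board ID
record(log, "Visual inspection", "PASS", "No bridges under magnification")
record(log, "Current-limited power-on", "PASS", "9.8 mA at 5.0 V")
print(to_csv(log))
```

Keeping untested stages as PENDING rather than leaving them out makes it obvious at a glance how far a given board got.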

Exercises

Exercise 1: Diagnose a Failed Bring-Up

You power on a new STM32F4 board with a current-limited supply at 5V/100mA. The supply immediately hits the 100mA current limit and the voltage drops to 2.1V. The board draws 100mA even before firmware runs.

  1. What are the three most likely causes? (Think: shorts, wrong components, assembly defects)
  2. Describe the exact measurement sequence you would use to isolate the fault (which test points, which instruments)
  3. You find the 3.3V rail measures 0.15V. What does this tell you?
  4. Using a thermal camera, you find the LDO regulator is at 95°C. The LDO is rated for 150mA and regulates 5V down to 3.3V. What is the most likely root cause?

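Hint: A 0.15V rail means the LDO is in current-limit or short-circuit protection. Check for solder bridges on the 3.3V net, especially near decoupling capacitors. Because the supply has collapsed to 2.1V at its 100mA limit, the LDO dissipates at most P = (2.1 − 0.15) × 0.1 ≈ 0.2W; compare that with the package’s thermal resistance in the datasheet to judge whether 95°C is plausible.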
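
The dissipation estimate for this exercise can be worked through in a few lines. This sketch uses the numbers from the exercise statement (supply collapsed to 2.1V at the 100mA limit, 3.3V rail at 0.15V) and assumes, as an upper bound, that the full 100mA flows through the LDO. The thermal resistance is a hypothetical placeholder; take the real figure from the regulator's datasheet.

```python
def ldo_dissipation_w(v_in, v_out, current_a):
    """Power dissipated in a linear regulator: voltage across it times current."""
    return (v_in - v_out) * current_a


def temperature_rise_c(power_w, theta_c_per_w):
    """Temperature rise above ambient for a given thermal resistance."""
    return power_w * theta_c_per_w


# Values from the exercise statement
power = ldo_dissipation_w(v_in=2.1, v_out=0.15, current_a=0.100)

# Hypothetical thermal resistance, in °C/W (placeholder, not a datasheet value)
THETA = 250.0
rise = temperature_rise_c(power, THETA)

print(f"LDO dissipation: {power * 1000:.0f} mW")
print(f"Estimated rise above ambient at {THETA:.0f} °C/W: {rise:.0f} °C")
```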

Exercise 2: UART Debug Investigation

You’ve flashed firmware that should print “Hello World” at 115200 baud on UART2 (PA2/PA3). Your USB-UART adapter shows garbage characters in the terminal.

  1. List all possible causes in order of likelihood (baud rate, pin mapping, voltage levels, clock source)
  2. You measure the TX pin with a scope and see: bit period = 9.6µs, voltage swings 0–3.3V. Is the baud rate correct? Calculate the actual baud rate.
  3. The 9.6µs bit period suggests what clock misconfiguration? (HSI vs HSE, PLL settings)
  4. Write the debug steps to verify the HSE oscillator is running correctly

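Hint: 9.6µs/bit = 104,167 baud, not 115,200, so the UART is running about 9.6% slow. That is well beyond the few percent of mismatch a UART link tolerates, and it points to the peripheral clock differing from the value the firmware used to compute the baud-rate divider. Typical causes are a fallback to the internal HSI oscillator (16MHz on the STM32F4) after the HSE crystal failed to start, PLL settings that do not match the intended system clock, or an HSE_VALUE constant that does not match the fitted crystal.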
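
The arithmetic behind this hint is worth keeping as a reusable check. A minimal sketch, with hypothetical function names:

```python
def baud_from_bit_period(bit_period_us):
    """Actual baud rate from a bit period measured on the scope, in microseconds."""
    return 1e6 / bit_period_us


def baud_error_percent(actual_baud, nominal_baud):
    """Signed error of the actual baud rate relative to the nominal rate."""
    return (actual_baud - nominal_baud) / nominal_baud * 100.0


actual = baud_from_bit_period(9.6)          # measured bit period from the exercise
error = baud_error_percent(actual, 115200)  # nominal rate the firmware intended

print(f"Actual baud: {actual:,.0f}")        # prints "Actual baud: 104,167"
print(f"Error vs 115200: {error:+.1f} %")   # prints "Error vs 115200: -9.6 %"
```

A negative error means the transmitter is slower than intended, so its clock is lower than the firmware assumed.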

Exercise 3: Power Rail Noise Analysis

Your 3.3V rail measures 3.31V DC with a multimeter (✓ PASS). But the ADC readings on PA0 (connected to a voltage divider from a potentiometer) show random ±50 LSB fluctuations on a 12-bit ADC.

  1. Convert the ±50 LSB fluctuation to millivolts (VREF = 3.3V, 12-bit = 4096 steps)
  2. Set up your oscilloscope to measure the 3.3V rail ripple. What coupling mode, bandwidth limit, and probe grounding technique should you use?
  3. You find 120mV peak-to-peak ripple at 500kHz on the 3.3V rail. What is the most likely source of this noise?
  4. Propose three hardware fixes to reduce the ADC noise to ±5 LSB

Hint: ±50 LSB = ±(50 × 3.3/4096) = ±40.3mV. The 500kHz ripple frequency is characteristic of a switching regulator’s switching frequency. Fixes: add ferrite bead + capacitor LC filter on VDDA, add a separate LDO for the ADC reference, increase decoupling (10µF + 100nF + 10nF) close to VDDA pin.
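
The conversion between ADC counts and millivolts used in this hint can be sketched as below. The function names are hypothetical, and the defaults follow the exercise (VREF = 3.3V, 12-bit, 4096 steps). Rail ripple does not map one-to-one onto ADC counts, so treat the last figure as a sense of scale only.

```python
def lsb_to_mv(lsb, vref=3.3, bits=12):
    """Convert a number of ADC counts to millivolts."""
    return lsb * vref / (2 ** bits) * 1000.0


def mv_to_lsb(mv, vref=3.3, bits=12):
    """Convert millivolts to an equivalent number of ADC counts."""
    return mv / 1000.0 * (2 ** bits) / vref


print(f"±50 LSB  = ±{lsb_to_mv(50):.1f} mV")   # observed fluctuation
print(f"±5 LSB   = ±{lsb_to_mv(5):.1f} mV")    # target after the fixes
print(f"120 mV   = {mv_to_lsb(120):.0f} LSB")  # scale of the measured ripple
```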

Conclusion & Next Steps

Board bring-up is where your design meets reality. By following a systematic approach — visual inspection, current-limited power-on, rail verification, firmware flash, and peripheral testing — you’ll catch issues early and build confidence in your hardware. Document everything in your bring-up log for future revisions.

Next in the Series

In Part 10: Embedded Firmware Integration, we’ll write production firmware using STM32 HAL, configure peripherals, set up FreeRTOS, and implement communication protocols.