Series Overview: This is Part 1 of our 17-part USB Development Mastery series. We journey from foundational concepts through professional USB firmware development — covering electrical signalling, enumeration, device classes, TinyUSB, debugging, RTOS integration, bare-metal USB, and hardware design.
- Part 1 (you are here): USB Fundamentals. USB system architecture, transfer types, host/device model, protocol stack
- Part 2: Electrical & Hardware Layer. D+/D- signalling, pull-ups, connectors, USB-C, STM32 USB peripherals
- Part 3: Protocol & Enumeration. Enumeration sequence, USB packets, descriptors, endpoint concepts
- Part 4: USB Device Classes. HID, CDC, MSC, MIDI, Audio, composite devices, vendor class
- Part 5: TinyUSB Deep Dive. Stack architecture, execution model, STM32 integration, descriptor callbacks
- Part 6: CDC Virtual COM Port. CDC class, bulk transfers, printf over USB, baud rate handling
- Part 7: HID Keyboard & Mouse. HID descriptors, report format, keyboard/mouse/gamepad implementation
- Part 8: USB Mass Storage. MSC class, SCSI commands, FATFS integration, RAM disk
- Part 9: Composite Devices. Multiple classes, IAD descriptor, CDC+HID, CDC+MSC
- Part 10: Debugging USB. Wireshark capture, protocol analyser, enumeration debugging, common failures
- Part 11: RTOS + USB Integration. FreeRTOS + TinyUSB, task priorities, thread-safe communication
- Part 12: Advanced USB Topics. Host mode, OTG, isochronous, USB audio, USB video
- Part 13: Performance & Optimisation. DMA, zero-copy buffers, throughput maximisation, latency tuning
- Part 14: Custom USB Class Drivers. Vendor class, writing descriptors, OS driver interaction
- Part 15: Bare-Metal USB. Direct register programming, writing USB stack from scratch, PHY timing
- Part 16: Security in USB. BadUSB attacks, device authentication, secure firmware, USB firewall
- Part 17: USB Hardware Design. PCB layout, differential pairs, impedance matching, EMI, USB-C PD
What USB Actually Is
Ask most embedded developers what USB is and you'll get a reasonable answer about a serial protocol used to connect devices to computers. That answer is technically correct and practically useless — because USB is not a protocol in the sense that SPI or UART are protocols. USB is a complete system, and treating it as just another serial interface is exactly why USB development is so notoriously difficult.
To build USB devices that work reliably — that enumerate correctly on every OS, that survive cable hot-plug, that pass compliance testing — you need to understand all four layers of the USB system simultaneously.
The USB System Layers
USB consists of four distinct layers, each of which must be understood and implemented correctly:
| Layer | What It Covers | Who Implements It | Visible To |
|---|---|---|---|
| Electrical | D+/D- differential signalling, pull-up resistors, VBUS detection, signal timing | Hardware (PHY) | PCB designer, hardware engineer |
| Protocol | Packets (token, data, handshake), frames, SOF (Start-of-Frame), error detection | USB controller hardware + firmware | Firmware developer (via USB stack) |
| Device Classes | Standardised behaviours (HID, CDC, MSC, Audio) that define what the OS expects | USB stack (TinyUSB, ST middleware) | Application developer |
| OS Drivers | Host-side software: Windows, Linux, macOS USB stacks, class drivers, user APIs | Operating system | Application developer, end user |
Key Insight: Most USB debugging happens at the protocol layer, but the root cause often lies elsewhere: at the electrical layer (wrong pull-up resistor value, poor impedance control) or in the descriptors (for example, a configuration descriptor whose wTotalLength does not match the bytes actually sent). Understanding all four layers lets you localise a bug to the right layer quickly instead of guessing.
What USB Replaces
USB was designed to unify a fragmented landscape of PC peripherals. Understanding what it replaced explains why it is the way it is:
| Legacy Interface | Replaced By | USB Class | Complexity Increase |
|---|---|---|---|
| RS-232 UART serial port | USB CDC | CDC-ACM | 10× (enumeration, descriptors, class protocol) |
| PS/2 keyboard/mouse | USB HID | HID | 5× (report descriptors are complex) |
| Parallel port / SCSI | USB Mass Storage | MSC | 20× (SCSI command set + USB protocol) |
| Audio jack / line-in | USB Audio | UAC 1/2 | 50× (isochronous, audio class spec) |
| Proprietary device interfaces | USB Vendor Class | Vendor-specific | Requires custom OS driver |
The complexity increase is not a design flaw — it's the price of universality. USB devices work without driver installation (for standard classes) on any OS. That guarantee comes from a deeply specified enumeration and descriptor system that must be implemented precisely.
Why USB Is Genuinely Hard
USB has a reputation for being disproportionately difficult compared to its ubiquity. This reputation is entirely justified, and understanding why it's hard helps you approach it without frustration:
The Spec Problem: The USB 2.0 specification is 650 pages. The USB 3.2 specification is over 1,000 pages. The USB HID specification adds another 100+ pages. The USB Audio class specification adds several hundred more. No single developer reads all of this — and that's exactly why so many USB implementations have subtle bugs.
- Asynchronous nature: USB communication is initiated by the host, not the device. Your firmware must be ready to respond to host requests at any time, with correct data, within tight timing windows.
- State machine complexity: Enumeration involves a precise sequence of requests that must be handled correctly. A single wrong response causes the host to give up silently.
- OS-specific behaviour: Windows, Linux, and macOS have subtly different expectations. A device that works on Linux may fail on Windows if descriptor details are wrong.
- No echo: Unlike UART where you can print debug output on the same interface you're debugging, USB debugging requires a second channel (UART, ITM, or a USB protocol analyser).
- Timing constraints: The USB controller expects responses within microseconds. Code that runs too slowly in an interrupt context causes timeout failures that look like enumeration errors.
USB Architecture Basics
Host ↔ Device Model
USB uses a strict host-controller / peripheral device model. This is fundamentally different from CAN bus or Ethernet, where any node can initiate communication. In USB:
- The host (typically a PC, Raspberry Pi, or STM32 in OTG host mode) controls all bus activity. It decides when each device gets to transmit. It initiates all transfers.
- The device (your STM32, RP2040, nRF52, etc.) is entirely reactive. It can never spontaneously send data to the host. It can only respond to host-initiated requests or pre-schedule data in a buffer that the host will poll.
Analogy: USB is like a call centre where the host is the supervisor who decides which agent (device) speaks and when. Agents cannot pick up the phone and call the supervisor — they wait to be connected. Even "interrupt transfers" (used by keyboards) are not device-initiated — the host polls the device at regular intervals; the device just has data waiting.
This asymmetry has profound implications for firmware design. Your USB device firmware must:
- Always have valid data ready when the host reads an endpoint (IN direction)
- Always have buffer space available when the host writes to an endpoint (OUT direction)
- Respond to control requests (Endpoint 0) promptly: the USB 2.0 specification allows 50 ms to complete a standard request that has no data stage, and hosts are often less patient in practice
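The reactive model above can be sketched as a tiny simulation of one IN endpoint. This is a hedged illustration, not TinyUSB code: `in_endpoint_t`, `ep_queue()` and `ep_on_in_token()` are hypothetical names, and a real controller performs the "on IN token" step in hardware. The point it shows is that the application can only arm a buffer; the data moves when the host asks for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EP_MAX_PACKET 64u  /* full-speed bulk/interrupt packet size */

/* Hypothetical model of one IN endpoint: a single packet buffer that is
 * either armed (data waiting for the host) or empty. */
typedef struct {
    uint8_t data[EP_MAX_PACKET];
    size_t  len;
    bool    armed;
} in_endpoint_t;

/* Application side: queue data for the host. Fails if the previous
 * packet has not been collected yet or the data does not fit. */
bool ep_queue(in_endpoint_t *ep, const void *src, size_t len)
{
    assert(ep != NULL && src != NULL);
    if (ep->armed || len > EP_MAX_PACKET) return false;
    memcpy(ep->data, src, len);
    ep->len = len;
    ep->armed = true;
    return true;
}

/* Bus side: models the host sending an IN token. Returns the number of
 * bytes handed to the host, or -1 to model a NAK (nothing ready). */
int ep_on_in_token(in_endpoint_t *ep, uint8_t *dst)
{
    assert(ep != NULL && dst != NULL);
    if (!ep->armed) return -1;      /* NAK: the host will ask again later */
    memcpy(dst, ep->data, ep->len);
    ep->armed = false;              /* buffer is free for the next packet */
    return (int)ep->len;
}
```

Calling `ep_queue()` never causes a transmission by itself; until the simulated host calls `ep_on_in_token()`, the data simply waits, which is the behaviour the firmware has to be designed around.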
USB Topology & Hubs
USB forms a tiered star topology rooted at the host controller. The host contains a root hub — the first hub in the chain. External hubs extend the tree, with each hub adding another tier.
USB Host Controller (root hub, Tier 1)
├── Device A (direct connection, Tier 2)
├── Hub 1 (external hub, Tier 2)
│   ├── Device B (Tier 3)
│   ├── Device C (Tier 3)
│   └── Hub 2 (nested hub, Tier 3)
│       ├── Device D (Tier 4)
│       └── Device E (Tier 4)
└── Hub 3 (external hub, Tier 2)
    └── Device F (Tier 3)
Maximum depth: 7 tiers, counting the root hub as Tier 1 (USB spec limit)
Maximum devices: 127 per host controller
Key topology constraints:
- Maximum 7 tiers (including root hub) — a device at tier 7 must still meet timing requirements
- Maximum 127 simultaneously addressed devices per host controller
- Hubs are USB devices themselves — they have their own descriptors and enumeration
- Bus bandwidth is shared — adding devices reduces available bandwidth per device
USB Speeds
Low Speed through SuperSpeed
| Speed | Bit Rate | Practical Throughput | Max Packet Size (Bulk) | Introduced | Typical STM32 |
|---|---|---|---|---|---|
| Low Speed (LS) | 1.5 Mbit/s | ~150 KB/s | N/A (no bulk) | USB 1.0 (1996) | Not typical |
| Full Speed (FS) | 12 Mbit/s | ~1.2 MB/s | 64 bytes | USB 1.0 (1996) | F0, F1, F3, G0 (USB FS device peripheral); F4, F7, H7, many L4 (OTG_FS) |
| High Speed (HS) | 480 Mbit/s | ~40–53 MB/s | 512 bytes | USB 2.0 (2000) | F4, F7, H7 (OTG_HS, usually with external ULPI PHY) |
| SuperSpeed (SS) | 5 Gbit/s | ~500 MB/s | 1024 bytes | USB 3.0 (2008) | Not available on STM32 |
| SuperSpeed+ (SS+) | 10–20 Gbit/s | >1 GB/s | 1024 bytes | USB 3.1/3.2 (2013/2017) | Not available on STM32 |
Practical note on HS: STM32F4/F7/H7 can reach High Speed, but on most parts only with an external ULPI PHY chip (e.g., USB3300); a few parts, such as the STM32F723, integrate an HS PHY. Without one, the internal FS PHY limits the port to 12 Mbit/s. For most embedded applications (CDC serial, HID, MSC below about 1 MB/s), Full Speed is entirely adequate.
Choosing the Right Speed
| Use Case | Required Speed | Reason |
|---|---|---|
| Debug serial terminal | Full Speed | Text data ≪ 1 MB/s |
| USB keyboard / mouse | Full Speed (or LS) | HID reports are tiny (8 bytes) |
| USB flash drive (FAT filesystem) | Full Speed minimum, HS preferred | FS gives ~1 MB/s, acceptable for small drives |
| High-speed data acquisition | High Speed mandatory | ADC at 1 MSPS × 2 bytes = 2 MB/s minimum |
| USB audio (96 kHz stereo 24-bit) | Full Speed | ~576 KB/s, well within FS budget |
| USB video (1080p 30fps uncompressed) | High Speed mandatory | ~187 MB/s at 24 bits per pixel exceeds even HS, so HS plus compression is required |
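The arithmetic behind this table can be written down as a small helper. The sketch below is an illustration only: the function and enum names are hypothetical, and the two ceilings are the rough "practical throughput" figures quoted in the speed table, not guaranteed numbers.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    USB_CHOICE_FULL_SPEED,
    USB_CHOICE_HIGH_SPEED,
    USB_CHOICE_NEITHER      /* exceeds HS: reduce the data rate or compress */
} usb_speed_choice_t;

/* Rough planning ceilings taken from the speed table (assumptions). */
#define FS_PRACTICAL_BYTES_PER_S  1200000u   /* ~1.2 MB/s */
#define HS_PRACTICAL_BYTES_PER_S 40000000u   /* ~40 MB/s, conservative end */

/* Raw data rate of a sampled stream in bytes per second. */
uint64_t stream_bytes_per_s(uint32_t samples_per_s, uint32_t channels,
                            uint32_t bytes_per_sample)
{
    assert(channels > 0 && bytes_per_sample > 0);
    return (uint64_t)samples_per_s * channels * bytes_per_sample;
}

/* Pick the slowest bus speed whose practical ceiling covers the stream. */
usb_speed_choice_t pick_speed(uint64_t required_bytes_per_s)
{
    if (required_bytes_per_s <= FS_PRACTICAL_BYTES_PER_S) return USB_CHOICE_FULL_SPEED;
    if (required_bytes_per_s <= HS_PRACTICAL_BYTES_PER_S) return USB_CHOICE_HIGH_SPEED;
    return USB_CHOICE_NEITHER;
}
```

For example, `stream_bytes_per_s(96000, 2, 3)` gives 576,000 bytes/s for the audio row, which `pick_speed()` maps to Full Speed, while the 1 MSPS, 2-byte ADC row gives 2,000,000 bytes/s and needs High Speed.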
Transfer Types — The Critical Concept
USB transfer types are the single most important concept in USB firmware design. The transfer type you choose for an endpoint determines the bandwidth guarantee, error handling behaviour, and latency characteristics of every data exchange between your device and the host. Getting this wrong produces firmware that works in testing but fails unpredictably under real conditions.
Control Transfers
Control transfers are mandatory — every USB device must support them on Endpoint 0. They are used exclusively for device configuration and management: enumeration requests, descriptor retrieval, class-specific configuration.
- Structure: SETUP stage → optional DATA stage → STATUS stage (three phases)
- Reliability: fully error-checked, retried on error
- Bandwidth: reserved — the host guarantees at least 10% of FS frame time for control
- Use case: enumeration plus occasional configuration or status requests. Do not use control transfers for streaming application data: they are slow and tie up Endpoint 0
/* TinyUSB: vendor-specific control requests arrive here, once per stage.
   VENDOR_REQUEST_GET_VERSION and fw_version are application-defined. */
bool tud_vendor_control_xfer_cb(uint8_t rhport, uint8_t stage,
                                tusb_control_request_t const *request)
{
    /* Only the SETUP stage needs handling; accept the DATA and ACK stages */
    if (stage != CONTROL_STAGE_SETUP) return true;

    switch (request->bRequest) {
    case VENDOR_REQUEST_GET_VERSION:
        /* Respond with firmware version string */
        return tud_control_xfer(rhport, request,
                                (void *)&fw_version, sizeof(fw_version));
    default:
        return false; /* Stall unsupported requests */
    }
}
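The SETUP stage that opens every control transfer carries a fixed 8-byte packet. Below is a minimal sketch of decoding it, assuming the standard USB 2.0 field layout (multi-byte fields are little-endian on the wire). The type and function names are illustrative and are not part of TinyUSB, which provides its own `tusb_control_request_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of the 8-byte SETUP packet. */
typedef struct {
    uint8_t  bmRequestType;  /* direction, type, recipient */
    uint8_t  bRequest;       /* request code, e.g. 6 = GET_DESCRIPTOR */
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;        /* length of the optional DATA stage */
} usb_setup_t;

enum { USB_REQ_TYPE_STANDARD = 0, USB_REQ_TYPE_CLASS = 1, USB_REQ_TYPE_VENDOR = 2 };

/* Assemble the fields byte by byte so the result does not depend on
 * the endianness or struct packing of the MCU. */
usb_setup_t usb_setup_parse(const uint8_t raw[8])
{
    assert(raw != NULL);
    usb_setup_t s;
    s.bmRequestType = raw[0];
    s.bRequest      = raw[1];
    s.wValue        = (uint16_t)(raw[2] | (raw[3] << 8));
    s.wIndex        = (uint16_t)(raw[4] | (raw[5] << 8));
    s.wLength       = (uint16_t)(raw[6] | (raw[7] << 8));
    return s;
}

/* Bit 7 of bmRequestType: 1 = device-to-host (IN data stage). */
bool usb_setup_is_in(const usb_setup_t *s)
{
    return (s->bmRequestType & 0x80u) != 0;
}

/* Bits 6..5 of bmRequestType: standard, class, or vendor request. */
unsigned usb_setup_type(const usb_setup_t *s)
{
    return (s->bmRequestType >> 5) & 0x03u;
}
```

As a usage example, the bytes `80 06 00 01 00 00 12 00` decode to a standard, device-to-host GET_DESCRIPTOR request with wValue 0x0100 (device descriptor) and wLength 18, which is the first request visible in most enumeration captures.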
Bulk Transfers
Bulk transfers are the workhorse of USB data transfer. They offer guaranteed error-free delivery (via CRC and retry) but no bandwidth guarantee — the host schedules bulk transfers in the remaining frame time after control and periodic transfers.
- Max packet size: 64 bytes (FS), 512 bytes (HS)
- Error handling: CRC16 per packet, NAK/retry, up to 3 consecutive errors before abort
- Bandwidth: best-effort — can be very fast when bus is quiet, very slow when congested
- Use cases: CDC serial data, mass storage, data loggers, oscilloscopes
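To see where the "~1.2 MB/s" Full Speed figure comes from, the best case for bulk can be estimated with a few lines of arithmetic. This is a rough sketch under stated assumptions: a 1 ms frame carries 1500 bytes at 12 Mbit/s, each bulk transaction costs 13 bytes of protocol overhead (the figure the USB 2.0 specification tabulates for full-speed bulk), and the bus is otherwise idle. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FS_FRAME_BYTES     1500u  /* 12 Mbit/s x 1 ms / 8 bits */
#define FS_BULK_OVERHEAD   13u    /* assumed per-transaction overhead in bytes */
#define FRAMES_PER_SECOND  1000u

/* Whole bulk transactions that fit into one otherwise idle FS frame. */
uint32_t fs_bulk_transactions_per_frame(uint32_t max_packet)
{
    assert(max_packet > 0 && max_packet <= 64u);
    return FS_FRAME_BYTES / (max_packet + FS_BULK_OVERHEAD);
}

/* Best-case payload throughput in bytes per second. */
uint32_t fs_bulk_max_bytes_per_s(uint32_t max_packet)
{
    return fs_bulk_transactions_per_frame(max_packet) * max_packet * FRAMES_PER_SECOND;
}
```

With 64-byte packets this gives 19 transactions per frame and 1,216,000 bytes/s. Real devices land below that because control and periodic transfers are scheduled first and because the device may NAK while its buffers are busy.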
Interrupt Transfers
Despite the name, interrupt transfers do not generate hardware interrupts on the host. The name comes from their scheduling: the host polls the endpoint at a guaranteed maximum interval — the polling interval — specified in the endpoint descriptor.
- Max packet size: 64 bytes (FS), 1024 bytes (HS)
- Polling interval: 1–255 ms (FS), 125 µs – 4096 ms (HS) — set in endpoint descriptor
- Bandwidth: reserved — the host allocates fixed bandwidth for the polling period
- Use cases: HID (keyboard, mouse, gamepad), CDC notification endpoint
/* TinyUSB HID: call this periodically from the main loop. It queues a report
   that the host collects on its next poll of the interrupt endpoint.
   btn_pressed() and REPORT_ID_KEYBOARD are application-defined. */
void hid_task(void)
{
    /* Skip until the HID interface is ready to accept a new report */
    if (!tud_hid_ready()) return;

    if (btn_pressed()) {
        /* Send keyboard report with the 'A' key pressed */
        uint8_t keycode[6] = { HID_KEY_A };
        tud_hid_keyboard_report(REPORT_ID_KEYBOARD, 0, keycode);
    } else {
        /* Send empty report to signal key release */
        tud_hid_keyboard_report(REPORT_ID_KEYBOARD, 0, NULL);
    }
}
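The polling interval is encoded in the endpoint descriptor's bInterval field, and the encoding differs between speeds. A small sketch of the conversion, assuming the USB 2.0 rules for interrupt endpoints (FS: bInterval counts 1 ms frames; HS: the period is 2^(bInterval-1) microframes of 125 µs). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Full speed: bInterval is a count of 1 ms frames (1..255). */
uint32_t fs_interrupt_period_us(uint8_t bInterval)
{
    assert(bInterval >= 1);
    return (uint32_t)bInterval * 1000u;
}

/* High speed: period = 2^(bInterval-1) microframes of 125 us (bInterval 1..16). */
uint32_t hs_interrupt_period_us(uint8_t bInterval)
{
    assert(bInterval >= 1 && bInterval <= 16);
    return (1u << (bInterval - 1)) * 125u;
}

/* Upper bound on endpoint throughput, assuming one max-size packet
 * per polling period. */
uint32_t interrupt_max_bytes_per_s(uint32_t max_packet, uint32_t period_us)
{
    assert(period_us > 0);
    return (uint32_t)(((uint64_t)max_packet * 1000000u) / period_us);
}
```

A full-speed HID endpoint with 64-byte packets and bInterval = 1 therefore tops out at 64,000 bytes/s, which is why HID suits small reports rather than bulk data.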
Isochronous Transfers
Isochronous transfers provide guaranteed bandwidth and fixed latency but no error recovery. If a packet is corrupted, it is discarded — there is no retry. This is the only correct choice for real-time streaming data where a lost packet is better than a late one.
- Max packet size: 1023 bytes (FS), 1024 bytes (HS)
- Transfer occurs every frame (1 ms FS, 125 µs HS micro-frame)
- No handshake, no retry — data is transmitted once
- Use cases: USB audio microphone/speaker, USB video (USB MIDI, by contrast, uses bulk endpoints)
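Because an isochronous endpoint sends one packet per frame, an audio device has to decide how many samples go into each 1 ms frame and whether they fit into the 1023-byte full-speed limit. The sketch below illustrates one common approach, a remainder accumulator, under the assumption of a full-speed endpoint; the helper names are hypothetical and not taken from any USB stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FS_ISO_MAX_PACKET 1023u  /* full-speed isochronous limit in bytes */

/* Samples per channel to send in the next 1 ms frame so that the long-run
 * average equals sample_rate_hz. *remainder carries the fractional part
 * (in thousandths of a sample) from frame to frame; start it at 0. */
uint32_t iso_samples_this_frame(uint32_t sample_rate_hz, uint32_t *remainder)
{
    assert(remainder != NULL);
    uint32_t total = sample_rate_hz + *remainder;
    *remainder = total % 1000u;
    return total / 1000u;
}

/* True if one frame's worth of audio fits in a single FS isochronous packet. */
bool iso_fits_fs(uint32_t samples, uint32_t channels, uint32_t bytes_per_sample)
{
    return samples * channels * bytes_per_sample <= FS_ISO_MAX_PACKET;
}
```

At 96 kHz stereo 24-bit every frame carries 96 samples per channel, or 576 bytes, which fits. At 44.1 kHz the helper yields nine frames of 44 samples followed by one of 45, and 192 kHz stereo 24-bit (1152 bytes per frame) no longer fits in a single full-speed packet.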
Transfer Type Comparison
| Transfer Type | Bandwidth Guarantee | Error Recovery | Latency | Max Packet (FS) | Use Case |
|---|---|---|---|---|---|
| Control | Reserved (≥10%) | Yes (retry) | Variable | 64 bytes | Enumeration, config |
| Bulk | None (best effort) | Yes (retry) | High | 64 bytes | CDC, MSC, data transfer |
| Interrupt | Polling interval | Yes (retry) | Bounded (≤ interval) | 64 bytes | HID, status notification |
| Isochronous | Fixed per frame | None (discard) | Fixed (1 ms FS) | 1023 bytes | Audio, video streaming |
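The comparison can be condensed into a simple decision rule. The following sketch is one possible way to encode it, not an official algorithm: the requirement flags and the function name are hypothetical, and real designs also weigh packet sizes and class specification requirements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { XFER_CONTROL, XFER_BULK, XFER_INTERRUPT, XFER_ISOCHRONOUS } xfer_type_t;

typedef struct {
    bool is_configuration;       /* setup/management traffic on Endpoint 0 */
    bool needs_bounded_latency;  /* data must arrive within a known time */
    bool tolerates_loss;         /* a dropped packet is better than a late one */
} xfer_requirements_t;

/* Map endpoint requirements to the transfer type the table suggests. */
xfer_type_t choose_transfer_type(const xfer_requirements_t *r)
{
    assert(r != NULL);
    if (r->is_configuration) return XFER_CONTROL;
    if (r->needs_bounded_latency)
        return r->tolerates_loss ? XFER_ISOCHRONOUS : XFER_INTERRUPT;
    return XFER_BULK;  /* reliable, but no timing guarantee */
}
```

A keyboard (bounded latency, no loss allowed) maps to interrupt, an audio stream (bounded latency, loss tolerated) to isochronous, and a mass-storage data pipe (neither flag set) to bulk.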
The USB Protocol Stack
Stack Layers Explained
USB firmware is always structured as a layered stack. Each layer has a clear responsibility, and understanding the boundaries between layers is what separates developers who understand USB from those who cargo-cult example code.
┌─────────────────────────────────────────────────────┐
│ APPLICATION LAYER │
│ Your firmware logic: read sensor, blink LED │
│ Calls: tud_cdc_write(), tud_hid_report(), etc. │
├─────────────────────────────────────────────────────┤
│ CLASS DRIVER LAYER │
│ CDC, HID, MSC, Audio, MIDI class implementations │
│ Implements class-specific protocol on top of USB │
├─────────────────────────────────────────────────────┤
│ USB CORE LAYER │
│ Enumeration, descriptors, Endpoint 0 handling │
│ Transfer scheduling, frame management │
├─────────────────────────────────────────────────────┤
│ CONTROLLER DRIVER (DCD) LAYER │
│ MCU-specific: STM32 OTG_FS, RP2040 USB, nRF USB │
│ Programs USB peripheral registers, handles IRQs │
├─────────────────────────────────────────────────────┤
│ HARDWARE (PHY) │
│ D+/D- signalling, NRZI encoding, CRC, bit stuffing│
└─────────────────────────────────────────────────────┘
Where TinyUSB Fits
TinyUSB is an open-source USB device (and host) stack that implements the USB Core Layer and Class Driver Layer. It provides a clean API for your application layer and ships ports for many popular embedded MCU families. This series recommends TinyUSB for new embedded projects because it separates concerns cleanly and is easier to port and maintain than ST's USB middleware.
| USB Stack Option | Pros | Cons | Recommended For |
|---|---|---|---|
| TinyUSB | Clean API, no dynamic memory, well-tested, multi-MCU | Requires manual integration with CubeMX | All new projects |
| ST USB Middleware | CubeMX integration, official support | Monolithic, complex, harder to port | Legacy projects, quick prototypes |
| libopencm3 | Lightweight, open | Limited class support | Resource-constrained M0 devices |
| Custom / bare-metal | Full control, zero overhead | Massive implementation effort, error-prone | Part 15 of this series only |
USB vs Other Protocols
Choosing USB over simpler protocols is a deliberate engineering decision. The complexity cost is real — USB adds weeks of development time. The benefits must justify that cost.
| Protocol | Max Speed | Host Drivers | Hot-Plug | Power Delivery | Complexity | Best For |
|---|---|---|---|---|---|---|
| UART / RS-232 | ~10 Mbit/s | Driver required | No | No | Very Low | Debug, simple MCU-MCU |
| USB CDC | ~1 MB/s (FS) | Built-in (all OS) | Yes | 500 mA | Medium | Replacing UART to PC |
| USB HID | ~64 KB/s (FS) | Built-in (all OS) | Yes | 100 mA | Medium | Input devices, small data |
| USB MSC | ~1 MB/s (FS) | Built-in (all OS) | Yes | 500 mA | High | File transfer, data logs |
| Ethernet / TCP/IP | 10–1000 Mbit/s | Built-in | Yes | PoE optional | Very High | Networked devices |
| Bluetooth LE | ~2 Mbit/s | Built-in | Wireless | No | High | Wireless sensors, IoT |
Use USB when:
- You need the device to appear as a standard class (keyboard, serial port, flash drive) without the user installing drivers
- You need USB power delivery (up to 100 W with USB-C PD 3.0, or up to 240 W with PD 3.1 Extended Power Range)
- You need hot-plug detection and automatic OS enumeration
- Your throughput requirement exceeds what UART can reliably deliver
Do not use USB when UART, SPI, or I2C between two MCUs on the same board would suffice. The complexity of USB is justified only when connecting to a general-purpose host (PC, SBC) or when the standard class benefits (no driver install) matter to the end user.
Exercises
Exercise 1 (Beginner): USB Device Identification
Connect 5 different USB devices to your computer (keyboard, mouse, flash drive, microphone, smartphone). For each device: (a) identify the USB speed using Device Manager (Windows) or lsusb -v (Linux), (b) identify the device class (HID, CDC, MSC, Audio, etc.), (c) note the Vendor ID and Product ID, (d) identify what transfer type the main data endpoint uses. Document your findings in a table.
Topics: USB Enumeration, Device Classes, Transfer Types
Exercise 2 (Intermediate): Capture USB Enumeration with Wireshark
Install Wireshark with USBPcap (Windows) or use the built-in USB capture (Linux: modprobe usbmon). Capture the full enumeration of a USB flash drive from plug-in to ready. Identify in the capture: (a) the GET_DESCRIPTOR request for the device descriptor, (b) the SET_ADDRESS request, (c) the GET_DESCRIPTOR for the configuration descriptor, (d) the SET_CONFIGURATION request. Note the exact sequence and the 8-byte SETUP packet format for each request.
Topics: Wireshark, Enumeration, Control Transfers
Exercise 3 (Advanced): Transfer Type Bandwidth Analysis
Using a USB protocol analyser (hardware) or Wireshark on a known USB device, capture 100 ms of USB bus traffic containing a mix of HID (interrupt) and MSC (bulk) transfers. Calculate: (a) what percentage of bus bandwidth each transfer type consumes, (b) the actual vs theoretical maximum throughput for the bulk endpoint, (c) how SOF (Start-of-Frame) packets affect available bandwidth. Explain why the measured bulk throughput is lower than the theoretical 12 Mbit/s.
Topics: Protocol Analysis, Bandwidth, Bus Scheduling
Conclusion & Next Steps
In this opening article we have built the foundational mental model that every USB developer needs:
- USB is a system — not a simple protocol. It spans electrical signalling, packet-level framing, device class behaviour, and OS driver interaction simultaneously.
- The host-device model is asymmetric and non-negotiable: the host controls all bus activity; the device is entirely reactive. This drives every firmware design decision.
- The four transfer types — Control, Bulk, Interrupt, Isochronous — each offer a different combination of bandwidth guarantee, error recovery, and latency. Choosing the wrong type produces firmware that works in the lab and fails in production.
- Full Speed (12 Mbit/s) with an internal PHY covers the vast majority of embedded USB use cases — CDC, HID, MSC, composite devices. High Speed is needed only for high-throughput applications.
- TinyUSB is the recommended USB stack for new embedded projects — clean architecture, no dynamic memory, multi-MCU, and well-maintained.
Next in the Series
In Part 2: Electrical & Hardware Layer, we go deep into the physical implementation of USB: D+/D- differential signalling, NRZI encoding and bit-stuffing, pull-up resistor value selection for speed detection, cable impedance requirements, connector types, and the STM32 USB peripheral — OTG_FS vs OTG_HS, internal PHY limitations, and endpoint hardware constraints you must know before writing a single descriptor.