
USB Part 5: TinyUSB Deep Dive

March 31, 2026 Wasil Zafar 22 min read

Master the TinyUSB stack — understand its architecture, board support layer, configuration options, initialization sequence, task loop design, all callback types, memory management, and how to port TinyUSB to a new MCU.

Table of Contents

  1. TinyUSB Architecture
  2. Board Support & Porting
  3. Initialization Sequence
  4. The Task Loop
  5. Device Callbacks
  6. CDC Callbacks
  7. HID Callbacks
  8. Memory Management
  9. Debugging TinyUSB
  10. Porting Checklist
  11. Practical Exercises
  12. TinyUSB Porting Plan Tool
  13. Conclusion & Next Steps
Series Context: This is Part 5 of our 17-part USB Development Mastery series. Part 4 surveyed every USB device class and how to choose the right one. Now we go deep into the TinyUSB stack itself — the code-level implementation detail you need to write production USB firmware.

TinyUSB Architecture

TinyUSB (github.com/hathach/tinyusb) is a fully open-source, MIT-licensed USB stack designed specifically for microcontrollers. It was created by Ha Thach and is now one of the most widely used USB stacks in embedded development: it ships with the Raspberry Pi Pico SDK, underpins Adafruit's libraries, and runs in countless production firmware projects. Its design philosophy is clear and uncompromising: no dynamic memory allocation, C99 compliance, a cooperative execution model, and hardware abstraction through a minimal porting interface.

The Layered Architecture

TinyUSB is organised into four distinct layers. Each layer communicates only with the layer immediately above and below it, making the stack highly testable and portable:

┌─────────────────────────────────────────────────────────────────┐
│  APPLICATION LAYER                                              │
│  Your firmware: app callbacks, data processing, business logic  │
│  tud_cdc_rx_cb(), tud_hid_get_report_cb(), etc.                 │
└─────────────────────────┬───────────────────────────────────────┘
                          │ Application callbacks (weak symbols)
┌─────────────────────────▼───────────────────────────────────────┐
│  CLASS DRIVER LAYER   (src/class/)                              │
│  cdc/  hid/  msc/  audio/  video/  dfu/  vendor/               │
│  Each class driver manages: endpoints, state machines, buffers  │
└─────────────────────────┬───────────────────────────────────────┘
                          │ usbd_* APIs
┌─────────────────────────▼───────────────────────────────────────┐
│  CORE USB LAYER       (src/device/)                             │
│  usbd.c: enumeration, control transfers, class dispatch         │
│  tusb.c: init, task scheduler                                   │
└─────────────────────────┬───────────────────────────────────────┘
                          │ dcd_* functions (Device Controller Driver)
┌─────────────────────────▼───────────────────────────────────────┐
│  HARDWARE ABSTRACTION (src/portable/)                           │
│  synopsys/  nxp/  microchip/  st/  raspberrypi/  ...           │
│  dcd_stm32_fsdev.c, dcd_dwc2.c, dcd_rp2040.c, etc.             │
└─────────────────────────────────────────────────────────────────┘

Key Source Directories

Directory | Contents | You Modify?
src/tusb.h | Main include file: include this and only this in application code | No
src/device/usbd.c | Core USB device stack: enumeration, EP0 handling, class dispatch | No
src/device/usbd_pvt.h | Private device API for class drivers | No
src/class/cdc/ | CDC class driver (cdc_device.c, cdc_device.h) | No (configure via tusb_config.h)
src/class/hid/ | HID class driver (hid_device.c) | No
src/class/msc/ | MSC class driver (msc_device.c) + SCSI command parsing | No
src/portable/ | Hardware DCD (Device Controller Driver) implementations per MCU | Only when porting a new MCU
hw/bsp/ | Board Support Packages for known boards | Add your board here when porting
tusb_config.h | Your configuration file: this is where all customisation lives | Yes, this is your primary config file
usb_descriptors.c | Your descriptor definitions (Device, Config, HID Report, etc.) | Yes, write your descriptors here

Design Principles That Shape Everything

Understanding TinyUSB's design principles prevents hours of confusion when you encounter unexpected behaviour:

  • No dynamic memory: TinyUSB never calls malloc() or free(). All buffers are statically allocated at compile time, sized by macros in tusb_config.h. This makes the stack's RAM footprint deterministic and visible in the linker map, which matters for safety-critical applications and MISRA compliance.
  • Cooperative execution: TinyUSB is not interrupt-driven in the traditional sense. The USB interrupt handler (DCD layer) enqueues events into a ring buffer. tud_task() dequeues and processes those events. Your application calls tud_task() repeatedly — either in a super-loop or in a dedicated RTOS task.
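  • Weak symbol callbacks: Most application callbacks (tud_mount_cb, tud_cdc_rx_cb, etc.) are declared as weak symbols. If you don't implement them, the linker falls back to an empty default. This allows incremental development: add callbacks only for the functionality you need. The descriptor callbacks (tud_descriptor_device_cb and friends) are the exception and must always be provided; depending on the TinyUSB release, so must the HID report callbacks.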
  • C99 only: No C++ features, no compiler extensions beyond those universally supported on ARM GCC. This ensures portability across LLVM/Clang, IAR, and ARMCC as well.
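
The cooperative model described in the list above can be tried out on a host PC. The following is a simplified sketch, not TinyUSB's own code: the names evt_queue_t, isr_push(), task_drain() and remember_event() are hypothetical, and a real interrupt handler would also need a critical section or atomic updates around the shared counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVT_QUEUE_DEPTH 4u   /* hypothetical depth, fixed at compile time */

typedef struct {
    uint8_t  items[EVT_QUEUE_DEPTH];  /* static storage: no malloc() */
    uint32_t head;                    /* next slot to write */
    uint32_t tail;                    /* next slot to read */
    uint32_t count;                   /* events currently queued */
} evt_queue_t;

/* "ISR side": queue an event ID. Returns false if the queue is full,
   in which case the event is lost. */
static bool isr_push(evt_queue_t *q, uint8_t evt)
{
    assert(q != NULL);
    if (q->count == EVT_QUEUE_DEPTH) return false;
    q->items[q->head] = evt;
    q->head = (q->head + 1u) % EVT_QUEUE_DEPTH;
    q->count++;
    return true;
}

/* "Task side": pass every queued event to handler in FIFO order and
   return how many were handled. Returns 0 at once if nothing is queued. */
static uint32_t task_drain(evt_queue_t *q, void (*handler)(uint8_t evt))
{
    assert(q != NULL && handler != NULL);
    uint32_t handled = 0;
    while (q->count > 0u) {
        uint8_t evt = q->items[q->tail];
        q->tail = (q->tail + 1u) % EVT_QUEUE_DEPTH;
        q->count--;
        handler(evt);
        handled++;
    }
    return handled;
}

/* Example handler: remembers the most recent event it was given */
static uint8_t last_event_seen;
static void remember_event(uint8_t evt) { last_event_seen = evt; }
```

Pushing a fifth event into the four-slot queue before the task side runs makes isr_push() return false, which is the "event dropped" situation that a slow main loop produces on real hardware.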

Board Support & Porting

Every TinyUSB project must have a tusb_config.h file. TinyUSB finds it through the compiler's include path, so the directory that contains it must be listed there. This file is the single place where you configure the entire stack: which MCU, which OS, which classes, how many instances of each class, and buffer sizes.

tusb_config.h — Complete Configuration Reference

/* tusb_config.h — complete configuration example for STM32F4 + FreeRTOS CDC+HID */

#ifndef _TUSB_CONFIG_H_
#define _TUSB_CONFIG_H_

/* ─────────────────────────────────────────────────────────────────────────
   1. MCU Selection — tells TinyUSB which portable DCD driver to compile in
   ───────────────────────────────────────────────────────────────────────── */
#define CFG_TUSB_MCU   OPT_MCU_STM32F4

/* Other common values:
   OPT_MCU_RP2040        — Raspberry Pi RP2040
   OPT_MCU_STM32F0       — STM32F0 (Cortex-M0, FSDEV peripheral)
   OPT_MCU_STM32F1       — STM32F1 (FSDEV peripheral)
   OPT_MCU_STM32F2       — STM32F2 (Synopsys OTG_FS)
   OPT_MCU_STM32F7       — STM32F7 (OTG_FS or OTG_HS)
   OPT_MCU_STM32H7       — STM32H7 (OTG_FS or OTG_HS)
   OPT_MCU_NRF5X         — Nordic nRF52840
   OPT_MCU_SAMD51        — Microchip SAMD51
   OPT_MCU_ESP32S2       — Espressif ESP32-S2
   OPT_MCU_GD32VF103     — GigaDevice GD32VF103 (RISC-V)
*/

/* ─────────────────────────────────────────────────────────────────────────
   2. OS Selection — cooperative bare-metal or RTOS
   ───────────────────────────────────────────────────────────────────────── */
#define CFG_TUSB_OS    OPT_OS_FREERTOS

/* OPT_OS_NONE      — bare-metal cooperative (tud_task() in super-loop)
   OPT_OS_FREERTOS  — FreeRTOS (tud_task() in a dedicated task)
   OPT_OS_MYNEWT    — Apache Mynewt
   OPT_OS_PICO      — Raspberry Pi Pico SDK (uses pico SDK scheduler)
*/

/* ─────────────────────────────────────────────────────────────────────────
   3. Debug Level
   ───────────────────────────────────────────────────────────────────────── */
#define CFG_TUSB_DEBUG  0  /* 0=off, 1=errors, 2=warnings, 3=info (most verbose) */

/* ─────────────────────────────────────────────────────────────────────────
   4. Memory Placement — DMA-capable SRAM section and alignment
   ───────────────────────────────────────────────────────────────────────── */
/* For STM32F4: main SRAM (0x20000000) is DMA-capable; CCM RAM (0x10000000) is NOT */
#define CFG_TUSB_MEM_SECTION  __attribute__((section(".usb_buf")))
#define CFG_TUSB_MEM_ALIGN    __attribute__((aligned(4)))

/* ─────────────────────────────────────────────────────────────────────────
   5. Device Class Configuration
   ───────────────────────────────────────────────────────────────────────── */

/* Enable CDC — set to number of CDC interfaces (usually 1 or 2) */
#define CFG_TUD_CDC              1

/* CDC TX/RX buffer sizes (bytes) — must be power of 2, ≥ 64 */
#define CFG_TUD_CDC_RX_BUFSIZE   512
#define CFG_TUD_CDC_TX_BUFSIZE   512

/* Enable HID — set to number of HID interfaces */
#define CFG_TUD_HID              1

/* HID EP buffer size (bytes) — must match wMaxPacketSize in descriptor */
#define CFG_TUD_HID_EP_BUFSIZE   64

/* Enable MSC — 0 or 1 */
#define CFG_TUD_MSC              0

/* MSC endpoint buffer size in bytes (only if CFG_TUD_MSC > 0) */
/* #define CFG_TUD_MSC_EP_BUFSIZE   512 */

/* Enable MIDI — 0 or number of instances */
#define CFG_TUD_MIDI             0

/* Enable DFU Runtime — 0 or 1 */
#define CFG_TUD_DFU_RUNTIME      0

/* Enable Vendor class — 0 or 1 */
#define CFG_TUD_VENDOR           0

/* Vendor EP buffer sizes */
/* #define CFG_TUD_VENDOR_RX_BUFSIZE 64  */
/* #define CFG_TUD_VENDOR_TX_BUFSIZE 64  */

#endif /* _TUSB_CONFIG_H_ */
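
Two settings that the TinyUSB example projects normally define are not shown in the listing above: the root-hub port mode and the endpoint 0 size. A minimal sketch, assuming a single full-speed device controller on port 0; the lines belong inside the include guard of tusb_config.h. Newer releases also accept CFG_TUD_ENABLED in place of the RHPORT mode macro, so check the examples shipped with your TinyUSB version.

```c
/* Assumption: one full-speed device controller on root-hub port 0 */
#define CFG_TUSB_RHPORT0_MODE   (OPT_MODE_DEVICE | OPT_MODE_FULL_SPEED)

/* Endpoint 0 max packet size; must match bMaxPacketSize0 in the
   device descriptor */
#ifndef CFG_TUD_ENDPOINT0_SIZE
#define CFG_TUD_ENDPOINT0_SIZE  64
#endif
```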

The DCD Porting Interface

If you are using a supported MCU, you never need to touch the DCD layer. TinyUSB selects the correct dcd_*.c file automatically based on CFG_TUSB_MCU. However, understanding the DCD interface is essential if you are porting TinyUSB to an unsupported MCU. The DCD interface consists of these functions that you must implement:

Function | Purpose
dcd_init() | Initialise the USB controller and endpoint 0; peripheral clocks and GPIO are normally configured by board code before this runs
dcd_int_enable() | Enable USB peripheral interrupt in NVIC
dcd_int_disable() | Disable USB peripheral interrupt
dcd_set_address() | Set USB device address after SET_ADDRESS request
dcd_remote_wakeup() | Drive remote wakeup signalling
dcd_connect() | Connect USB D+ pull-up (if software-controlled)
dcd_disconnect() | Disconnect USB D+ pull-up
dcd_edpt_open() | Configure and open an endpoint (set type, direction, max packet size)
dcd_edpt_close_all() | Close all endpoints (called on bus reset)
dcd_edpt_xfer() | Start a data transfer on an endpoint
dcd_edpt_stall() | Stall an endpoint
dcd_edpt_clear_stall() | Clear endpoint stall

The DCD implementation calls dcd_event_*() functions to notify the core layer of hardware events — bus reset, SETUP packet received, transfer complete, SOF (Start of Frame). These events are enqueued into TinyUSB's internal event FIFO and processed by tud_task().

Initialization Sequence

The initialization sequence for TinyUSB is simple in structure but has a non-obvious ordering requirement: USB clocks must be configured before tusb_init() is called. On STM32F4 this means enabling the USB OTG_FS clock via RCC and setting up the HSE/PLL to produce the 48 MHz USB clock reference. Calling tusb_init() before the USB clock is ready produces silent failures — the DCD init code reads undefined hardware state and the peripheral may appear to start but never generates valid USB signalling.
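
The 48 MHz requirement is easy to check on a PC before flashing. The sketch below evaluates the STM32F4 main PLL equations (VCO = HSE / PLLM × PLLN, then divided by PLLP for SYSCLK and by PLLQ for the USB clock); the function names are hypothetical helpers, and the integer division assumes HSE is an exact multiple of PLLM.

```c
#include <assert.h>
#include <stdint.h>

/* VCO output frequency. Dividing by PLLM first keeps the intermediate
   value inside 32 bits. */
static uint32_t pll_vco_hz(uint32_t hse_hz, uint32_t pllm, uint32_t plln)
{
    assert(pllm != 0u);
    return (hse_hz / pllm) * plln;
}

/* System clock: VCO divided by PLLP */
static uint32_t pll_sysclk_hz(uint32_t hse_hz, uint32_t pllm,
                              uint32_t plln, uint32_t pllp)
{
    assert(pllp != 0u);
    return pll_vco_hz(hse_hz, pllm, plln) / pllp;
}

/* USB (48 MHz domain) clock: VCO divided by PLLQ */
static uint32_t pll_usb_hz(uint32_t hse_hz, uint32_t pllm,
                           uint32_t plln, uint32_t pllq)
{
    assert(pllq != 0u);
    return pll_vco_hz(hse_hz, pllm, plln) / pllq;
}
```

With an 8 MHz crystal, PLLM = 8, PLLN = 336 and PLLQ = 7 give exactly 48 MHz; changing PLLQ to 8 gives 42 MHz, which is enough to stop enumeration.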

Complete main() for STM32F4 CDC Device

/* main.c — complete TinyUSB CDC device on STM32F407 */

#include "stm32f4xx_hal.h"
#include "tusb.h"

/* Forward declaration: cdc_task() is defined after main() */
void cdc_task(void);

/* ─── Clock Configuration ─────────────────────────────────────────────── */
static void SystemClock_Config(void)
{
    RCC_OscInitTypeDef RCC_OscInitStruct = {0};
    RCC_ClkInitTypeDef RCC_ClkInitStruct = {0};

    /* Enable HSE (8 MHz crystal on most Nucleo/Discovery boards) */
    RCC_OscInitStruct.OscillatorType = RCC_OSCILLATORTYPE_HSE;
    RCC_OscInitStruct.HSEState       = RCC_HSE_ON;
    RCC_OscInitStruct.PLL.PLLState   = RCC_PLL_ON;
    RCC_OscInitStruct.PLL.PLLSource  = RCC_PLLSOURCE_HSE;

    /* PLL config: SYSCLK = (HSE * PLLN) / (PLLM * PLLP)
       USB clock = (HSE * PLLN) / (PLLM * PLLQ) — must equal 48 MHz
       With 8 MHz HSE: PLLM=8, PLLN=336, PLLP=2 → SYSCLK=168 MHz
                       PLLQ=7 → USB=48 MHz */
    RCC_OscInitStruct.PLL.PLLM = 8;
    RCC_OscInitStruct.PLL.PLLN = 336;
    RCC_OscInitStruct.PLL.PLLP = RCC_PLLP_DIV2;
    RCC_OscInitStruct.PLL.PLLQ = 7;

    HAL_RCC_OscConfig(&RCC_OscInitStruct);

    RCC_ClkInitStruct.ClockType = RCC_CLOCKTYPE_SYSCLK | RCC_CLOCKTYPE_HCLK
                                | RCC_CLOCKTYPE_PCLK1  | RCC_CLOCKTYPE_PCLK2;
    RCC_ClkInitStruct.SYSCLKSource   = RCC_SYSCLKSOURCE_PLLCLK;
    RCC_ClkInitStruct.AHBCLKDivider  = RCC_SYSCLK_DIV1;   /* HCLK  = 168 MHz */
    RCC_ClkInitStruct.APB1CLKDivider = RCC_HCLK_DIV4;     /* PCLK1 =  42 MHz */
    RCC_ClkInitStruct.APB2CLKDivider = RCC_HCLK_DIV2;     /* PCLK2 =  84 MHz */

    HAL_RCC_ClockConfig(&RCC_ClkInitStruct, FLASH_LATENCY_5);

    /* Enable USB OTG_FS clock — MUST be done before tusb_init() */
    __HAL_RCC_USB_OTG_FS_CLK_ENABLE();
}

/* ─── GPIO for USB (PA11 = D-, PA12 = D+) ────────────────────────────── */
static void USB_GPIO_Init(void)
{
    __HAL_RCC_GPIOA_CLK_ENABLE();

    GPIO_InitTypeDef gpio = {0};
    gpio.Pin       = GPIO_PIN_11 | GPIO_PIN_12;
    gpio.Mode      = GPIO_MODE_AF_PP;
    gpio.Pull      = GPIO_NOPULL;
    gpio.Speed     = GPIO_SPEED_FREQ_VERY_HIGH;
    gpio.Alternate = GPIO_AF10_OTG_FS;
    HAL_GPIO_Init(GPIOA, &gpio);
}

int main(void)
{
    /* 1. HAL init — must be first */
    HAL_Init();

    /* 2. System clock configuration — must happen before tusb_init() */
    SystemClock_Config();

    /* 3. USB GPIO configuration */
    USB_GPIO_Init();

    /* 4. board_init() equivalent — configure status LED, debug UART, etc. */
    /* (your board-specific init here) */

    /* 5. Initialise TinyUSB — this calls dcd_init() internally */
    tusb_init();

    /* 6. Super-loop */
    while (1)
    {
        /* TinyUSB device task — must be called repeatedly */
        tud_task();

        /* Your application logic */
        cdc_task();
    }
}

/* ─── Application CDC task — called from super-loop ─────────────────── */
void cdc_task(void)
{
    /* Echo any received bytes back to host */
    if (tud_cdc_available())
    {
        uint8_t buf[64];
        uint32_t count = tud_cdc_read(buf, sizeof(buf));

        /* Echo back */
        tud_cdc_write(buf, count);
        tud_cdc_write_flush();
    }
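}

/* USB interrupt entry point: forwards OTG_FS interrupts to TinyUSB.
   Without this handler the DCD never sees bus events and enumeration
   cannot start. The name must match the startup file's vector table. */
void OTG_FS_IRQHandler(void)
{
    tud_int_handler(0);
}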

tusb_init() Internals

When tusb_init() is called, current TinyUSB releases perform roughly the following sequence for the device stack:

  1. Clears the internal device state structure
  2. Creates the event queue (FIFO) for DCD→core event passing
  3. Initialises all enabled class drivers (CDC, HID, MSC); each class driver sets up its static state variables and endpoint buffers
  4. Calls dcd_init() to initialise the USB peripheral hardware and the endpoint 0 buffer
  5. Calls dcd_int_enable() to unmask the USB interrupt; once the D+ pull-up is asserted (inside dcd_init() on most ports, or via tud_connect()), the host can detect the device

Common Mistake: Calling tusb_init() before enabling the USB peripheral clock in RCC. On STM32F4 this requires __HAL_RCC_USB_OTG_FS_CLK_ENABLE() before the TinyUSB call. The symptom is that the device never appears on the host and enumeration never starts. The DCD code silently initialises into a non-functional state when the peripheral clock is gated off.

The Task Loop

tud_task() is the heart of TinyUSB. It is the function you must call repeatedly for USB communication to work. Understanding exactly what happens inside tud_task() — and what latency requirements it imposes — is essential for correct application design.

What tud_task() Does

Every time tud_task() is called, it:

  1. Dequeues the next event from the DCD event FIFO (populated by the USB interrupt handler)
  2. Dispatches the event to the appropriate handler:
    • DCD_EVENT_BUS_RESET → resets all class drivers and returns the device to address 0
    • DCD_EVENT_UNPLUGGED → resets the stack and fires tud_umount_cb()
    • DCD_EVENT_SETUP_RECEIVED → passes SETUP packet to usbd.c for standard request handling or class dispatch
    • DCD_EVENT_XFER_COMPLETE → notifies the class driver that an IN or OUT transfer completed; class driver fires the application callback
    • DCD_EVENT_SOF → fires the SOF callback (rarely used in device applications)
  3. Repeats until the queue is empty, then returns (bare-metal) or blocks waiting for the next event (RTOS)

Critically, tud_task() only drains events when it runs. If the interrupt handler queues events faster than your loop gets back to tud_task(), the event FIFO fills up and events are dropped. The FIFO depth is set by CFG_TUD_TASK_QUEUE_SZ (default 16 in recent TinyUSB versions). For high-throughput applications, keep everything else in the loop short so that tud_task() is called frequently.

Bare-Metal Super-Loop Pattern

/* Bare-metal super-loop pattern — correct TinyUSB usage */
int main(void)
{
    board_init();
    tusb_init();

    while (1)
    {
        /* Call tud_task() as often as possible.
           Each call drains the events queued since the previous call.
           For bulk data transfers, multiple calls per ms may be needed. */
        tud_task();

        /* Application-level tasks — keep these non-blocking */
        sensor_update();
        led_update();
        /* Do NOT sleep here or add delays — this starves tud_task() */
    }
}
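
The "no delays in the loop" rule usually means replacing blocking waits with a timestamp check. A minimal sketch, assuming a free-running 32-bit millisecond counter is available (for example HAL_GetTick() on STM32 HAL); interval_elapsed() is a hypothetical helper, not a TinyUSB function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true once every period_ms. *last_ms holds the time of the
   previous run and is updated on each true result. Unsigned subtraction
   keeps the comparison correct when the millisecond counter wraps. */
static bool interval_elapsed(uint32_t now_ms, uint32_t *last_ms,
                             uint32_t period_ms)
{
    assert(last_ms != NULL);
    if ((uint32_t)(now_ms - *last_ms) < period_ms) return false;
    *last_ms = now_ms;
    return true;
}

/* Usage inside the super-loop (millis() stands for your tick source):

       static uint32_t last_sensor_ms;
       tud_task();
       if (interval_elapsed(millis(), &last_sensor_ms, 10u)) sensor_update();
*/
```

The loop then returns to tud_task() on every pass, and sensor_update() still runs at a steady 10 ms cadence.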

FreeRTOS Task Pattern

When using FreeRTOS with CFG_TUSB_OS = OPT_OS_FREERTOS, TinyUSB backs its event queue with a FreeRTOS queue. The USB interrupt handler posts each event with the FromISR queue API, and tud_task() blocks on the queue until an event arrives. This means the USB task sleeps when there is no USB activity, rather than spinning and consuming CPU.

/* FreeRTOS USB task — TinyUSB with OPT_OS_FREERTOS */

#include "FreeRTOS.h"
#include "task.h"
#include "tusb.h"

/* USB device task — dedicated high-priority task */
void usb_device_task(void *param)
{
    (void) param;

    /* Initialise TinyUSB from the USB task, after the scheduler has
       started: the USB interrupt handler uses the RTOS queue API, so
       it should not fire before the kernel is running */
    tusb_init();

    while (1)
    {
        /* tud_task() blocks on the event queue when idle and wakes
           as soon as the USB interrupt posts an event.
           This is far more CPU-efficient than a bare-metal super-loop. */
        tud_task();
    }
    /* Never reached */
}

/* CDC application task — separate task, lower priority */
void cdc_app_task(void *param)
{
    (void) param;

    while (1)
    {
        /* Check for received CDC data — non-blocking check */
        if (tud_cdc_available())
        {
            uint8_t buf[256];
            uint32_t count = tud_cdc_read(buf, sizeof(buf));
            /* Process received data */
            process_command(buf, count);
        }
        /* Sleep for 10 ms between polls so lower-priority tasks can run */
        vTaskDelay(pdMS_TO_TICKS(10));
    }
}

int main(void)
{
    SystemClock_Config();
    board_init();

    /* USB task: highest priority among application tasks — priority 5 */
    xTaskCreate(usb_device_task, "usbd", 3072, NULL, 5, NULL);

    /* CDC application task: lower priority — priority 2 */
    xTaskCreate(cdc_app_task,    "cdc",  1024, NULL, 2, NULL);

    vTaskStartScheduler();
    /* Never reached */
    return 0;
}

Priority Rule: Give the USB task a high priority, at least higher than any task that calls USB APIs or can run for long stretches without blocking. The host expects control requests to be answered within tight time limits during enumeration, and tud_task() can only respond when the scheduler lets it run. If USB enumeration fails only when other tasks are running, incorrect task priority is the first thing to check.

Device Callbacks

TinyUSB notifies your application of USB lifecycle events through device-level callbacks. These are declared as weak symbols in TinyUSB — you implement only the ones you need. Each callback is called from within tud_task(), not from the interrupt handler directly.

Lifecycle Callbacks

/* usb_callbacks.c — implement device-level TinyUSB callbacks */

#include "tusb.h"
#include "led.h"   /* hypothetical LED status driver */

/* ─── tud_mount_cb ──────────────────────────────────────────────────────
   Called when the device has successfully enumerated and the host has
   set a configuration. At this point the device is fully operational:
   all endpoints are configured, class drivers are ready.
   ─────────────────────────────────────────────────────────────────────── */
void tud_mount_cb(void)
{
    led_set_status(LED_USB_CONNECTED);

    /* Safe to start sending data now */
    /* Do NOT attempt tud_cdc_write() before tud_mount_cb() fires */
}

/* ─── tud_umount_cb ─────────────────────────────────────────────────────
   Called when the device has been unmounted (cable unplugged, host USB
   reset, or host driver unloaded the device). All endpoint transfers
   that were in progress have been cancelled.
   ─────────────────────────────────────────────────────────────────────── */
void tud_umount_cb(void)
{
    led_set_status(LED_USB_DISCONNECTED);

    /* Stop any ongoing data transmission — the endpoints are closed */
    /* Pending tud_cdc_write() data is discarded */
}

/* ─── tud_suspend_cb ────────────────────────────────────────────────────
   Called when the USB bus has been idle for > 3 ms — the host has
   stopped generating SOF tokens, indicating USB Suspend state.
   The device must reduce its bus current draw to ≤ 2.5 mA within 7 ms.
   'remote_wakeup_en' is true if the host has enabled remote wakeup.
   ─────────────────────────────────────────────────────────────────────── */
void tud_suspend_cb(bool remote_wakeup_en)
{
    (void) remote_wakeup_en;

    /* Enter low-power mode — reduce system clock, gate peripherals */
    led_set_status(LED_USB_SUSPENDED);

    /* If remote_wakeup_en, you can wake the host by calling
       tud_remote_wakeup() when the user presses a button */
}

/* ─── tud_resume_cb ─────────────────────────────────────────────────────
   Called when the USB bus resumes from Suspend state — the host has
   started generating SOF tokens again.
   ─────────────────────────────────────────────────────────────────────── */
void tud_resume_cb(void)
{
    /* Restore normal operating frequency and peripheral state */
    led_set_status(LED_USB_CONNECTED);
}

/* ─── tud_vendor_control_xfer_cb ────────────────────────────────────────
   Called when a vendor-specific control request arrives on EP0.
   bmRequestType has Vendor type set (bits [6:5] = 0b10).
   Must return true if the request was handled, false to STALL EP0.
   ─────────────────────────────────────────────────────────────────────── */
/* Receives the DATA stage of request 0x01 (hypothetical 3-byte RGB payload) */
static uint8_t led_rgb[3];

bool tud_vendor_control_xfer_cb(uint8_t rhport, uint8_t stage,
                                 tusb_control_request_t const *request)
{
    switch (request->bRequest)
    {
        case 0x01:  /* Custom command: set LED colour */
            if (request->bmRequestType_bit.direction != TUSB_DIR_OUT)
                return false;

            if (stage == CONTROL_STAGE_SETUP)
            {
                /* OUT request: hand TinyUSB a buffer for the DATA stage.
                   The callback runs again with CONTROL_STAGE_DATA once
                   the host's payload has arrived in led_rgb[] */
                return tud_control_xfer(rhport, request,
                                        led_rgb, sizeof(led_rgb));
            }
            if (stage == CONTROL_STAGE_DATA)
            {
                /* led_rgb[] is now valid: apply the colour here */
            }
            return true;

        case 0x02:  /* Custom command: get firmware version */
        {
            static const uint8_t fw_ver[4] = {0x01, 0x02, 0x00, 0x00};
            if (stage != CONTROL_STAGE_SETUP) return true;
            return tud_control_xfer(rhport, request,
                                    (void*)fw_ver, sizeof(fw_ver));
        }

        default:
            return false;  /* Stall unknown vendor requests */
    }
}
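
The "bits [6:5] = 0b10" remark in the vendor callback refers to the bmRequestType layout from the USB 2.0 specification: bit 7 is the direction, bits 6:5 the request type, and bits 4:0 the recipient. The helpers below are a hypothetical host-side sketch of that decoding; in firmware, TinyUSB exposes the same fields through request->bmRequestType_bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Request type values carried in bits 6:5 of bmRequestType */
enum { REQ_TYPE_STANDARD = 0, REQ_TYPE_CLASS = 1, REQ_TYPE_VENDOR = 2 };

/* Bit 7: 1 = device-to-host (IN), 0 = host-to-device (OUT) */
static bool req_is_device_to_host(uint8_t bm) { return (bm & 0x80u) != 0u; }

/* Bits 6:5: standard, class or vendor */
static uint8_t req_type(uint8_t bm) { return (uint8_t)((bm >> 5) & 0x03u); }

/* Bits 4:0: 0 = device, 1 = interface, 2 = endpoint, 3 = other */
static uint8_t req_recipient(uint8_t bm) { return (uint8_t)(bm & 0x1Fu); }

/* Build a bmRequestType byte from its three fields */
static uint8_t req_make(bool device_to_host, uint8_t type, uint8_t recipient)
{
    assert(type <= 3u && recipient <= 31u);
    return (uint8_t)((device_to_host ? 0x80u : 0x00u)
                     | ((unsigned)type << 5) | recipient);
}
```

A host application that wants to reach the callback would therefore send 0x40 (vendor, OUT, device) for the LED command and 0xC0 (vendor, IN, device) for the version query.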

CDC Callbacks in TinyUSB

CDC callbacks are the interface between TinyUSB's CDC class driver and your application. The CDC driver handles all the USB-level plumbing: class-specific control requests (SET_LINE_CODING, SET_CONTROL_LINE_STATE) on endpoint 0 and bulk transfers for data. Your callbacks deal only with the data bytes and line-state events.

Complete CDC Echo with Flow Control

/* cdc_device_callbacks.c — production-quality CDC echo with flow control */

#include "tusb.h"
#include <string.h>

/* ─── tud_cdc_rx_cb ─────────────────────────────────────────────────────
   Called when the CDC driver has received data from the host and stored
   it in the RX ring buffer. 'itf' is the CDC interface index (0 for the
   first CDC interface).
   ─────────────────────────────────────────────────────────────────────── */
void tud_cdc_rx_cb(uint8_t itf)
{
    (void) itf;

    /* Read all available bytes — tud_cdc_available() returns the byte count */
    while (tud_cdc_available())
    {
        uint8_t buf[64];
        uint32_t count = tud_cdc_read(buf, sizeof(buf));

        /* Echo bytes back — tud_cdc_write() copies into TX ring buffer */
        uint32_t written = tud_cdc_write(buf, count);

        if (written < count)
        {
            /* TX buffer full — in production: implement backpressure or
               drop data depending on your protocol requirements */
        }
    }

    /* Flush: initiate the IN transfer to the host.
       tud_cdc_write() starts a transfer on its own only once a full
       packet's worth of data is queued; shorter writes stay in the TX
       buffer until flushed. Always call tud_cdc_write_flush() after
       writing to guarantee the data reaches the host promptly. */
    tud_cdc_write_flush();
}

/* ─── tud_cdc_tx_complete_cb ────────────────────────────────────────────
   Called when a CDC IN transfer has completed — the host has received
   the data that was in the TX buffer. Use this to start the next write
   in a ping-pong or streaming scenario.
   ─────────────────────────────────────────────────────────────────────── */
void tud_cdc_tx_complete_cb(uint8_t itf)
{
    (void) itf;
    /* Signal application layer that TX buffer space is available */
    /* In a streaming scenario: enqueue the next chunk of data here */
}

/* ─── tud_cdc_line_coding_cb ────────────────────────────────────────────
   Called when the host sends a SET_LINE_CODING request — the host is
   configuring the virtual serial port baud rate, stop bits, parity.
   For a pure USB CDC device, you can ignore these settings entirely — the
   USB transfers at USB speed regardless of the reported baud rate.
   Only relevant if your CDC device bridges to a physical UART.
   ─────────────────────────────────────────────────────────────────────── */
void tud_cdc_line_coding_cb(uint8_t itf,
                             cdc_line_coding_t const *p_line_coding)
{
    (void) itf;

    /* p_line_coding->bit_rate  — baud rate (e.g., 115200) */
    /* p_line_coding->stop_bits — 0=1bit, 1=1.5bit, 2=2bit */
    /* p_line_coding->parity    — 0=none, 1=odd, 2=even, 3=mark, 4=space */
    /* p_line_coding->data_bits — 5, 6, 7, 8, or 16 */

    /* Example: bridge to UART — reconfigure UART with new settings */
    if (p_line_coding->bit_rate != 0)
    {
        UART_Reconfigure(p_line_coding->bit_rate,
                         p_line_coding->stop_bits,
                         p_line_coding->parity,
                         p_line_coding->data_bits);
    }
}

/* ─── tud_cdc_line_state_cb ─────────────────────────────────────────────
   Called when the host sends SET_CONTROL_LINE_STATE — RTS and DTR signals.
   DTR (dtr=true) means the host serial application has opened the COM port.
   RTS controls flow (if hardware flow control is in use).
   ─────────────────────────────────────────────────────────────────────── */
void tud_cdc_line_state_cb(uint8_t itf, bool dtr, bool rts)
{
    (void) itf;
    (void) rts;

    /* DTR transition is the most commonly used signal:
       dtr=true  — host has opened the serial port (terminal connected)
       dtr=false — host has closed the serial port */
    if (dtr)
    {
        /* Terminal opened — send welcome message */
        const char *msg = "\r\nUSB CDC Device Ready\r\n> ";
        tud_cdc_write_str(msg);
        tud_cdc_write_flush();
    }
}
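
The fields that tud_cdc_line_coding_cb() receives come from a 7-byte payload defined by the CDC PSTN specification: a little-endian 32-bit baud rate followed by three single-byte fields. TinyUSB does this unpacking for you; the sketch below, with the hypothetical names line_coding_t and line_coding_parse(), only shows what the bytes look like on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side mirror of the line-coding structure */
typedef struct {
    uint32_t bit_rate;   /* dwDTERate, little-endian on the wire */
    uint8_t  stop_bits;  /* bCharFormat: 0 = 1, 1 = 1.5, 2 = 2 stop bits */
    uint8_t  parity;     /* bParityType: 0 none, 1 odd, 2 even, 3 mark, 4 space */
    uint8_t  data_bits;  /* bDataBits: 5, 6, 7, 8 or 16 */
} line_coding_t;

/* Decode the raw SET_LINE_CODING payload byte by byte, so the result
   does not depend on host endianness or struct packing */
static line_coding_t line_coding_parse(const uint8_t raw[7])
{
    assert(raw != NULL);
    line_coding_t lc;
    lc.bit_rate  = (uint32_t)raw[0]
                 | ((uint32_t)raw[1] << 8)
                 | ((uint32_t)raw[2] << 16)
                 | ((uint32_t)raw[3] << 24);
    lc.stop_bits = raw[4];
    lc.parity    = raw[5];
    lc.data_bits = raw[6];
    return lc;
}
```

A terminal opening the port at 115200 8N1 sends the bytes 00 C2 01 00 00 00 08.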

HID Callbacks in TinyUSB

HID callbacks in TinyUSB differ slightly from CDC callbacks in that they are request-driven rather than data-driven. A GET_REPORT control request asks your firmware for the current HID report, which you supply by filling the buffer passed to tud_hid_get_report_cb(). Routine input reports travel over the interrupt IN endpoint instead: the host polls it at the interval set in the endpoint descriptor, and you queue each report by calling tud_hid_report() from your task loop.

HID Report Descriptor Macros

TinyUSB provides a comprehensive set of macros in class/hid/hid.h for building HID Report Descriptors. These macros correspond directly to the HID items described in Part 4:

/* usb_descriptors.c — HID Report Descriptor using TinyUSB macros */

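#include "tusb.h"

/* REPORT_ID_KEYBOARD, REPORT_ID_MOUSE and REPORT_ID_CUSTOM are
   application-defined constants; keep them in a header shared with
   hid_callbacks.c so both files agree on the values */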

/* Keyboard Report Descriptor using TinyUSB macros */
uint8_t const desc_hid_report[] =
{
    TUD_HID_REPORT_DESC_KEYBOARD(HID_REPORT_ID(REPORT_ID_KEYBOARD))
};

/* Mouse Report Descriptor — provided by TinyUSB as a complete macro */
/* uint8_t const desc_hid_report[] = {
    TUD_HID_REPORT_DESC_MOUSE(HID_REPORT_ID(REPORT_ID_MOUSE))
}; */

/* Custom 64-byte generic HID report descriptor.
   The vendor usage page (0xFF00) and a logical maximum of 255 do not
   fit in a one-byte item, so both use the two-byte _N macro forms.
   HID_REPORT_ID() already expands with a trailing comma, so none
   follows it here. */
/* uint8_t const desc_hid_report[] = {
    HID_USAGE_PAGE_N (HID_USAGE_PAGE_VENDOR, 2),
    HID_USAGE      (0x01),
    HID_COLLECTION (HID_COLLECTION_APPLICATION),
        HID_REPORT_ID (REPORT_ID_CUSTOM)
        HID_USAGE      (0x02),
        HID_LOGICAL_MIN  (0),
        HID_LOGICAL_MAX_N (255, 2),
        HID_REPORT_SIZE  (8),
        HID_REPORT_COUNT (63),
        HID_INPUT  (HID_DATA | HID_VARIABLE | HID_ABSOLUTE),
        HID_USAGE      (0x03),
        HID_REPORT_COUNT (63),
        HID_OUTPUT (HID_DATA | HID_VARIABLE | HID_ABSOLUTE),
    HID_COLLECTION_END
}; */

HID Callbacks Implementation

/* hid_callbacks.c — TinyUSB HID device callbacks */

#include "tusb.h"
#include <string.h>

/* Report IDs — must match the Report Descriptor */
enum {
    REPORT_ID_KEYBOARD = 1,
    REPORT_ID_MOUSE    = 2,
};

/* ─── tud_hid_get_report_cb ─────────────────────────────────────────────
   Called when the host sends a HID GET_REPORT control request.
   This happens at enumeration (to verify the device) and occasionally
   at runtime. Return the report data via the provided buffer.
   'report_type' is HID_REPORT_TYPE_INPUT, _OUTPUT, or _FEATURE.
   Return 0 to STALL (report not available); return report length on success.
   ─────────────────────────────────────────────────────────────────────── */
uint16_t tud_hid_get_report_cb(uint8_t itf, uint8_t report_id,
                                hid_report_type_t report_type,
                                uint8_t *buffer, uint16_t reqlen)
{
    (void) itf;

    if (report_id == REPORT_ID_KEYBOARD &&
        report_type == HID_REPORT_TYPE_INPUT)
    {
        /* Return current keyboard state, 8 bytes:
           [modifier][reserved][keycode0..keycode5] */
        hid_keyboard_report_t report = {0};

        /* Never write more than the host asked for */
        if (reqlen < sizeof(report)) return 0;

        /* Populate from your keyboard matrix scan here */
        memcpy(buffer, &report, sizeof(report));
        return sizeof(report);
    }

    return 0;  /* Stall unknown report types */
}

/* ─── tud_hid_set_report_cb ─────────────────────────────────────────────
   Called when the host sends a HID SET_REPORT control request or
   writes to the HID OUT endpoint. Used for LED output reports
   (keyboard Num/Caps/Scroll Lock), haptic feedback, actuator control.
   ─────────────────────────────────────────────────────────────────────── */
void tud_hid_set_report_cb(uint8_t itf, uint8_t report_id,
                            hid_report_type_t report_type,
                            uint8_t const *buffer, uint16_t bufsize)
{
    (void) itf;

    if (report_id == REPORT_ID_KEYBOARD &&
        report_type == HID_REPORT_TYPE_OUTPUT)
    {
        /* Ignore an empty report rather than read past the buffer */
        if (bufsize < 1) return;

        /* Keyboard LED output report: first byte contains LED bits:
           Bit 0: Num Lock LED
           Bit 1: Caps Lock LED
           Bit 2: Scroll Lock LED
           LED_Set*() are placeholders for your board's LED driver */
        uint8_t kbd_leds = buffer[0];
        LED_SetNumLock   ((kbd_leds >> 0) & 1);
        LED_SetCapsLock  ((kbd_leds >> 1) & 1);
        LED_SetScrollLock((kbd_leds >> 2) & 1);
    }
}

/* ─── Sending reports from the task loop ────────────────────────────────
   In addition to the request-driven callbacks above, you can send HID
   reports proactively from your application task using tud_hid_report()
   or the class-specific helpers.
   ─────────────────────────────────────────────────────────────────────── */
void keyboard_task(void)
{
    /* Only send if device is mounted and previous report was sent */
    if (!tud_hid_ready()) return;

    /* Example: send 'A' key press */
    static bool key_pressed = false;

    if (!key_pressed)
    {
        /* Press 'A' — keycode 0x04 */
        tud_hid_keyboard_report(REPORT_ID_KEYBOARD,
                                 0,          /* modifier: none */
                                 (uint8_t[]){0x04, 0, 0, 0, 0, 0});
        key_pressed = true;
    }
    else
    {
        /* Release all keys — send empty report */
        tud_hid_keyboard_report(REPORT_ID_KEYBOARD, 0, NULL);
        key_pressed = false;
    }
}

void mouse_task(void)
{
    if (!tud_hid_ready()) return;

    /* Move mouse +5 pixels in X, 0 in Y, no buttons, no scroll */
    tud_hid_mouse_report(REPORT_ID_MOUSE,
                          0,     /* buttons: none */
                          5,     /* x delta: +5 pixels */
                          0,     /* y delta */
                          0,     /* vertical scroll */
                          0);    /* horizontal scroll */
}
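
The "populate from your keyboard matrix scan" placeholder in the GET_REPORT callback above can be filled by a small packing helper. The following is a minimal sketch, assuming the 8-byte layout described there ([modifier][reserved][keycode0..keycode5]). The kbd_report_t struct and the kbd_report_clear() / kbd_report_add_key() helpers are hypothetical names used for illustration only; they are not part of TinyUSB.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the 8-byte keyboard report layout:
   [modifier][reserved][keycode0..keycode5] */
typedef struct {
    uint8_t modifier;
    uint8_t reserved;
    uint8_t keycode[6];
} kbd_report_t;

/* Clear the report: no modifiers, no keys pressed */
static void kbd_report_clear(kbd_report_t *r)
{
    memset(r, 0, sizeof(*r));
}

/* Add one keycode to the first free slot.
   Returns true if the key is in the report afterwards,
   false if the keycode is 0 or all six slots are taken. */
static bool kbd_report_add_key(kbd_report_t *r, uint8_t keycode)
{
    if (keycode == 0) return false;

    for (size_t i = 0; i < sizeof(r->keycode); i++) {
        if (r->keycode[i] == keycode) return true;  /* already present */
        if (r->keycode[i] == 0) {
            r->keycode[i] = keycode;
            return true;
        }
    }
    return false;  /* more than six keys held at once */
}
```

A matrix scan would call kbd_report_clear() once per scan and kbd_report_add_key() for each pressed key; the resulting struct could then be copied into the callback's buffer in place of the zeroed report.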

Memory Management

Memory management is where TinyUSB projects on STM32 most commonly fail silently. The root cause is usually the same: USB endpoint buffers placed in SRAM that the USB DMA engine cannot access. Understanding the SRAM topology of your specific STM32 is a prerequisite for correct USB operation, not an optional extra.

The STM32 DTCMRAM Problem

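STM32F7 and STM32H7 devices have DTCM (Data Tightly Coupled Memory), a very fast zero-wait-state SRAM intended for performance-critical data. DTCMRAM is connected directly to the Cortex-M7 core rather than to the main bus matrix. On STM32H7 this means the USB DMA controller cannot access DTCMRAM; on STM32F7 the bus topology differs, so check the reference manual for your part before placing USB buffers there. If TinyUSB's endpoint buffers land in memory the USB DMA cannot reach, transfers fail without any error being reported: data the CPU writes to the buffer never reaches the bus, and data received from the host never shows up in the buffer.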

MCU Family | USB-Capable SRAM | DMA-Incapable SRAM | Default Stack Location
STM32F4 | AHB SRAM (0x20000000) | CCM SRAM (0x10000000), no DMA | Usually AHB SRAM (safe)
STM32F7 | SRAM1/SRAM2 (directly above DTCMRAM; base address depends on the part) | DTCMRAM (0x20000000), avoid for USB buffers | Depends on linker script
STM32H7 | AXI SRAM D1 (0x24000000) | DTCMRAM (0x20000000), no DMA | Often DTCMRAM, UNSAFE for USB
RP2040 | All SRAM is DMA-capable | N/A | Safe
nRF52840 | All RAM (0x20000000) | N/A | Safe
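
A run-time check can catch a misplaced buffer before it turns into a silent transfer failure. The sketch below tests whether a buffer lies entirely inside a given memory region. The base address is the STM32H7 AXI SRAM address from the table; the 512 KB size is an assumption that depends on the exact part, and region_contains() / usb_buf_is_dma_safe() are hypothetical helper names, not TinyUSB functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed STM32H7 values; confirm them in the reference manual */
#define AXI_SRAM_BASE  0x24000000u
#define AXI_SRAM_SIZE  (512u * 1024u)

/* True if [addr, addr + len) lies entirely inside [base, base + size).
   Written so that no intermediate sum can overflow. */
static bool region_contains(uintptr_t base, size_t size,
                            uintptr_t addr, size_t len)
{
    if (len > size) return false;
    if (addr < base) return false;
    return (addr - base) <= (size - len);
}

/* True if the buffer sits fully inside the assumed AXI SRAM window */
static bool usb_buf_is_dma_safe(const void *buf, size_t len)
{
    return region_contains(AXI_SRAM_BASE, AXI_SRAM_SIZE,
                           (uintptr_t) buf, len);
}
```

Calling a check like this once at start-up, for each buffer that the USB peripheral accesses directly, gives an immediate failure instead of an intermittent one. Inspecting the .map file remains the authoritative verification.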

CFG_TUSB_MEM_SECTION and CFG_TUSB_MEM_ALIGN

TinyUSB applies CFG_TUSB_MEM_SECTION and CFG_TUSB_MEM_ALIGN to all internal endpoint buffers. You must set these macros to place buffers in the correct SRAM region:

/* tusb_config.h — memory placement for STM32H7 */

/* On STM32H7, AXI SRAM D1 at 0x24000000 is USB DMA-capable.
   DTCMRAM at 0x20000000 is NOT USB DMA-capable.
   The default linker script may place .bss in DTCMRAM on H7.
   We must force USB buffers into AXI SRAM. */

#define CFG_TUSB_MEM_SECTION  __attribute__((section(".usb_buf")))
#define CFG_TUSB_MEM_ALIGN    __attribute__((aligned(4)))

/*
   In your linker script (STM32H7xx_FLASH.ld), add a section for USB buffers
   that maps to AXI SRAM:

   MEMORY {
     AXISRAM   (xrw)  : ORIGIN = 0x24000000, LENGTH = 512K
     DTCMRAM   (xrw)  : ORIGIN = 0x20000000, LENGTH = 128K
     ...
   }

   SECTIONS {
     .usb_buf (NOLOAD) : {
       . = ALIGN(4);
       *(.usb_buf)
       *(.usb_buf*)
       . = ALIGN(4);
     } >AXISRAM

     .bss : {
       ...
     } >DTCMRAM
   }
*/

Endpoint Buffer Placement Example

/* For your own application buffers that exchange data with TinyUSB,
   apply the same section attribute to ensure DMA-safe placement */

CFG_TUSB_MEM_SECTION CFG_TUSB_MEM_ALIGN
static uint8_t usb_rx_buf[512];  /* DMA-safe RX scratch buffer */

CFG_TUSB_MEM_SECTION CFG_TUSB_MEM_ALIGN
static uint8_t usb_tx_buf[512];  /* DMA-safe TX scratch buffer */

/* Without these attributes on STM32H7, the linker may place these
   buffers in DTCMRAM. Any buffer the USB DMA touches directly then
   fails silently: reads return garbage and writes appear to succeed
   while the host receives no data. This is one of the most confusing
   USB bugs to diagnose. */

Alignment Requirements

USB endpoint buffers typically require 4-byte alignment for the DMA engine. The Synopsys OTG IP (used in STM32F2/F4/F7/H7) requires endpoint descriptors and data buffers to be 4-byte aligned. Some MCUs (RP2040 native USB) have no alignment requirement, but TinyUSB applies 4-byte alignment by default for safety. The CFG_TUSB_MEM_ALIGN macro handles this, so do not remove or reduce it.
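
Alignment is easy to verify at run time. The following sketch checks a pointer against a power-of-two alignment; is_aligned() and demo_buf are hypothetical names, and the buffer uses the same GCC/Clang aligned attribute that this article assigns to CFG_TUSB_MEM_ALIGN.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if ptr is aligned to `align` bytes.
   `align` must be a non-zero power of two. */
static bool is_aligned(const void *ptr, size_t align)
{
    return ((uintptr_t) ptr & (align - 1u)) == 0u;
}

/* Example buffer with explicit 4-byte alignment (GCC/Clang syntax) */
static uint8_t demo_buf[64] __attribute__((aligned(4)));
```

A start-up assertion such as is_aligned(usb_rx_buf, 4) on each USB-facing buffer would flag a missing attribute before it shows up as sporadic data corruption.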

Debugging TinyUSB

TinyUSB's built-in debug output is invaluable for diagnosing enumeration failures and unexpected behaviour. Enabled by setting CFG_TUSB_DEBUG to 1 or 2, it outputs structured log messages over a UART or ITM SWO channel.

Debug Levels

CFG_TUSB_DEBUG | Output | Use When
0 | No output | Production firmware, zero overhead
1 | Errors and warnings only. Reports assert failures, stalled endpoints, and class driver errors. | Development: low noise, catches real problems
2 | Full verbose trace. Logs every USB event, every control request, every endpoint transfer start and completion. Produces enormous output, typically thousands of lines per second during active USB traffic. | Debugging enumeration failures, investigating unexpected behaviour

Configuring Debug Output

TinyUSB debug output uses tu_printf which maps to printf. You must implement a UART-backed printf using the retarget mechanism:

/* uart_retarget.c — redirect printf to UART1 for TinyUSB debug output */

#include "stm32f4xx_hal.h"
#include <stdio.h>

extern UART_HandleTypeDef huart1;  /* configured for 115200 8N1 */

/* Retarget _write() for GCC newlib — printf calls this */
int _write(int fd, char *ptr, int len)
{
    (void) fd;
    HAL_UART_Transmit(&huart1, (uint8_t*)ptr, (uint16_t)len, HAL_MAX_DELAY);
    return len;
}

/*
   tusb_config.h additions for debug:

   #define CFG_TUSB_DEBUG  2

   With level 2, expect a trace along these lines (illustrative only;
   the exact wording and format depend on the TinyUSB version):
   [100] USBD init
   [105] DCD init rhport=0
   [200] BUS RESET
   [250] Speed = Full Speed
   [300] SETUP bmRequestType=0x80 bRequest=0x06 wValue=0x0100 wIndex=0x0000 wLength=18
   [310] Get Device Descriptor
   [350] SETUP complete status=0
   [400] SET_ADDRESS = 5
   ...
   [800] Mount callback
*/
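
HAL_UART_Transmit() with HAL_MAX_DELAY blocks until every byte has left the UART, so a verbose trace can hold up the loop that calls tud_task(). One common alternative is to queue log bytes in a ring buffer and drain them from a UART interrupt or an idle loop. The sketch below shows only the buffer. The names log_ring_put() and log_ring_get() are hypothetical, and the code assumes a single producer and a single consumer.

```c
#include <stddef.h>
#include <stdint.h>

#define LOG_RING_SIZE 256u   /* must be a power of two */

static uint8_t log_ring[LOG_RING_SIZE];
static volatile uint32_t log_head;  /* total bytes written (producer) */
static volatile uint32_t log_tail;  /* total bytes read (consumer) */

/* Queue up to len bytes and return how many were accepted.
   Bytes that do not fit are dropped instead of blocking. */
static size_t log_ring_put(const char *data, size_t len)
{
    size_t n = 0;
    while (n < len && (log_head - log_tail) < LOG_RING_SIZE) {
        log_ring[log_head & (LOG_RING_SIZE - 1u)] = (uint8_t) data[n++];
        log_head++;
    }
    return n;
}

/* Fetch one byte for the UART; returns -1 if the ring is empty */
static int log_ring_get(void)
{
    if (log_head == log_tail) return -1;
    uint8_t b = log_ring[log_tail & (LOG_RING_SIZE - 1u)];
    log_tail++;
    return b;
}
```

With this in place, _write() could call log_ring_put() and return immediately, while a UART transmit-empty interrupt pulls bytes with log_ring_get(). Dropping bytes when the ring is full is a deliberate trade-off here: a truncated trace is usually preferable to a stalled USB task.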

Key Status Query Functions

Before sending data or trusting callbacks, use TinyUSB's state query functions:

/* State query functions: use these to gate USB operations.
   usb_app_task() is an illustrative wrapper, not a TinyUSB API. */
void usb_app_task(void)
{
    /* tud_connected(): true once the host has started talking to the
       device. TinyUSB sets it when the first SETUP packet arrives
       after a bus reset. */
    if (!tud_connected()) {
        /* No host activity yet: do not attempt any USB operations */
        return;
    }

    /* tud_mounted(): true only after enumeration is complete AND the host
       has sent SET_CONFIGURATION. This is the state where class drivers
       are active and tud_mount_cb() has been called. */
    if (!tud_mounted()) {
        /* Device is connected but not yet configured: enumeration in progress */
        return;
    }

    /* tud_suspended(): true when the bus is suspended (no SOF for > 3 ms) */
    if (tud_suspended()) {
        /* Consider calling tud_remote_wakeup() if user action requires it */
    }

    /* tud_cdc_connected(): CDC-specific, true when DTR is asserted by host,
       i.e. a terminal application has opened the COM port */
    if (!tud_cdc_connected()) {
        /* CDC data interface is ready but no host application is listening */
        /* Writing to CDC at this point will buffer data or be silently dropped */
    }

    /* tud_hid_ready(): HID-specific, true when the previous HID IN report
       has been sent and the endpoint is ready for the next report */
    if (!tud_hid_ready()) {
        /* Wait before sending next HID report to avoid overwriting in-flight data */
    }
}
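
For logging or for driving a status LED, the individual query results can be folded into a single state value. This is a minimal sketch: the usb_state_t enum and usb_classify() are hypothetical names, and the function takes plain booleans so it can be exercised without the stack. In firmware the arguments would come from tud_connected(), tud_mounted(), and tud_suspended().

```c
#include <stdbool.h>

/* Hypothetical coarse device states derived from the query functions */
typedef enum {
    USB_STATE_DETACHED,     /* no host activity seen yet */
    USB_STATE_ENUMERATING,  /* host is talking, not yet configured */
    USB_STATE_SUSPENDED,    /* configured, but the bus is suspended */
    USB_STATE_READY         /* configured and active */
} usb_state_t;

/* Fold the three query results into one state, checking the most
   fundamental condition first. */
static usb_state_t usb_classify(bool connected, bool mounted, bool suspended)
{
    if (!connected) return USB_STATE_DETACHED;
    if (!mounted)   return USB_STATE_ENUMERATING;
    if (suspended)  return USB_STATE_SUSPENDED;
    return USB_STATE_READY;
}
```

Only USB_STATE_READY should allow class-level traffic; the class-specific checks (tud_cdc_connected(), tud_hid_ready()) still apply on top of it.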

Common Assert Failures and Their Causes

Assert / Symptom | Root Cause | Fix
Device never enumerates (no OS recognition) | USB clock not enabled before tusb_init() | Add __HAL_RCC_USB_OTG_FS_CLK_ENABLE() before tusb_init()
Enumeration starts then fails at SET_CONFIGURATION | Configuration descriptor wTotalLength incorrect (too small or too large) | Recount all descriptor bytes; use sizeof() on the full array
CDC appears but no data received on host | tud_cdc_write_flush() not called after tud_cdc_write() | Always call tud_cdc_write_flush() after writing
HID device enumerates but no reports received | tud_hid_ready() not checked before tud_hid_report() | Check tud_hid_ready() and only send when true
USB works on RP2040 but not STM32H7 | Endpoint buffers in DTCMRAM (DMA-inaccessible) | Add CFG_TUSB_MEM_SECTION pointing to AXI SRAM
Sporadic corruption of received bulk data | Buffer not 4-byte aligned | Apply CFG_TUSB_MEM_ALIGN to all USB-facing buffers
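
The wTotalLength failure in the table can be caught with a self-check on the descriptor array. The sketch below walks the array by following each bLength field and compares the result with the wTotalLength stored in bytes 2 and 3 of the configuration descriptor. The function names are hypothetical; the field offsets and the descriptor type value 0x02 come from the standard USB configuration descriptor layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Walk a configuration descriptor set by following each bLength field.
   Returns the summed length, or 0 if a descriptor is malformed
   (bLength < 2, or a descriptor runs past the end of the array). */
static uint16_t desc_walk_length(const uint8_t *desc, size_t size)
{
    size_t offset = 0;
    while (offset < size) {
        uint8_t len = desc[offset];              /* bLength */
        if (len < 2 || offset + len > size) return 0;
        offset += len;
    }
    return (uint16_t) offset;
}

/* True if wTotalLength (bytes 2..3, little-endian) matches both the
   array size and the length found by walking the descriptors. */
static bool config_total_length_ok(const uint8_t *desc, size_t size)
{
    if (size < 9 || desc[1] != 0x02) return false;  /* 0x02 = CONFIGURATION */
    uint16_t total = (uint16_t) (desc[2] | (desc[3] << 8));
    return (size_t) total == size &&
           total == desc_walk_length(desc, size);
}
```

Running config_total_length_ok(desc, sizeof(desc)) in a debug build, or in a host-side unit test, flags a stale wTotalLength as soon as an interface is added or removed.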

Porting Checklist for New STM32 Targets

When bringing up TinyUSB on a new STM32 target (or any new MCU), work through this checklist sequentially. Each step verifies the foundation for the next. Skipping steps leads to symptoms that masquerade as stack bugs but are actually hardware configuration issues.

Step | Action | Verification | Files to Modify
1 | Enable HSE and configure PLL to produce correct SYSCLK and 48 MHz USB clock | Verify with oscilloscope on MCO pin; measure SYSCLK with SysTick timer | SystemClock_Config()
2 | Enable USB peripheral clock in RCC (__HAL_RCC_USB_OTG_FS_CLK_ENABLE()) | Read back RCC register to verify bit is set | main.c or board.c
3 | Configure USB GPIO pins (D+/D- alternate function, no pull-up/pull-down) | Measure 5 V on VBUS with a voltmeter; confirm D+ pull-up is not asserted before tusb_init() | USB_GPIO_Init()
4 | Set CFG_TUSB_MCU in tusb_config.h to the correct MCU OPT value | Compile succeeds; correct dcd_*.c file is included in build | tusb_config.h
5 | Set CFG_TUSB_MEM_SECTION and CFG_TUSB_MEM_ALIGN for DMA-safe SRAM | Inspect .map file; verify USB buffer symbols are in correct SRAM region | tusb_config.h, linker script
6 | Call tusb_init() after clock and GPIO init | USB device appears as "Unknown Device" in Device Manager, which shows enumeration has started | main.c
7 | Write minimal descriptor set (Device + Config + Interface + Endpoint) in usb_descriptors.c | Device enumerates without errors; Windows shows device class correctly | usb_descriptors.c
8 | Implement tud_mount_cb() and verify it is called | LED or UART message confirms mount callback fires after enumeration | usb_callbacks.c
9 | Enable first class (CDC or HID) and verify basic data transfer | For CDC: receive echo in terminal. For HID: see keystrokes in Notepad. | tusb_config.h, class callbacks
10 | Enable CFG_TUSB_DEBUG 2 and capture full enumeration trace on UART | All descriptors returned correctly; no STALL responses on EP0 | tusb_config.h, UART retarget
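
Step 1 is largely arithmetic and can be checked before touching hardware. The sketch below assumes an STM32F4-style main PLL, where the VCO runs at HSE / PLLM × PLLN, SYSCLK is VCO / PLLP, and the USB clock is VCO / PLLQ; other families use different divider chains. The pll_cfg_t struct and helper names are hypothetical, and the dividers are assumed to be non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of an STM32F4-style main PLL setup */
typedef struct {
    uint32_t hse_hz;  /* external oscillator frequency in Hz */
    uint32_t pllm;    /* input divider */
    uint32_t plln;    /* VCO multiplier */
    uint32_t pllp;    /* SYSCLK divider */
    uint32_t pllq;    /* USB clock divider */
} pll_cfg_t;

/* VCO output frequency: (HSE / PLLM) * PLLN */
static uint32_t pll_vco_hz(const pll_cfg_t *c)
{
    return (c->hse_hz / c->pllm) * c->plln;
}

/* System clock: VCO / PLLP */
static uint32_t pll_sysclk_hz(const pll_cfg_t *c)
{
    return pll_vco_hz(c) / c->pllp;
}

/* True only if VCO / PLLQ is exactly 48 MHz with no remainder */
static bool pll_usb_clock_ok(const pll_cfg_t *c)
{
    uint32_t vco = pll_vco_hz(c);
    return (vco % c->pllq) == 0u && (vco / c->pllq) == 48000000u;
}
```

For example, an 8 MHz HSE with PLLM = 8, PLLN = 336, PLLP = 2, PLLQ = 7 gives a 336 MHz VCO, a 168 MHz SYSCLK, and a 48 MHz USB clock. Raising PLLN to 360 for more CPU speed breaks the USB clock, which is exactly the kind of mistake this check is meant to catch.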

Practical Exercises

These exercises are designed to build proficiency in TinyUSB incrementally. Complete them in order — each exercise validates the understanding required for the next.

Tier 1 — Conceptual (No Hardware Required)

  1. Architecture Mapping: Download the TinyUSB repository and locate the following files: src/device/usbd.c, src/class/cdc/cdc_device.c, src/portable/synopsys/dwc2/dcd_dwc2.c. For each file, identify: (a) which architecture layer it belongs to, (b) the key function it exports to the layer above, and (c) the key function it calls in the layer below.
  2. Configuration Analysis: Given a project requiring: CDC-ACM with 1 KB TX/RX buffers, one HID interface with 64-byte reports at 1 ms polling on an STM32F411 running FreeRTOS — write the complete tusb_config.h from memory, then verify against the reference in this article.
  3. Memory Section Exercise: Using the STM32H743 reference manual, identify: (a) which SRAM regions are accessible to the USB DMA, (b) the default linker script region for .bss, (c) what changes are required in the linker script to place a section named .usb_buf in AXI SRAM D1.

Tier 2 — Firmware (Hardware Required)

  1. STM32 USB Bring-Up: Starting from a blank STM32F4 project (no CubeMX USB middleware), integrate TinyUSB manually: add the source files to your build system, create tusb_config.h for CDC, write usb_descriptors.c, and achieve successful enumeration verified by lsusb on Linux or Device Manager on Windows. The device must appear as a CDC-ACM serial port without any driver installation.
  2. CDC Echo with Line State: Extend the CDC bring-up with proper tud_cdc_line_state_cb() implementation. When the host opens the COM port (DTR asserts), the device sends a welcome banner. Implement full duplex echo with tud_cdc_rx_cb(). Test with PuTTY, screen, and pyserial to verify correct behaviour with each tool's DTR handling.
  3. HID Keyboard: Add an HID keyboard interface alongside the CDC interface (composite device). When PA0 button is pressed, send a keyboard report with keycode 0x04 ('A'). When released, send an empty report. Verify that pressing the button types 'A' in a text editor while the CDC serial port remains operational simultaneously.

Tier 3 — Advanced

  1. FreeRTOS Thread-Safe CDC: Implement a TinyUSB CDC device with FreeRTOS where three separate tasks write to the CDC interface concurrently: a sensor task (25 Hz), a command processor task (event-driven), and a heartbeat task (1 Hz). Add a FreeRTOS mutex to protect tud_cdc_write() / tud_cdc_write_flush() calls. Demonstrate that output from all three tasks is correctly interleaved and no data is corrupted.
  2. Custom DCD Port: Take an unsupported STM32 variant (e.g., STM32G0 which uses the FSDEV peripheral but may not have a TinyUSB BSP). Create a new BSP entry in hw/bsp/ with a board.h that defines clock and GPIO configuration for your board. Add the board to the TinyUSB build system. Achieve CDC enumeration on this previously unsupported target.

TinyUSB Porting Plan Tool

Document your TinyUSB porting plan — target MCU, OS selection, enabled classes, memory section configuration, debug level, and implementation notes. Download as Word, Excel, PDF, or PPTX for team review or project documentation.

Conclusion & Next Steps

You now have a complete working knowledge of the TinyUSB stack — from source tree layout to production-grade memory management. The key lessons from this deep dive:

  • TinyUSB's layered architecture means you interact with it at exactly one level — the application callback layer — while the class drivers, core USB layer, and DCD handle everything below. Resist the urge to modify class driver source; configure through tusb_config.h instead.
  • tusb_config.h is your single configuration point. CFG_TUSB_MCU, CFG_TUSB_OS, CFG_TUD_CDC, CFG_TUD_HID, and the memory macros are the levers that shape the entire stack's behaviour and resource usage.
  • Clock setup must precede tusb_init(). This non-obvious ordering requirement causes the most common bring-up failure. Always enable the USB peripheral clock before calling tusb_init().
  • tud_task() must be called continuously. In a bare-metal super-loop, call it on every iteration and keep the other work in the loop short so it is never starved. In FreeRTOS with OPT_OS_FREERTOS, the USB task blocks efficiently when idle, but it should run at a high enough priority that other tasks cannot delay it and cause enumeration failures under load.
  • Memory section placement is non-optional on STM32F7/H7. USB buffers in DTCMRAM produce silent DMA failures. Configure CFG_TUSB_MEM_SECTION and update your linker script to target AXI SRAM or SRAM1/D2 depending on your specific device.
  • CFG_TUSB_DEBUG 2 is your primary debugging tool. Enable it at the first sign of enumeration trouble. The verbose trace makes descriptor errors immediately visible.

Next in the Series

In Part 6: CDC Virtual COM Port, we put TinyUSB's CDC driver to practical use: implementing a full virtual serial port with proper baud rate bridging to a physical UART, printf redirection over USB, flow control using DTR/RTS signals, multi-packet streaming for high-throughput ADC data, and a complete command-line interface received over CDC with command parsing and response generation — the foundation of nearly every embedded USB debug interface.
