# Audio Analysis on Zephyr - Nucleo G474RE

A project I built to learn Zephyr RTOS by doing something more involved than blinking an LED. It samples audio via ADC at 44.1 kHz using hardware timer triggering and DMA, runs a real-time FFT with CMSIS-DSP, and streams results over UART to a Python live plotter.

I've used FreeRTOS before and wanted to understand how Zephyr handles devicetree, Kconfig, and the kernel primitives. This project ended up touching all of those plus direct STM32 LL/HAL register work, which was a good way to see where Zephyr's abstractions end and the hardware begins.

---

## Hardware

| | |
|---|---|
| **Board** | ST Nucleo G474RE |
| **MCU** | STM32G474RE (Cortex-M4F, 170 MHz, FPU) |
| **Audio Input** | Analog signal on PA0 (Arduino A0) |
| **Console** | LPUART1 via onboard ST-Link VCP (PA2/PA3) |
| **ADC Trigger** | TIM6 TRGO at 44,098.6 Hz |

For testing I used a waveform generator feeding a sine wave into PA0. Any 0-3.3 V analog source works.

---

## How It Works

TIM6 overflows at ~44.1 kHz and triggers an ADC1 conversion via hardware TRGO. DMA transfers each sample into a ping-pong buffer (2 x 1024 samples). On half-transfer and transfer-complete interrupts, a semaphore wakes the processing thread, which runs a 1024-point real FFT using CMSIS-DSP and sends the results over UART.

```
TIM6 (44.1 kHz) -> ADC1 conversion -> DMA -> ping-pong buffer
                                                    |
                                            DMA half/full IRQ
                                                    |
                                            proc_thread wakes
                                                    |
                                         FFT + RMS + peak detect
                                                    |
                                               UART output
```

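The 44,098.6 Hz trigger rate is what integer timer division gives you when aiming for 44.1 kHz. A minimal sketch of the arithmetic, assuming TIM6 is clocked at the full 170 MHz with no prescaler (both are assumptions, and the helper names are hypothetical, not taken from `audio_capture.c`): the nearest divider is 3855, and 170 MHz / 3855 is about 44,098.57 Hz.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: TIM6 kernel clock equals the 170 MHz core clock, prescaler 0. */
#define TIM_CLK_HZ 170000000u

/* Nearest integer divider (ARR + 1) for a target update rate. */
static uint32_t tim_divider_for_rate(uint32_t tim_clk_hz, uint32_t target_hz)
{
    assert(target_hz > 0u);
    return (tim_clk_hz + target_hz / 2u) / target_hz;
}

/* Actual update rate in millihertz for a given divider (rounded down). */
static uint64_t tim_rate_millihertz(uint32_t tim_clk_hz, uint32_t divider)
{
    assert(divider > 0u);
    return ((uint64_t)tim_clk_hz * 1000u) / divider;
}

/* tim_divider_for_rate(TIM_CLK_HZ, 44100)  -> 3855
 * tim_rate_millihertz(TIM_CLK_HZ, 3855)    -> 44098573 (44,098.573 Hz) */
```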
Processing barely registers as CPU load: Zephyr's thread analyzer reports the processing thread at 0% CPU (rounded) and the idle thread at 98%. The Cortex-M4F at 170 MHz handles a 1024-point float FFT in well under a millisecond.

---

## Project Structure

```
audio_analysis/
├── src/
│   ├── main.c                 # thread setup, init sequence
│   ├── audio_capture.c/.h     # TIM6 + ADC1 + DMA (STM32 LL)
│   ├── audio_process.c/.h     # CMSIS-DSP FFT, RMS, peak detection
│   └── audio_output.c/.h      # UART formatting (summary, CSV, raw)
├── boards/
│   └── nucleo_g474re.overlay  # ADC channel + TIM6 devicetree config
├── prj.conf                   # Kconfig
├── plotter.py                 # Python live plotter (matplotlib + pyserial)
└── serial_debug.py            # Quick serial diagnostic tool
```

---

## Build and Flash

```bash
west build -p
west flash
```

The board is set in `CMakeLists.txt`, so no `-b` flag is needed.

## Serial Monitor

Open a terminal on the ST-Link VCP at 115200 baud. Output looks like:

```
RAW:2048,2100,2200,...
FFT:0.12,0.45,3.21,...
RMS: 0.3412 | Peak: 1000.0 Hz (bin 23)
Min: -0.8123 | Max: 0.7945
```

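A `RAW:` line in this format can be built with `snprintf`. The sketch below is illustrative only and is not the code in `audio_output.c`: the function name and signature are hypothetical, and `step` stands for the decimation factor (the plotter section mentions every 4th raw sample). It returns 0 rather than emitting a truncated line if the buffer is too small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write "RAW:v0,v<step>,v<2*step>,...\n" into out, keeping every step-th sample.
 * Returns the number of characters written, or 0 if out is too small. */
static size_t format_raw_line(char *out, size_t out_len,
                              const uint16_t *samples, size_t n, size_t step)
{
    assert(out != NULL && samples != NULL && step > 0u);

    int w = snprintf(out, out_len, "RAW:");
    if (w < 0 || (size_t)w >= out_len) {
        return 0u;
    }
    size_t pos = (size_t)w;

    for (size_t i = 0u; i < n; i += step) {
        /* Comma before every value except the first. */
        w = snprintf(out + pos, out_len - pos, "%s%u",
                     (i == 0u) ? "" : ",", (unsigned)samples[i]);
        if (w < 0 || (size_t)w >= out_len - pos) {
            return 0u;
        }
        pos += (size_t)w;
    }

    w = snprintf(out + pos, out_len - pos, "\n");
    if (w < 0 || (size_t)w >= out_len - pos) {
        return 0u;
    }
    return pos + (size_t)w;
}
```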
With the thread analyzer enabled, the output looks like:

```
Thread analyze:
 0x20000150          : STACK: unused 3608 usage 488 / 4096 (11 %); CPU: 0 %
                     : Total CPU cycles used: 200757869
 thread_analyzer     : STACK: unused 512 usage 512 / 1024 (50 %); CPU: 1 %
                     : Total CPU cycles used: 1373467186
 sysworkq            : STACK: unused 808 usage 216 / 1024 (21 %); CPU: 0 %
                     : Total CPU cycles used: 1446
 idle                : STACK: unused 320 usage 64 / 384 (16 %); CPU: 98 %
                     : Total CPU cycles used: 129000604158
 ISR0                : STACK: unused 1832 usage 216 / 2048 (10 %)
```

This is very useful for seeing how my threads are behaving, whether I am over-allocating stack, and where the bottlenecks are.

## Python Plotter

```bash
pip install pyserial matplotlib numpy PyQt5
python plotter.py COM5
```

Shows a live time-domain waveform and frequency spectrum. Data is decimated (every 4th raw sample, first 128 FFT bins) to fit within UART bandwidth at 115200 baud.

---

## What I Learned

**Devicetree and Kconfig** - Devicetree describes what hardware exists (ADC channel on PA0, TIM6 as a basic timer). Kconfig enables software features (CMSIS-DSP, FPU, thread analyzer). They answer different questions and you need both.

**Where Zephyr stops and HAL starts** - Zephyr's ADC API doesn't expose hardware timer triggering. For the TIM6 -> ADC1 trigger routing and DMA setup, I had to use STM32 LL functions directly. The devicetree still handles clock enablement and pin configuration, but the actual peripheral interconnection is done in C with register-level calls.

**DMA ping-pong buffering** - One contiguous buffer, DMA in circular mode, half-transfer and transfer-complete interrupts. While one half fills, the CPU processes the other. No memcpy, just pointer swapping.

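The pointer selection can be sketched in plain C. The names here (`dma_buf`, `ready_half`, `enum dma_event`) are made up for illustration and the DMA events are simulated so the snippet runs on a host; in the firmware the equivalent logic would sit in the DMA interrupt handler, which then gives the semaphore that wakes the processing thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_LEN 1024u

/* One contiguous buffer; circular-mode DMA writes through both halves forever. */
static uint16_t dma_buf[2u * FRAME_LEN];

/* Hypothetical event type standing in for the two DMA interrupt flags. */
enum dma_event { DMA_HALF_TRANSFER, DMA_TRANSFER_COMPLETE };

/* Return the half that DMA has just finished writing.
 * Half-transfer: first half is ready while DMA fills the second.
 * Transfer-complete: second half is ready while DMA wraps to the first. */
static const uint16_t *ready_half(enum dma_event ev)
{
    assert(ev == DMA_HALF_TRANSFER || ev == DMA_TRANSFER_COMPLETE);
    return (ev == DMA_HALF_TRANSFER) ? &dma_buf[0] : &dma_buf[FRAME_LEN];
}
```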
**CMSIS-DSP on Cortex-M4F** - `arm_rfft_fast_f32` is fast. The FPU matters. Needed to enable specific Kconfig modules (TRANSFORM, COMPLEXMATH, STATISTICS) for each function family used.

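Once the FFT magnitudes exist, turning them into a peak frequency is a small amount of arithmetic. The sketch below uses a plain loop as a stand-in for the CMSIS-DSP max search so that it runs on a host; the function names are hypothetical, and skipping the DC bin is an assumption, not necessarily what `audio_process.c` does. With a 1024-point FFT at 44,098.6 Hz the bins are about 43.07 Hz apart, so bin 23 is centred near 990.5 Hz.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the largest magnitude, skipping bin 0 (DC offset). */
static size_t peak_bin(const float *mag, size_t n_bins)
{
    assert(mag != NULL && n_bins > 1u);

    size_t best = 1u;
    for (size_t i = 2u; i < n_bins; i++) {
        if (mag[i] > mag[best]) {
            best = i;
        }
    }
    return best;
}

/* Centre frequency of an FFT bin: bin * fs / N. */
static float bin_to_hz(size_t bin, float sample_rate_hz, size_t fft_len)
{
    assert(fft_len > 0u);
    return (float)bin * sample_rate_hz / (float)fft_len;
}
```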
**UART is the bottleneck** - At 115200 baud you can push about 11 KB/s. Sending 1024 raw samples as ASCII text takes longer than the 500 ms between frames. Had to decimate the output and move serial reading to a background thread in Python.

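A rough budget check makes the bottleneck concrete. This sketch assumes 8N1 framing (10 bits on the wire per byte) and about 5 characters per raw sample including the comma; both are assumptions, and the helper names are hypothetical. Under those assumptions the 1024 raw samples alone need about 445 ms, before any FFT or summary lines are added, while every 4th sample needs about 112 ms.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 8N1 framing: 1 start + 8 data + 1 stop = 10 bits per byte. */
static uint32_t uart_bytes_per_sec(uint32_t baud)
{
    return baud / 10u;
}

/* Milliseconds needed to send n_bytes at the given baud rate, rounded up. */
static uint32_t uart_tx_ms(uint32_t n_bytes, uint32_t baud)
{
    uint32_t bps = uart_bytes_per_sec(baud);
    assert(bps > 0u);
    return (uint32_t)(((uint64_t)n_bytes * 1000u + bps - 1u) / bps);
}

/* uart_bytes_per_sec(115200)      -> 11520
 * uart_tx_ms(1024 * 5, 115200)    -> 445   (all raw samples)
 * uart_tx_ms(256 * 5, 115200)     -> 112   (every 4th sample) */
```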
**Thread analyzer** - Adding a few Kconfig lines gives you per-thread CPU% and stack usage. My processing thread uses 11% of its stack and rounds to 0% CPU. The idle thread runs 98% of the time. Good to know before adding more features.

**Timing** - One sample period is 22.7 us (the ADC/DAC tick). One buffer of 1024 samples is 23.2 ms (the processing deadline). Easy to confuse. The per-sample timing is pure hardware. The CPU only needs to keep up at the buffer level.

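Both timescales follow directly from the sample rate. A small sketch using integer math, taking the 44,098.6 Hz rate from the hardware table expressed in millihertz; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Period of one sample in nanoseconds (rate given in millihertz). */
static uint64_t sample_period_ns(uint64_t rate_millihertz)
{
    assert(rate_millihertz > 0u);
    return 1000000000000ull / rate_millihertz;
}

/* Time for DMA to fill one frame of frame_len samples, in microseconds.
 * This is the deadline the processing thread has to meet. */
static uint64_t frame_period_us(uint64_t frame_len, uint64_t rate_millihertz)
{
    assert(rate_millihertz > 0u);
    return (frame_len * 1000000000ull) / rate_millihertz;
}

/* sample_period_ns(44098600)        -> 22676  (about 22.7 us)
 * frame_period_us(1024, 44098600)   -> 23220  (about 23.2 ms) */
```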
---

## License

[Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)