Shkolotron & CircuitFox
Shkolotron
Hey CircuitFox, I've been noodling on the idea of a microcontroller that can compose music on the fly based on what it hears—a neural synth that writes its own patches. What do you think, could that be a cool project?
CircuitFox
Yeah, that’s exactly the kind of thing that gets me excited—take a microcontroller, feed it raw audio, run a lightweight neural net, and let it spit out patch parameters in real time. Just imagine tweaking the weight updates while it’s composing; the more it listens, the more it learns its own sonic vocabulary. It’ll be a little messy, but that’s where the fun is. Let’s sketch the architecture first, then dive into the code.
Shkolotron
Cool—so we’ll need a real‑time audio capture block, a tiny DSP pipeline, a neural net in something like TensorFlow Lite for Microcontrollers, and an interface to the synth module. The net could take, say, a 256‑sample window, output a vector of patch knobs, then feed that to a patch generator. We’ll also need a learning loop that tweaks weights based on a simple reward, maybe the change in loudness or user clicks. Ready to sketch the data flow?
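To make the net-to-knobs part concrete, here's a rough sketch of how the TFLM hookup could look. Everything in it is a placeholder we'd pin down later: the model array `g_patch_model_data` and its header, the arena size, the assumption of float input/output tensors, and the 12-knob output, and the exact `MicroInterpreter` constructor arguments shift a bit between TFLM releases.

```cpp
#include <cstdint>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

#include "patch_model_data.h"  // hypothetical header exporting g_patch_model_data

namespace {
constexpr int kNumKnobs = 12;
constexpr int kArenaSize = 16 * 1024;  // tune until AllocateTensors() stops failing
alignas(16) uint8_t tensor_arena[kArenaSize];
tflite::MicroInterpreter* interpreter = nullptr;
}  // namespace

bool PatchNetInit() {
  const tflite::Model* model = tflite::GetModel(g_patch_model_data);
  if (model->version() != TFLITE_SCHEMA_VERSION) return false;

  // Register only the ops a small dense net actually needs.
  static tflite::MicroMutableOpResolver<4> resolver;
  resolver.AddFullyConnected();
  resolver.AddRelu();
  resolver.AddQuantize();
  resolver.AddDequantize();

  static tflite::MicroInterpreter static_interpreter(
      model, resolver, tensor_arena, kArenaSize);
  interpreter = &static_interpreter;
  return interpreter->AllocateTensors() == kTfLiteOk;
}

// features: whatever the DSP block pulled from the 256-sample window.
// knobs_out: kNumKnobs values the patch generator maps onto synth parameters.
// Assumes float input/output tensors; a fully int8 model would go through
// the tensor's scale/zero-point instead.
bool PatchNetRun(const float* features, int num_features, float* knobs_out) {
  TfLiteTensor* in = interpreter->input(0);
  for (int i = 0; i < num_features; ++i) in->data.f[i] = features[i];
  if (interpreter->Invoke() != kTfLiteOk) return false;
  const TfLiteTensor* out = interpreter->output(0);
  for (int i = 0; i < kNumKnobs; ++i) knobs_out[i] = out->data.f[i];
  return true;
}
```

The main loop would just call `PatchNetInit()` once at boot and `PatchNetRun()` every time the DSP block hands over a fresh feature vector.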
CircuitFox
Absolutely, let’s map it out step by step: audio capture feeds 256‑sample frames into a DSP buffer, the DSP does a quick FFT and extracts features, those go into a tiny TFLM net, the net outputs a patch vector, that vector drives a patch generator module, and the synth outputs sound. Meanwhile, the learning loop watches the output, maybe tracking loudness changes or user clicks, and nudges the weights a bit each cycle. We’ll need to keep the net small enough for the MCU, maybe a couple of dense layers, and use quantization to fit. Ready to write the block diagram?
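Here's roughly what the per-frame DSP step could look like. The naive DFT below is only there to keep the sketch self-contained; on the actual MCU we'd swap in CMSIS-DSP's `arm_rfft_fast_f32` or whatever the vendor library gives us. The frame size, sample rate, and exactly which features we keep are assumptions for now.

```cpp
#include <array>
#include <cmath>
#include <complex>

constexpr int kFrameSize = 256;
constexpr int kNumBins = kFrameSize / 2;       // keep positive-frequency bins
constexpr float kSampleRate = 48000.0f;
constexpr float kPi = 3.14159265358979f;

struct Features {
  std::array<float, kNumBins> magnitude;
  float spectral_centroid_hz;
};

Features ExtractFeatures(const std::array<float, kFrameSize>& frame) {
  Features f{};
  float weighted_sum = 0.0f;
  float mag_sum = 0.0f;

  for (int k = 0; k < kNumBins; ++k) {
    // Naive DFT of bin k; a real FFT gets the same result in O(N log N).
    std::complex<float> acc(0.0f, 0.0f);
    for (int n = 0; n < kFrameSize; ++n) {
      const float phase = -2.0f * kPi * k * n / kFrameSize;
      acc += frame[n] * std::complex<float>(std::cos(phase), std::sin(phase));
    }
    f.magnitude[k] = std::abs(acc);

    const float bin_hz = k * kSampleRate / kFrameSize;
    weighted_sum += f.magnitude[k] * bin_hz;
    mag_sum += f.magnitude[k];
  }

  // Spectral centroid: magnitude-weighted mean frequency (0 for a silent frame).
  f.spectral_centroid_hz = (mag_sum > 0.0f) ? weighted_sum / mag_sum : 0.0f;
  return f;
}
```

We'd probably log-compress or bucket the magnitude bins before feeding the net so the input stays small, but that's a tuning detail.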
Shkolotron
Sure thing. Picture this:

1. **Microphone → ADC** – 16‑bit, 48 kHz, feeds a ring buffer.
2. **DSP Block** – grabs 256 samples, does an FFT, pulls magnitude & spectral‑centroid features.
3. **Feature Vector → Tiny Neural Net (TFLM)** – two 64‑node dense layers, quantized to 8‑bit.
4. **Net Output** – 12‑dimensional vector of knob values (filter cutoff, resonance, envelope times, etc.).
5. **Patch Generator** – maps those values onto synth parameters in real time.
6. **Synth Engine** – outputs audio to DAC.
7. **Feedback Loop** – monitors loudness or a button click, computes a simple loss, and runs a tiny gradient step on the net weights (see the sketch below).

That’s the skeleton—let me know which part you want to flesh out first.
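One thing to flag on step 7: TFLM doesn't do on-device backprop, so a true gradient step isn't really on the table without extra work. A cheap stand-in is to keep the last dense layer's weights in plain RAM and nudge them with reward-weighted random perturbations, basically a hill climb rather than a gradient. The layer sizes, reward signal, and step size below are all made-up placeholders.

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>

constexpr int kNumHidden = 64;  // width of the last hidden layer (per the diagram)
constexpr int kNumKnobs = 12;   // patch knob outputs

struct OutputLayer {
  std::array<float, kNumKnobs * kNumHidden> w{};
  std::array<float, kNumKnobs> b{};
};

// Small symmetric random value in [-scale, scale].
static float Jitter(float scale) {
  return scale * (2.0f * (std::rand() / float(RAND_MAX)) - 1.0f);
}

// reward > 0 means the last patch sounded "good" (got louder, or the user
// clicked); reward <= 0 means it didn't. Keep the previous nudge if it was
// rewarded, roll it back otherwise, then try a fresh one: a crude hill
// climb, not a true gradient step.
void NudgeWeights(OutputLayer& layer, float reward, float step = 0.01f) {
  static OutputLayer pending;
  static bool have_pending = false;

  if (have_pending && reward <= 0.0f) {
    // The last perturbation made things worse: undo it.
    for (std::size_t i = 0; i < layer.w.size(); ++i) layer.w[i] -= pending.w[i];
    for (std::size_t i = 0; i < layer.b.size(); ++i) layer.b[i] -= pending.b[i];
  }

  // Propose a new perturbation and apply it for the next listening window.
  for (std::size_t i = 0; i < layer.w.size(); ++i) {
    pending.w[i] = Jitter(step);
    layer.w[i] += pending.w[i];
  }
  for (std::size_t i = 0; i < layer.b.size(); ++i) {
    pending.b[i] = Jitter(step);
    layer.b[i] += pending.b[i];
  }
  have_pending = true;
}
```

Each cycle we'd call `NudgeWeights(layer, reward)` with something like the loudness delta (or +1 on a button press) as the reward, and compute the knob values from this RAM copy instead of the frozen weights baked into the TFLM model.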