DCGAN inference on a microcontroller: 12.6M parameters, 512KB SRAM, 26-second generation, pure C [P]

Reddit r/MachineLearning 05/25/26, 06:22 PM Papers

risc-v microcontroller dcgan inference quantization edge-ai tiny-machine-learning

Summary

Demonstrates running a DCGAN with 12.6M int8 quantized parameters on a low-cost RISC-V microcontroller (CH32H417), generating 64x64 cat faces in 26 seconds using pure C inference and quantum entropy sampling.

Just thought I'd share: I ran a DCGAN on a dual-core RISC-V microcontroller, the CH32H417, generating 64x64 cat faces. This is a new RISC-V MCU, so no TFLite, no CMSIS-NN and no external memory. It's a pure C inference engine, bit-identical to PyTorch reference outputs.

The model is 12.6M parameters with int8 per-channel quantization. Intermediate activations are stored in DTCM, and layer weights stream from the SD card using double buffering, so the next layer loads while the current one computes. The total available SRAM is 512KB, shared between both cores and the inference engine. Time to generate one image is 26 seconds. It could be faster, but SD card access speed is the bottleneck rather than computation.

The z vector is seeded from 200 bytes of quantum random data (ANU QRNG vacuum fluctuation source), transformed via Box-Muller into the latent vector. This is not strictly necessary for image quality, but it was a fun constraint for the art installation side of the project. The generated cat is classified as "motivated" or "demotivated" based on a single quantum bit, which selects from a phrase bank with four fragment slots combining into one of 131,072 possible spoken verdicts, output through the onboard DAC...

As far as I can tell, nobody else is running GAN inference on these low-cost RISC-V microcontrollers. ARM has the CMSIS-NN ecosystem for this kind of thing, but RISC-V MCUs, especially in the CH32 space, have nothing, so the entire inference engine is written from scratch.

Paper: [TinyGAN: Generative Image Synthesis on a RISC-V Microcontroller with Quantum Entropy Sampling](https://zenodo.org/records/20371371)

Similar Articles

Running a 13M ASR conformer on a microcontroller

Running a 28.9M parameter LLM on an $8 microcontroller

CPU-only inference on a Celeron N5095 SBC: 6 models from 0.6B to 8B, benchmarked

Qwen3.6 27B Pure Quant: 40 tok/s on 16 GB VRAM

Training a vision model from scratch on iPod touch 4 images

Trained a DCGAN from scratch on 350 photos of a red solo cup taken with an iPod touch 4, producing results reminiscent of early DALL-E.