flash-streaming


@tom_doerr: Runs 35B models on 16GB RAM Macs https://github.com/walter-grace/mac-code…

X AI KOLs Timeline · 2026-05-11

A tool for running large language models such as Qwen3.5-35B on Macs with 16 GB of RAM by streaming model weights from the SSD, reaching up to 30 tok/s in its best-performing configuration.
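To make the idea of streaming weights from SSD concrete, here is a minimal sketch of the general technique, not the linked project's actual code. It assumes a toy model whose layers are square matrices stored back to back in one flat file; the function names, file layout, and `tanh` layer are all hypothetical. Only one layer is mapped at a time, so the resident set stays close to a single layer rather than the full model.

```python
import os
import tempfile
import numpy as np


def write_dummy_weights(path, n_layers, dim, dtype=np.float32):
    """Write n_layers random (dim x dim) matrices back to back into one file.

    Hypothetical stand-in for a real checkpoint; returns the in-memory copy.
    """
    rng = np.random.default_rng(0)
    weights = rng.standard_normal((n_layers, dim, dim)).astype(dtype)
    weights.tofile(path)
    return weights


def stream_forward(path, x, n_layers, dim, dtype=np.float32):
    """Run a toy forward pass, mapping one layer's bytes from disk at a time."""
    layer_bytes = dim * dim * np.dtype(dtype).itemsize
    for i in range(n_layers):
        # Map only layer i; the OS pages its bytes in from disk on access.
        w = np.memmap(path, dtype=dtype, mode="r",
                      offset=i * layer_bytes, shape=(dim, dim))
        x = np.tanh(np.asarray(w) @ x)
        del w  # drop the mapping so its pages can be reclaimed
    return x


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        weights_path = os.path.join(tmp, "weights.bin")
        write_dummy_weights(weights_path, n_layers=4, dim=8)
        out = stream_forward(weights_path, np.ones(8, dtype=np.float32),
                             n_layers=4, dim=8)
        print(out.shape)
```

A real implementation would also have to handle quantized formats, caching of frequently used weights, and overlapping disk reads with compute, none of which this sketch attempts.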
