luce-spark


@sudoingX: anyone running a 16gb card, stop scrolling. @pupposandro and @davideciffa got qwen 35b-a3b down to 13.3gb, measured on …

X AI KOLs Timeline · yesterday

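A technique called Luce Spark lets the Qwen 35B-A3B MoE model run on a 16 GB GPU by learning which experts are used most often and streaming the rest from system RAM. The reported throughput is ~100 tok/s, with VRAM capacity no longer the limiting factor. The RTX 3090 is named as an example card, but it has 24 GB of VRAM, not 16 GB.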
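To make the idea of keeping frequently used experts on the GPU concrete, here is a minimal toy sketch in Python. It is not the Luce Spark implementation, whose details are not given here. The function names, the routing trace, and the per-expert size and VRAM budget are all hypothetical values chosen for illustration. The sketch counts how often each expert is selected, keeps the most-used ones within a fixed budget, and reports what fraction of activations would then be served from VRAM rather than streamed from RAM.

```python
from collections import Counter


def pick_resident_experts(routing_trace, expert_size_gb, vram_budget_gb):
    """Choose which expert ids to keep in VRAM: the most frequently used
    ones, as many as fit in the budget. All sizes are hypothetical."""
    counts = Counter(routing_trace)
    max_resident = int(vram_budget_gb // expert_size_gb)
    return {expert for expert, _ in counts.most_common(max_resident)}


def hit_rate(routing_trace, resident):
    """Fraction of expert activations that find their expert in VRAM."""
    if not routing_trace:
        return 0.0
    hits = sum(1 for expert in routing_trace if expert in resident)
    return hits / len(routing_trace)


# Hypothetical routing trace: the expert id chosen at each step.
trace = [0, 1, 0, 2, 0, 1, 3, 0, 1, 0]

# Assumed numbers for illustration only: 0.5 GB per expert, 1 GB budget.
resident = pick_resident_experts(trace, expert_size_gb=0.5, vram_budget_gb=1.0)

print("resident experts:", sorted(resident))
print("hit rate:", hit_rate(trace, resident))
```

In this toy trace two experts account for most activations, so a small resident set covers most lookups. A skewed usage distribution of that kind is what would make streaming the remaining experts from RAM affordable.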
