Tag
A technique called luce spark allows Qwen 35B-a3B MoE model to run on a 16GB GPU (like RTX 3090) by learning which experts are frequently used and streaming the rest from RAM, achieving ~100 tok/s without VRAM bottleneck.
A novel two-pass ablation technique (ASPA) applied to Gemma-4-12B achieves zero refusal rate with zero capability loss, using source-tethering to recover benchmark performance.
Fern announces a new regularization technique that solves the SolidGoldMagikarp stability problem, with details to follow in the thread.
Explains how to use Claude to perform a premortem, a technique by Daniel Kahneman, to stress-test plans by imagining they have already failed.
The author describes a technique called 'Stream of Consciousness Driven Development' where during pair programming, they write a detailed markdown file exploring a problem and solution before making changes, to ensure both partners fully understand the reasoning.