The author has open-sourced catalyst-brain, a novel KV-cache solution that they claim dramatically reduces RAM usage for local models and could potentially enable infinite context windows.