A community member details a custom PC build that uses discontinued Intel Optane Persistent Memory to run the 1-trillion-parameter Kimi K2.5 model locally at roughly 4 tokens per second via llama.cpp.