Tag
A developer forked ik_llama.cpp and added a '--numa mirror' mode that duplicates model weights and the KV cache across NUMA nodes to maximize multi-socket CPU inference performance. The developer has shared benchmarks and is seeking testers.