b9180 llama.cpp MTP landed

Reddit r/LocalLLaMA Tools

Summary

llama.cpp version b9180 has been released, featuring Multi-Token Prediction (MTP). The release is marked by successful builds and developer relief.

All across the land, many monitors are showing green cmake with giddy anticipation. We should all send GG and the boys something so they can take a break and grab dinner, as I'm sure this was a monster headache! [https://github.com/ggml-org/llama.cpp/releases/tag/b9180](https://github.com/ggml-org/llama.cpp/releases/tag/b9180)
