Tag
The article questions why ternary language models like BitNet have not scaled beyond 2B parameters, given their initial promise, and discusses the apparent lack of progress from open-weight AI labs.
A discussion of the lack of a community project for training LLMs from scratch on consumer hardware (8 GB VRAM) using modern techniques like BitNet and Muon, proposing a collaborative effort to build one.
New BitCPM4-CANN models (1B, 3B, 8B) from OpenBMB have been released on Hugging Face; testing awaits llama.cpp support.