MiniMax M3 is starting to roll out on the API
Summary
MiniMax is rolling out its M3 model, which features a 1,000,000-token context window, on the API.
Similar Articles
What breaks the most when you call LLM APIs in production?
A discussion of common errors when calling LLM APIs in production, including rate limits, format mismatches, malformed responses, context overflow, model deprecation, and silent failures, with statistics from Datadog and a cited paper.
Spent the weekend on the Apodex 4b, plus a quick look at the 35b mini
The author tests the Apodex 4B-SFT and 35B mini models, finding the 4B-SFT surpasses other 4B models in multi-hop search tasks without hallucination, and notes the design philosophy of separating answer checking from generation.
🚀 PP-OCRv6 is officially released!
PaddleOCR releases PP-OCRv6, a new OCR model series with sizes from 1.5M to 34.5M parameters, offering improved accuracy and faster inference, supporting 50 languages and new scenarios such as PCB and CAD drawings, under the Apache 2.0 open-source license.
Minimax M3 sm_120
MiniMax's M3 model requires vLLM updates to support the sm_120 compute capability, as the current repo supports only sm_100.
I think long context agents are failing in a very boring way
An opinion piece arguing that long context windows don't equate to memory and that agent failures are often mundane, like forgetting constraints or rereading files, emphasizing that reliability depends on context architecture decisions.