Harvey partnered with Applied Compute to train a legal agent, optimizing the agent stack and post-training the GLM-5.1 model using reward signals from their Legal Agent Benchmark.
An updated GPT-5.5 Cyber model surpasses Mythos 5 in the CyberGym benchmark.
Cursor AI announced three updates from its Compile keynote, including training a new model in collaboration with SpaceX.
NVIDIA's nvfp4 quantized version of Qwen 3.6 35B is recommended over the Unsloth variant, offering better performance. The model is available on HuggingFace for use in AI applications.
A tweet reacts to a claim that AEON-7 is the best model distill, leaving even the creator speechless.
OpenAI releases the full version of GPT-5.5-Cyber, a cybersecurity-focused AI model with state-of-the-art performance on CyberGym, and announces efforts to improve security through Patch The Planet and Codex Security.
Qwopus 3.6 27B-Coder-Compat is a new GGUF release with compatibility fixes for various harnesses, reducing looping and improving thinking stability. It can generate full HTML games and is suitable for local deployment.
According to leaked information, OpenAI's GPT‑5.6 is nearly complete, expected to have a 1.5 million context window, priced at only one-third of Fable 5, with significantly improved front-end generation quality. Release may be delayed until after the 28th.
SingGuard is a multimodal guardrail system from Ant Group that treats safety policy as an input, allowing dynamic adaptation via natural language. It is released under Apache 2.0 and covers text and image modalities.
PaddlePaddle releases Unlimited-OCR, a new OCR model using Reference Sliding Window Attention (R-SWA) to maintain constant KV cache during decoding, achieving 93% on OmniDocBench and a 6% improvement over previous methods.
A fine-tuned version of Qwen3.5-MoE called NEX-N2-mini reportedly fixes overthinking issues seen in Qwen 3.5 and 3.6 models.
Release of Gemma4-12B-QAT Uncensored Balanced, a fine-tuned uncensored model with a multi-token-prediction draft head for ~60% faster speculative decoding, optimized for llama.cpp and offering vision support.
Running GLM5.2 with 7 trillion tokens on a budget setup using 4x RTX 3090 GPUs and 192GB RAM.
Ai2 released TMax 27B, a 27B terminal agent that achieves 42.7% on Terminal Bench 2.0, rivaling models 40 times its size.
PP-OCRv6 is the latest generation of PaddleOCR's universal OCR model family, offering three tiers from 1.5M to 34.5M parameters, supporting 50 languages, and achieving significant accuracy improvements over previous versions.
The open-source OCR model Unlimited OCR, based on DeepSeek OCR, achieves 93.23 on OmniDocBench v1.5 with only 3B parameters, outperforming DeepSeek OCR, Gemini 2.5, and others.
The Gemma 4 QAT 31B model demonstrates improved behavior with KV cache quantization, suggesting enhanced inference efficiency.
Surya 2 is a 650M parameter OCR model achieving 83.3% on olmocr, claiming to be the most accurate small OCR model thanks to character tokenization which improves accuracy and model size.
Baidu has open-sourced the Unlimited OCR model, which uses a Reference Sliding Window Attention (R-SWA) mechanism to parse documents up to 32K context in a single pass, eliminating the need for page-by-page inference.
GLM 5.2 is a new open-weights model from Z.ai, compared against Claude Opus in a 3D game coding task. Opus performed faster and cleaner, but GLM 5.2 offers compelling cost and accessibility advantages.