Tag
NVIDIA has open-sourced the LocateAnything model, using parallel bounding box decoding technology to predict complete coordinates in one step, fast and accurate. The model has only 3B parameters and can run on consumer-grade GPUs, supporting video object localization, UI recognition, OCR, and other tasks.
MiniCPM5-1B is a new small language model from OpenBMB, apparently built from scratch with its own tokenizer and distinct behavior, generating excitement as a capable 1B model.
Maxime Labonne shares that their model is trending on Hugging Face and is surprisingly capable at agentic tasks despite having only 1B active parameters.
MiniCPM5-1B is a 1B parameter model from OpenBMB that achieves impressive scores on AIME 2025 and τ2-Bench Telecom, outperforming larger models. It features both fast and reasoning modes from a single checkpoint, enabled by a three-stage post-training process including supervised fine-tuning, reinforcement learning, and on-policy distillation.
Surya OCR is a state-of-the-art open-source OCR model with less than 1B parameters, supporting 91 languages and achieving top benchmark scores under 3B params.
The Cactus team distilled Gemini 3.1 into a specialized model called Needle with only 26M parameters, specifically for Function Call. Its performance surpasses Qwen-0.6B, demonstrating the potential of small models in tool calling scenarios.
OpenBMB releases MiniCPM5-1B, a dense 1B Transformer model achieving SOTA among open-source 1B-class models, designed for on-device deployment with hybrid reasoning and long-context support.
Sapient Intelligence has released HRM-Text, a 1B parameter text generation model, trained on only 0.04 trillion tokens (costing approximately $1000), surpassing much larger models trained on 100-1000 times more data on multiple reasoning benchmarks, marking the beginning of a new paradigm for AI training.
Open-sourcing Marlin-2B, a tiny VLM for extracting structured information from videos, fine-tuned to answer 'what is happening and when'. Best open model in its weight class, competitive with Gemini-2.5-flash.
An experiment where six LLMs played Texas Hold'em poker; a tiny 1.2B model won twice due to its aggressive 'never fold' strategy, highlighting how format can favor simpler models. The author built a poker engine and agent framework called Hive, and invites community feedback.
LightOn achieves GPT-5-level deep research retrieval performance using a 150M-parameter late-interaction model, a remarkable feat.
A new paper proposes training a 7B small model via reinforcement learning as a task scheduler, automatically decomposing subtasks and assigning them to top models like GPT-5 and Claude. It surpasses individual frontier models on several hard benchmarks, demonstrating that end-to-end reward learning can effectively replace manual prompt engineering and multi-agent pipeline design.
CyberSecQwen-4B is a small, specialized 4B parameter model fine-tuned for defensive cybersecurity tasks, designed to run locally on a single GPU, addressing privacy, cost, and air-gapped deployment needs.
Yann LeCun's team releases LeWorldModel, a tiny 15M-parameter physics model trained on a single GPU in hours that outperforms billion-dollar foundation models in planning speed and physical plausibility, challenging the dominant scaling paradigm.
A 1.7B parameter model surpasses 744B GLM-5 on Schema Guided Dialogue despite corrupted training data, showing 437× size efficiency.
OpenAI releases o3-mini, a cost-efficient reasoning model with strong STEM capabilities, available in ChatGPT and API with support for function calling, structured outputs, and three reasoning effort levels. The model matches o1 performance in math and coding while being faster and cheaper, with free plan users gaining access to a reasoning model for the first time.
OpenAI releases GPT-4o mini, a cost-efficient small model priced at 15 cents per million input tokens, 60% cheaper than GPT-3.5 Turbo, with strong performance on MMLU (82%) and outperforming competitors like Gemini Flash and Claude Haiku on reasoning, math, and coding tasks.