small-model

#small-model

@VincentLogic: NVIDIA's newly open-sourced LocateAnything model is really impressive. Earlier visual grounding models generated coordinates digit by digit (like squeezing toothpaste), which was slow and unstable. The new model uses "parallel bounding box decoding" to predict complete coordinates in one step, making it much faster and more accurate...

X AI KOLs Timeline ↗ · 2026-06-03 Cached

NVIDIA has open-sourced the LocateAnything model, which uses parallel bounding box decoding to predict complete coordinates in a single step rather than digit by digit, making it faster and more accurate. The model has only 3B parameters, runs on consumer-grade GPUs, and supports video object localization, UI recognition, OCR, and other tasks.

MiniCPM5 1B - what is it?

Reddit r/LocalLLaMA ↗ · 2026-06-01

MiniCPM5-1B is a new small language model from OpenBMB, apparently built from scratch with its own tokenizer and distinct behavior, generating excitement as a capable 1B model.

@maximelabonne: We're trending on @huggingface! Tbh, we undersold this model. It's a lot more capable at agentic tasks than I expected.…

X AI KOLs Following ↗ · 2026-06-01 Cached

Maxime Labonne shares that their model is trending on Hugging Face and is surprisingly capable at agentic tasks despite having only 1B active parameters.

MiniCPM5-1B Shows Why the Small-Model Race Isn't Over

Reddit r/ArtificialInteligence ↗ · 2026-05-31 Cached

MiniCPM5-1B is a 1B-parameter model from OpenBMB that achieves impressive scores on AIME 2025 and τ2-Bench Telecom, outperforming larger models. It offers both fast and reasoning modes from a single checkpoint, the result of a three-stage post-training process: supervised fine-tuning, reinforcement learning, and on-policy distillation.

@akshay_pachaar: - <1B params - supports 91 languages - 5 pages/s on RTX 5090 - runs on CPU, GPU, MPS - 83.3% olmocr bench score (top un…

X AI KOLs Following ↗ · 2026-05-30 Cached

Surya OCR is an open-source OCR model with fewer than 1B parameters. It supports 91 languages, processes 5 pages/s on an RTX 5090, and scores 83.3% on the olmocr bench, the top result among models under 3B parameters.

@sitinme: A 26M parameter model can do Function Call, and is even stronger than Qwen-0.6B? This team's out-of-the-box approach is too wild! Nowadays, large models have ever-growing parameter counts, but one question has never been seriously considered: does calling a tool really need hundreds of billions of parameters? Think about it, when you say 'Check today's...'

X AI KOLs Timeline ↗ · 2026-05-26 Cached

The Cactus team distilled Gemini 3.1 into Needle, a specialized model with only 26M parameters built specifically for function calling. It outperforms Qwen-0.6B on this task, demonstrating the potential of small models in tool-calling scenarios.

MiniCPM5-1B

Reddit r/LocalLLaMA ↗ · 2026-05-25 Cached

OpenBMB releases MiniCPM5-1B, a dense 1B Transformer model achieving SOTA among open-source 1B-class models, designed for on-device deployment with hybrid reasoning and long-context support.

HRM Seems To Be Going Off Right Now

Reddit r/LocalLLaMA ↗ · 2026-05-19 Cached

Sapient Intelligence has released HRM-Text, a 1B-parameter text generation model trained on only 40 billion tokens (0.04 trillion) at a cost of approximately $1,000. It reportedly surpasses much larger models trained on 100 to 1,000 times more data on multiple reasoning benchmarks, which the post frames as the start of a new paradigm for AI training.
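As a rough sanity check on the figures in this summary, the arithmetic can be sketched in Python. The variable names are illustrative, and the numbers come from the summary itself, not from any official training report.

```python
# Figures quoted in the summary (not independently verified).
hrm_tokens = 40_000_000_000   # 0.04 trillion training tokens
training_cost_usd = 1_000     # "approximately $1000"

# "100-1000 times more data" implies this token range for the larger models.
larger_low = 100 * hrm_tokens
larger_high = 1_000 * hrm_tokens

# Implied training cost per billion tokens.
cost_per_billion = training_cost_usd / (hrm_tokens / 1_000_000_000)

print(f"HRM-Text tokens: {hrm_tokens / 1e9:.0f}B")
print(f"Larger models: {larger_low / 1e12:.0f}T to {larger_high / 1e12:.0f}T tokens")
print(f"Implied cost: ${cost_per_billion:.0f} per billion tokens")
```

Under these assumptions, the comparison models would have seen between 4 and 40 trillion tokens, and the quoted budget works out to about $25 per billion training tokens.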

@HappyyPablo: open sourcing Marlin-2B a tiny VLM to extract structured information from videos Marlin is finetuned for two questions …

X AI KOLs Timeline ↗ · 2026-05-19 Cached

Marlin-2B is a newly open-sourced tiny VLM for extracting structured information from videos, fine-tuned to answer two questions: what is happening, and when. The author describes it as the best open model in its weight class and competitive with Gemini-2.5-flash.

I Made LLMs Play Texas Hold’em. The Smallest Model Beat a ~1T Model by Being Too Dumb to Fold

Reddit r/singularity ↗ · 2026-05-19

In an experiment where six LLMs played Texas Hold'em poker, a tiny 1.2B model won twice thanks to its aggressive 'never fold' strategy, showing how a game's format can favor simpler models. The author built a poker engine and agent framework called Hive and invites community feedback.

@KrzakalaF: LightOn getting GPT-5-level Deep Research retrieval performance with a 150M-parameter late-interaction model is honestl…

X AI KOLs Following ↗ · 2026-05-13 Cached

LightOn achieves GPT-5-level deep research retrieval performance using a 150M-parameter late-interaction model, a remarkable feat.

@berryxia: Small model, big wisdom? It's now real! A 7B small model now acts as the boss of top large models like GPT-5, Claude Sonnet 4, Gemini 2.5 Pro. A new paper shows an RL-trained 7B model learned to write natural language subtasks, assign them to different models, precisely...

X AI KOLs Timeline ↗ · 2026-05-11

A new paper proposes training a small 7B model with reinforcement learning to act as a task scheduler that automatically decomposes work into subtasks and assigns them to frontier models such as GPT-5 and Claude. The resulting system surpasses individual frontier models on several hard benchmarks, suggesting that end-to-end reward learning can replace manual prompt engineering and multi-agent pipeline design.

CyberSecQwen-4B: Why Defensive Cyber Needs Small, Specialized, Locally-Runnable Models

Hugging Face Blog ↗ · 2026-05-08 Cached

CyberSecQwen-4B is a small, specialized 4B parameter model fine-tuned for defensive cybersecurity tasks, designed to run locally on a single GPU, addressing privacy, cost, and air-gapped deployment needs.

@heyrobinai: THE ENTIRE AI INDUSTRY JUST GOT HUMILIATED a tiny model trained in just a few hours on a single graphics card is planni…

X AI KOLs Timeline ↗ · 2026-05-08

Yann LeCun's team releases LeWorldModel, a tiny 15M-parameter physics model trained on a single GPU in hours that outperforms billion-dollar foundation models in planning speed and physical plausibility, challenging the dominant scaling paradigm.

@j_golebiowski: A 1.7B parameter model beats GLM-5 (744B) on Schema Guided Dialogue — even when the training data is corrupted. That's …

X AI KOLs Timeline ↗ · 2026-04-21 Cached

A 1.7B-parameter model surpasses the 744B-parameter GLM-5 on Schema Guided Dialogue despite corrupted training data, while being roughly 437× smaller.
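The 437× figure follows directly from the two parameter counts quoted here. A minimal Python check, with illustrative variable names, looks like this:

```python
# Parameter counts quoted in the summary, in billions.
small_params_b = 1.7
glm5_params_b = 744

# How many times larger GLM-5 is than the small model.
size_ratio = glm5_params_b / small_params_b

print(f"GLM-5 is about {size_ratio:.1f}x larger")
```

The ratio comes out at about 437.6, so the quoted 437× is the truncated value, not the rounded one.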

OpenAI o3-mini

OpenAI Blog ↗ · 2025-01-31 Cached

OpenAI releases o3-mini, a cost-efficient reasoning model with strong STEM capabilities, available in ChatGPT and API with support for function calling, structured outputs, and three reasoning effort levels. The model matches o1 performance in math and coding while being faster and cheaper, with free plan users gaining access to a reasoning model for the first time.

GPT-4o mini: advancing cost-efficient intelligence

OpenAI Blog ↗ · 2024-07-18 Cached

OpenAI releases GPT-4o mini, a cost-efficient small model priced at 15 cents per million input tokens, 60% cheaper than GPT-3.5 Turbo. It scores 82% on MMLU and outperforms competitors such as Gemini Flash and Claude Haiku on reasoning, math, and coding tasks.
