small-model

#small-model

CyberSecQwen-4B: Why Defensive Cyber Needs Small, Specialized, Locally-Runnable Models

Hugging Face Blog · 21h ago

CyberSecQwen-4B is a specialized 4B-parameter model fine-tuned for defensive cybersecurity tasks. It is designed to run locally on a single GPU, which addresses privacy, cost, and air-gapped deployment needs.

#small-model

@heyrobinai: THE ENTIRE AI INDUSTRY JUST GOT HUMILIATED a tiny model trained in just a few hours on a single graphics card is planni…

X AI KOLs Timeline · yesterday

Yann LeCun's team releases LeWorldModel, a tiny 15M-parameter physics model trained on a single GPU in hours that outperforms billion-dollar foundation models in planning speed and physical plausibility, challenging the dominant scaling paradigm.

#small-model

@j_golebiowski: A 1.7B parameter model beats GLM-5 (744B) on Schema Guided Dialogue — even when the training data is corrupted. That's …

X AI KOLs Timeline · 2026-04-21

A 1.7B-parameter model surpasses the 744B-parameter GLM-5 on Schema Guided Dialogue even when the training data is corrupted, despite being roughly 437× smaller (744B / 1.7B ≈ 437).

#small-model

OpenAI o3-mini

OpenAI Blog · 2025-01-31

OpenAI releases o3-mini, a cost-efficient reasoning model with strong STEM capabilities, available in ChatGPT and API with support for function calling, structured outputs, and three reasoning effort levels. The model matches o1 performance in math and coding while being faster and cheaper, with free plan users gaining access to a reasoning model for the first time.

#small-model

GPT-4o mini: advancing cost-efficient intelligence

OpenAI Blog · 2024-07-18

OpenAI releases GPT-4o mini, a cost-efficient small model priced at 15 cents per million input tokens, 60% cheaper than GPT-3.5 Turbo. It scores 82% on MMLU and outperforms competitors such as Gemini Flash and Claude Haiku on reasoning, math, and coding tasks.
