CyberSecQwen-4B is a small, specialized 4B-parameter model fine-tuned for defensive cybersecurity tasks. It is designed to run locally on a single GPU, addressing privacy, cost, and air-gapped deployment needs.
Yann LeCun's team releases LeWorldModel, a tiny 15M-parameter physics model trained on a single GPU in hours. It outperforms billion-dollar foundation models in planning speed and physical plausibility, challenging the dominant scaling paradigm.
A 1.7B-parameter model surpasses the 744B-parameter GLM-5 on Schema-Guided Dialogue despite corrupted training data, a 437× difference in model size.
OpenAI releases o3-mini, a cost-efficient reasoning model with strong STEM capabilities, available in ChatGPT and API with support for function calling, structured outputs, and three reasoning effort levels. The model matches o1 performance in math and coding while being faster and cheaper, with free plan users gaining access to a reasoning model for the first time.
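The reasoning-effort setting mentioned above can be sketched as a request payload. This is a minimal sketch assuming the standard Chat Completions JSON shape, where `reasoning_effort` accepts `"low"`, `"medium"`, or `"high"`; the helper `build_o3_mini_request` is a hypothetical name introduced here for illustration.

```python
import json

def build_o3_mini_request(prompt: str, effort: str = "medium") -> dict:
    """Build a Chat Completions payload for o3-mini (hypothetical helper).

    Assumes the standard OpenAI request shape with the reasoning_effort
    parameter controlling how much thinking the model does before answering.
    """
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unknown reasoning effort: {effort}")
    return {
        "model": "o3-mini",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }

# Build a high-effort request and show the serialized payload.
payload = build_o3_mini_request("Prove that sqrt(2) is irrational.", effort="high")
print(json.dumps(payload, indent=2))
```

Higher effort levels trade latency and cost for deeper reasoning, so "low" suits quick lookups while "high" suits hard math or coding problems.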
OpenAI releases GPT-4o mini, a cost-efficient small model priced at 15 cents per million input tokens, 60% cheaper than GPT-3.5 Turbo. It scores 82% on MMLU and outperforms competitors such as Gemini Flash and Claude Haiku on reasoning, math, and coding tasks.