training-recipe

Qwen-Image-Flash: Beyond Objective Design

Hugging Face Daily Papers · 5d ago

This paper investigates training recipes for few-step distillation of visual generative models, using Qwen-Image-2.0 as a case study. It reveals non-obvious behaviors and proposes Qwen-Image-Flash.

MiniCPM5-1B Shows Why the Small-Model Race Isn't Over

Reddit r/ArtificialInteligence · 2026-05-31

MiniCPM5-1B is a 1B-parameter model from OpenBMB that scores well on AIME 2025 and τ2-Bench Telecom, outperforming larger models. A single checkpoint provides both a fast mode and a reasoning mode, the result of a three-stage post-training process: supervised fine-tuning, reinforcement learning, and on-policy distillation.
