qwen-2-5


@TheAhmadOsman: 3B model with Opus 4.5 performance: VibeThinker 3B (based on Qwen 2.5)

X AI KOLs Following · yesterday

Ahmad Osman announces VibeThinker 3B, a 3-billion-parameter model based on Qwen 2.5 that is claimed to perform comparably to Claude Opus 4.5, and predicts local deployment on consumer hardware.


How difficult is distilling?

Reddit r/LocalLLaMA · 2026-05-08

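The post discusses the difficulty and cost of model distillation, using the distillation of DeepSeek R1 into Llama 3 8B and Qwen 2.5 7B as examples, and asks why distilled models are not more common.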
