high-speed

#high-speed

5.6 Sol is coming to Cerebras at 750 tokens per second in July

Reddit r/singularity ↗ · 19h ago

The 5.6 Sol model is coming to Cerebras hardware in July, offering inference at 750 tokens per second.


#high-speed

@AdinaYakup: Step-3.7-Flash, a new VL model from @StepFun_ai: 198B / 11B active MoE, 256K context, 3 reasoning levels, up to 400 tokens/sec

X AI KOLs Timeline ↗ · 2026-05-29 Cached

StepFun releases Step-3.7-Flash, a new large vision-language MoE model with 198B parameters (11B active), 256K context, and up to 400 tokens/sec inference speed.

