StepFun 3.7 Flash - Speed Benchmark on M5 Max
Summary
Benchmark results for the StepFun 3.7 Flash model running on an M5 Max via llama.cpp, showing prompt processing and token generation speeds across a range of context lengths.
Similar Articles
StepFun 3.7 Flash
StepFun released Step 3.7 Flash, a high-efficiency multimodal model optimized for real-world agentic tasks, featuring improved coding benchmarks (SWE-Bench Pro, Terminal-Bench) and compatibility with multiple agent harnesses.
StepFun Says Step 3.7 Flash Matches 97% of Claude Opus 4.6's Coding Performance at One-Ninth the Cost
StepFun's Step 3.7 Flash, a 198B sparse MoE model with 11B active parameters, matches 97% of Claude Opus 4.6's coding performance on SWE-Bench Verified at roughly one-ninth the cost, using an Advisor Mode strategy that reserves expensive frontier model calls for critical decision points.
Stepfun 3.7 Flash is very good
StepFun 3.7 Flash is a compact vision model that comes close to GLM 5.1 on aesthetics and reaches 80% of its 3D world understanding while using only 25% of the parameters, which makes it highly RAM-efficient.
@AdinaYakup: Step-3.7-Flash, new VL model from @StepFun_ai: 198B / 11B active MoE, 256K context, 3 reasoning levels, up to 400 tokens/sec
StepFun releases Step-3.7-Flash, a new large vision-language MoE model with 198B parameters (11B active), 256K context, and up to 400 tokens/sec inference speed.
@NielsRogge: Impressive release by StepFun, explore it at https://paperswithcode.co/paper/83892
StepFun releases Step 3.7 Flash, an open-weight model designed for agentic, coding, search, and multimodal tasks, achieving top scores on several benchmarks.