@PrajwalTomar_: Most Flash models stop at cheaper and faster. This one is built to actually finish the job. I ran Step 3.7 Flash on a r…
Summary
Step 3.7 Flash is a compact model that combines vision, live data retrieval, and code generation. In the author's run it built a working dashboard from a screenshot in about 3 to 4 minutes for around 50 cents per session, with a couple of nudges along the way.
Cached at: 06/26/26, 06:14 PM
Most Flash models stop at cheaper and faster. This one is built to actually finish the job.
I ran Step 3.7 Flash on a real task: screenshot of a dashboard in, working app with live crypto prices out.
It read the screenshot, pulled the live prices off the web itself, wrote the frontend and the backend, and ran it. No separate vision module, no me pasting in any data.
The numbers from my run:
→ working dashboard in about 3 to 4 minutes
→ around 50 cents for the whole session
→ vision, live data, and code all from one model
That’s the part that got me. One small, cheap model did the seeing, the searching, and the coding, and actually finished the job.
It’s Step 3.7 Flash from @StepFun_ai.
Honest take: I had to nudge it a couple of times to get everything right. For a fast, cheap model finishing a real multi-step build, I’ll take that.
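For a rough sense of scale, here is a back-of-the-envelope sketch using only the figures quoted from the run. The $10 budget and the flat per-session cost are assumptions for illustration, and the helper names are hypothetical; real costs would vary with prompt length and the number of nudges.

```python
# Back-of-the-envelope figures from the run described above.
# Cost and duration are the post's rough numbers, not measured values.
SESSION_COST_USD = 0.50          # "around 50 cents for the whole session"
DURATION_MIN_RANGE = (3.0, 4.0)  # "about 3 to 4 minutes"


def cost_per_minute(cost_usd: float, minutes: float) -> float:
    """Average spend per minute of wall-clock time for one session."""
    return cost_usd / minutes


def sessions_per_budget(budget_usd: float, cost_usd: float) -> int:
    """Whole sessions that fit in a budget, assuming a flat per-session cost."""
    return int(budget_usd // cost_usd)


fast, slow = DURATION_MIN_RANGE
print(
    f"cost per minute: ${cost_per_minute(SESSION_COST_USD, slow):.3f} "
    f"to ${cost_per_minute(SESSION_COST_USD, fast):.3f}"
)
# The $10 budget is a hypothetical figure chosen for illustration.
print(f"sessions per $10: {sessions_per_budget(10.0, SESSION_COST_USD)}")
```

At these numbers, a session works out to roughly 12 to 17 cents per minute, and a $10 budget would cover about 20 similar builds if the cost held steady.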
Similar Articles
Step 3.7 Flash
Step 3.7 Flash is a fast agent model designed to see and act in real time.
Stepfun 3.7 Flash is very good
StepFun 3.7 Flash is a compact vision model that achieves aesthetics close to GLM 5.1 and 80% of its 3D world understanding while using only 25% of the parameters, making it highly RAM-efficient.
StepFun 3.7 Flash
StepFun released Step 3.7 Flash, a high-efficiency multimodal model optimized for real-world agentic tasks, featuring improved coding benchmarks (SWE-Bench Pro, Terminal-Bench) and compatibility with multiple agent harnesses.
StepFun Says Step 3.7 Flash Matches 97% of Claude Opus 4.6's Coding Performance at One-Ninth the Cost
StepFun's Step 3.7 Flash, a 198B sparse MoE model with 11B active parameters, matches 97% of Claude Opus 4.6's coding performance on SWE-Bench Verified at roughly one-ninth the cost, using an Advisor Mode strategy that reserves expensive frontier model calls for critical decision points.
@AdinaYakup: Step-3.7-Flash, new VL model from @StepFun_ai: 198B / 11B active MoE, 256K context, 3 reasoning levels, up to 400 tokens/sec
StepFun releases Step-3.7-Flash, a new large vision-language MoE model with 198B parameters (11B active), 256K context, and up to 400 tokens/sec inference speed.