Deepseek v4 Flash is pretty amazing, about to buy a $25k computer

Reddit r/openclaw 05/09/26, 03:30 PM News

Summary

The author praises DeepSeek V4 Flash for enabling high-performance local LLM deployment, leading to a $25k hardware purchase to serve clients with strict data privacy needs.

My customers have confidential data; they won't even use AWS. I've been trying to solve this problem for them, and they are more than fine with buying an on-premise device for local LLMs + AI agents. Up until today, I have been extremely disappointed with every model not named Opus. However, DeepSeek V4 Flash is delivering near-Opus-level performance. This is something I can actually use.

Things I still don't understand after this whole process:

> How are people actually using Qwen 35B? Not even Sonnet can do the job.

> Do Mac users just say they are using local LLMs without actually doing it? That stuff is unbelievably slow. Heck, even with NVIDIA GPUs, it can be a bit frustrating when doing 1M tokens.

Anyway, thanks, China, for the free LLM. Not sure what they get out of it; I'm running it locally.


Similar Articles

@ciruai: Testing DeepSeek v4 Flash on the AMD Ryzen AI Max+ 395 Strix Halo with 128GB RAM. Getting ~15 TPS over a decently long …

X AI KOLs Timeline

Testing DeepSeek v4 Flash on the AMD Ryzen AI Max+ 395 with 128GB RAM achieves ~15 TPS for a 284B MoE model (13B active) locally, costing $3,000 versus $25,000+ for a datacenter setup, highlighting the feasibility of running large models on consumer hardware.
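As a rough sanity check on the figures quoted above (284B total parameters, 128GB RAM, ~15 TPS), the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate, not a measurement: the bit widths are assumed quantization levels chosen for illustration, the footprint counts weights only (no KV cache, activations, or OS overhead), and the helper names are hypothetical. It also assumes the ~15 TPS rate holds for every token, which may not be true for prompt processing.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def hours_for_tokens(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock hours to work through `tokens` at a constant rate."""
    return tokens / tokens_per_second / 3600


TOTAL_PARAMS_B = 284  # total parameters, from the summary above
RAM_GB = 128          # unified memory, from the summary above
TPS = 15              # reported tokens per second, from the summary above

# Assumed quantization levels, for illustration only.
for bits in (16, 8, 4, 3):
    size = weight_footprint_gb(TOTAL_PARAMS_B, bits)
    print(f"{bits:>2}-bit weights: {size:6.1f} GB, under {RAM_GB} GB: {size < RAM_GB}")

print(f"1M tokens at {TPS} TPS: {hours_for_tokens(1_000_000, TPS):.1f} hours")
```

Under these assumptions, 4 bits per weight already comes to 142 GB, which is more than 128 GB, so a setup like the one described would need a lower average bit width or some form of offloading; the source does not say which was used.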

You can run Deepseek 4 flash on mac (M3 Max, 96gb)

DeepSeek 4 Flash local inference engine for Metal

DeepSeek-V4-Flash means LLM steering is interesting again

Deepseek V4 flash performance on DGX Spark
