@axiaisacat: Redis creator antirez drops another hardcore project: ds4. Not just another GGUF runner, but a local inference engine specifically written for DeepSeek V4 Flash: Metal / CUDA 2-bit quantization 1M context KV ...
Summary
Redis creator antirez released ds4, a local inference engine written specifically for DeepSeek V4 Flash. It runs on Metal and CUDA, uses 2-bit quantization, supports a 1M context, and can offload the KV cache to disk.
Cached at: 05/14/26, 04:40 PM
Redis author antirez has dropped another hardcore project: ds4.
Not just another GGUF runner, but a local inference engine specifically written for DeepSeek V4 Flash:
- Metal / CUDA
- 2-bit quantization
- 1M context
- KV cache can be offloaded to disk
- Targeting high-end Mac and DGX Spark
The focus is not merely on getting the model to run, but on making a local large model part of a complete, usable engineering workflow.
Local AI https://t.co/c2TFtHfQwX
Similar Articles
@VincentLogic: Discovered an amazing open-source project! Redis creator antirez made a splash! ds4 — DeepSeek V4 Flash local inference engine, optimized for Mac Metal, topping GitHub charts for days! And here's the killer part: 128GB…
Redis creator antirez released an open-source project called ds4, a DeepSeek V4 Flash local inference engine optimized for Mac Metal, featuring disk KV caching, ultra-long context, and excellent performance.
@ttasanen: Just fired up DS4 by @antirez on my Mac Studio M3 Ultra 256GB and man, it’s seriously impressive. A clean, purpose-buil…
DS4 is a specialized inference engine by antirez designed to run DeepSeek V4 Flash locally on high-end Mac hardware, featuring optimized KV cache handling and 1M context support.
antirez/deepseek-v4-gguf
Antirez released GGUF quantizations of DeepSeek V4 Flash specifically tailored for the DS4 inference engine, providing optimized configurations for different RAM sizes and enabling local execution of the large MoE model.
DeepSeek 4 Flash local inference engine for Metal
ds4 is a native local inference engine for DeepSeek V4 Flash optimized for Apple Silicon, featuring disk-based KV cache persistence and Metal acceleration.
A few words on DS4
Antirez announces DwarfStar 4 (DS4), a local AI tool that runs DeepSeek V4 Flash with asymmetric 2/8-bit quantization on high-end consumer hardware, achieving near-frontier performance. He discusses the project's rapid popularity, future plans for model updates and distributed inference, and the significance of local AI for serious tasks.