@maximelabonne: Fun surprise: DeepSeek used my open-perfectblend dataset to train their new DSpark drafter. Time to promote it again! It…
Summary
DeepSeek used the open-perfectblend dataset to train their new DSpark drafter; the dataset is an open-source reproduction of 'The Perfect Blend' paper providing over 1 million diverse prompts in math, chat, and code.
Cached at: 06/27/26, 03:58 PM
Fun surprise: DeepSeek used my open-perfectblend dataset to train their new DSpark drafter.
Time to promote it again! It’s an open-source reproduction of “The Perfect Blend” paper.
If you ever need >1M diverse prompts in math, chat, and code, it does the job. https://t.co/eWrwoGCqSI