dspark

#dspark

@karminski3: DeepSeek truly excels in both cost-effectiveness and technology... Some people don't understand what DSpark is, so here's a quick tutorial. Speculative decoding is a technique for speeding up the output of large models. The core idea is to have a small model draft text for the large model to verify. Because currently...

X AI KOLs Timeline ↗ · 17h ago Cached

DeepSeek proposes the DSpark technique, which implements speculative decoding by inserting a mini Transformer after the Final RMSNorm, boosting large model output speed by 60%-85%.
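The draft-then-verify idea described in the post can be illustrated with a minimal sketch of generic greedy speculative decoding. This is not DSpark's actual design (the posts here give no implementation details beyond a mini Transformer placed after the final RMSNorm); `toy_target`, `toy_draft`, and `speculative_decode` are hypothetical names, and both "models" are toy functions standing in for real networks.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # maps a context to its greedy next token


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical stand-in for the large model (deterministic toy rule)."""
    return (ctx[-1] * 3 + 1) % 17


def toy_draft(ctx: List[Token]) -> Token:
    """Hypothetical stand-in for the small drafter: agrees with the target
    except when the context length is a multiple of 3."""
    tok = toy_target(ctx)
    return (tok + 1) % 17 if len(ctx) % 3 == 0 else tok


def greedy_decode(model: NextToken, prompt: List[Token], num_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(num_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[Token],
                       num_new: int, k: int = 4) -> Tuple[List[Token], int]:
    """Return (tokens, number of verification rounds by the target)."""
    out = list(prompt)
    goal = len(prompt) + num_new
    rounds = 0
    while len(out) < goal:
        # 1. The drafter proposes k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The target checks the proposal. A real engine scores all k
        #    positions in one batched forward pass; this loop simulates that.
        rounds += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # keep the target's own token, drop the rest
                break
            accepted.append(tok)
        else:
            # Every drafted token matched, so the same pass yields one extra token.
            accepted.append(target(out + accepted))
        out.extend(accepted)
    return out[:goal], rounds


tokens, rounds = speculative_decode(toy_target, toy_draft, [1], num_new=20)
baseline = greedy_decode(toy_target, [1], num_new=20)
print(tokens == baseline, rounds)
```

In this sketch the output matches plain greedy decoding with the target, because every emitted token is either confirmed or supplied by the target. Any speedup comes from needing fewer target rounds than generated tokens, which depends on how often the drafter agrees with the target.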

@Hikari_07_jp: Progress report! Training of the DFlash backbone and markov head is complete, enabling DSpark to be used on 27B. We wil…

X AI KOLs Timeline ↗ · yesterday Cached

Progress update on DSpark: training of the DFlash backbone and Markov head is complete, enabling DSpark to be used on 27B. The next step is training the confidence head for adaptive drafting, which is expected to give an 8-14% speed improvement over DFlash.

@Michaelzsguo: This is the best read on DeepSeek’s recent innovation, DSpark: Think of DSpark as: The main model rapidly brainstorms t…

X AI KOLs Timeline ↗ · yesterday Cached

DeepSeek released DSpark, a system in which the main model rapidly generates a sentence while a tiny editor fixes coherence before verification. The post presents it as progress in LLM systems engineering rather than a new model architecture.

@SuJinYan123: Just 6 hours after DeepSeek open-sourced the Qwen DSpark weights, OpenInfer already has DSpark support running on RTX 5…

X AI KOLs Timeline ↗ · yesterday Cached

OpenInfer, a pure Rust+CUDA LLM inference engine, quickly added support for DeepSeek's DSpark speculative decoding technique on an RTX 5090. It reaches nearly 500 tok/s per user and scales to ~2.4K aggregate tok/s, outperforming DFlash on non-random workloads.

@maximelabonne: Fun surprise: DeepSeek used my open-perfectblend dataset to train their new DSpark drafter Time to promote it again! It…

X AI KOLs Following ↗ · 3d ago Cached

DeepSeek used the open-perfectblend dataset to train their new DSpark drafter; the dataset is an open-source reproduction of 'The Perfect Blend' paper providing over 1 million diverse prompts in math, chat, and code.

@danielhanchen: DeepSeek just released DSpark for V4 Flash & Pro, a new speculative decoding method boosting throughput by 51% to 400%!…

X AI KOLs Timeline ↗ · 3d ago Cached

DeepSeek released DSpark, a speculative decoding method that boosts throughput by 51% to 400% for V4 Flash & Pro, along with the open-source DeepSpec codebase for training and evaluating draft models.
