Tag
Analysis of DeepSeek V4's top coding scores versus its reported 8-month gap behind the frontier, highlighting differences between narrow benchmark optimization and broader reasoning tests, plus the practical performance hit when running quantized local versions.
A visit to DeepSeek's headquarters reveals its modest origins, young team, and unique culture. The company, operated out of a hedge fund, focuses on staying small and remains unconcerned about AGI risks, instead prioritizing societal concerns like job loss.
This article explains how to use GRPO to fine-tune an LLM (Qwen3-8B) for reliable JSON structured output, improving schema accuracy from 62% to 82%, surpassing GPT-4.1's 58%.
This paper investigates instruction finetuning of DeepSeek-R1-8B using LoRA and NEFTune for financial named-entity recognition, achieving a micro-F1 of 0.912 and outperforming several baseline models.
AI Gateway's May 2026 data shows DeepSeek's token share surged to 17% with minimal spend, while Anthropic retained 65% of spend, indicating cost-conscious routing and growing overall usage.
Introduces FlashMemory DeepSeek-V4 Retriever, a lightweight model that sparsifies DeepSeek-V4's CSA KV-cache by predicting which chunks will be attended to next, keeping only ~10-15% on-device while matching full-attention performance.
The article argues that the AI startup wave is unsustainable because intelligence is an infinitely replicable commodity with zero marginal cost, and most AI companies will collapse by 2029, leaving only a few giants owning the physical layer like energy and chips.
Huawei has open-sourced its CANN software toolkit to compete with Nvidia's CUDA, and DeepSeek V4 shows significant inference performance improvements on Huawei Ascend chips.
This blog post provides tips and benchmarks for achieving nearly 200 tokens per second inference on DeepSeek V4 Flash using vLLM on a dual GH200 workstation, highlighting the use of a quantized checkpoint from Canada-Quant and tensor parallelism optimizations.
MiniMax's price increases and model limitations are driving users away to competitors like DeepSeek and premium options like Claude or ChatGPT, reversing its earlier reputation as a cheap, usable daily driver.
DeepSeek V4 Pro reportedly outperforms GPT-5.5 Pro on precision, suggesting a significant advancement in model accuracy.
A practical write-up on multi-agent AI collaboration, proposing a hierarchical strategy that uses Opus 4.8 for planning and DeepSeek/Gemma for execution, achieving a 10x cost reduction and 2x speed improvement, with an open-source implementation.
DeepSeek V4's "Think Max" mode essentially just adds a prompt prefix requiring step-by-step reasoning, sparking debate on the origin of reasoning ability.
A comparison of four frontier AI models (Nemotron 3 Ultra, DeepSeek V4, MiniMax M3, Qwen 3.7 Max) on the same two prompts, with full results linked.
DeepSeek V4 Pro, a 1.6-trillion-parameter model, is running via SSD streaming on a 128GB MacBook with an M5 Max, demonstrating local inference of a massive model.
A GitHub repository forwards Claude Code requests to 10 free providers such as DeepSeek and Kimi, letting users run Claude Code for free indefinitely. Setup takes only five minutes, and over 20,000 developers are already using it.
A discussion comparing DeepSeek V4 Pro, MiMo-V2.5-Pro, and MiniMax M3 for best value in local or OpenRouter use, focusing on agentic and coding tasks and mentioning Hermes Agent and Qwen 3.6 variants.
A brief opinion stating that Moonshot and DeepSeek are the top-tier Chinese AI labs, far ahead of others.
Chinese AI models like DeepSeek and Qwen deliver competitive performance at 5x–20x lower cost than Western counterparts, reshaping the economics of AI and driving multi-model deployment strategies.
Neo Research (新衡), Asia's first independent frontier AI safety evaluation lab, announces its first report: a safety evaluation of DeepSeek V4 Pro.