0xSero has released new FP8 and NVFP4 quantized versions of the Tencent Hy3-preview model, enabling it to run on 256GB VRAM with full context.
The article discusses the constraints on Tencent's AI capital expenditure caused by NVIDIA chip shortages and the company's recent shift to Kunlun chips, and analyzes its valuation and strategic positioning in the AI landscape.
UniPrefill, a prefill acceleration framework proposed in a research paper, uses block-wise dynamic sparsification for universal long-context processing in LLMs. It integrates with vLLM and achieves up to a 2.1x speedup in time-to-first-token (TTFT) across various model architectures.
Tencent releases Hy3-preview, a 295B-parameter MoE model with 21B active parameters that excels in STEM reasoning, instruction following, coding and agent tasks.
Tencent and Alibaba are reportedly in talks to invest in Chinese AI startup DeepSeek at a valuation exceeding $20 billion.
Tencent releases MegaStyle, a large-scale open-source style transfer model, together with full training and inference code, a 1.4M dataset, and pre-trained models.