silx-ai/Quasar-Preview • Hugging Face (5M context length)
Summary
silx-ai released Quasar-Preview, a model with a 5-million-token context length, available on Hugging Face.
Similar Articles
silx-ai/Quasar-Preview
SILX AI releases Quasar-Preview, an 18B parameter MoE foundation model with 2B active parameters and experimental 5M-token context, built on a hybrid recurrent/attention architecture and designed for decentralized training via Bittensor SN24.
mindlab-research/Macaron-V1-Preview-749B • Hugging Face
mindlab-research releases Macaron-V1-Preview-749B, a 749-billion-parameter large language model, available on Hugging Face.
@0xSero: GLM-5.1-478B-NVFP4 Running on: - 4x RTX Pro 6000 - Sglang - 370,000 max tokens (1.75x full context) - p10 27.7 | p90 45…
A quantized 478B-parameter GLM-5.1 model runs on 4×RTX Pro 6000 GPUs via SGLang, delivering 370k-token context at up to 45 tok/s decode and 1340 tok/s prefill, and is demoed driving Figma.
Subquadratic AI introduces SubQ-1.1-Small, a new model using Smart Sparse Attention
Subquadratic AI introduces SubQ-1.1-Small, a model leveraging Smart Sparse Attention to achieve near-perfect long-context retrieval up to 12M tokens with up to 1,000x attention compute reduction. It balances long-context optimization with strong general reasoning, outperforming baselines on benchmarks like NIAH and RULER.
@_akhaliq: LongCat-2.0 dropping on Hugging Face soon
LongCat-2.0, an updated version of the LongCat model, will be released on Hugging Face soon.