@losterror501: with 2dgx sparks getting 25tok/sec with 1 session and it peaks to 152tok/sec with 8 sessions. Actually insane...

X AI KOLs Timeline Models

Summary

Announcement of Qwable-v1, an open-weights model distilled from Claude Fable-5, along with performance benchmarks on 2dgx sparks hardware achieving 25 tok/sec (single session) and 152 tok/sec (8 sessions).

with 2dgx sparks getting 25tok/sec with 1 session and it peaks to 152tok/sec with 8 sessions. Actually insane...
Original Article
View Cached Full Text

Cached at: 06/22/26, 01:41 AM

with 2dgx sparks getting 25tok/sec with 1 session and it peaks to 152tok/sec with 8 sessions. Actually insane…

Taha ז (@lordx64): Releasing Qwable-v1 - an open-weights Qwen3.6-35B-A3B distilled from Claude Fable-5, Anthropic’s Mythos-class preview model that was briefly public for ~4days (2026-06-9 → 2026-06-12) before being suspended globally under U.S. export-control directives.

Fable-5 was Anthropic’s

Similar Articles

DGX Spark agentic usage numbers

Reddit r/LocalLLaMA

A user shares benchmark results and configuration for running Qwen3.6 models on NVIDIA DGX Spark using vLLM, focusing on agentic workloads with concurrent requests and tool calling.