throughput

#throughput

@msimoni: One thing I keep thinking about: with S3-like object storage as primitive, you could build a transactional database wit…

X AI KOLs Timeline · 6d ago

A tweet discusses the idea of building a transactional database with infinite throughput using S3-like object storage and content-addressing, where blocks are written in parallel and the root hash is updated periodically.
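The write path described in the summary can be sketched roughly as follows. This is a minimal illustration, not the tweet author's design: `ObjectStore` is a hypothetical in-memory stand-in for an S3-like store, and the flat manifest of block hashes is an assumed simplification of whatever tree structure a real system would use.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


class ObjectStore:
    """Hypothetical in-memory stand-in for an S3-like store (not a real S3 client)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Content addressing: the key is the SHA-256 digest of the bytes,
        # so writing the same block twice is idempotent.
        key = hashlib.sha256(data).hexdigest()
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


def commit(store: ObjectStore, blocks: list[bytes]) -> str:
    """Write blocks in parallel, then store a manifest and return its hash as the new root."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        # pool.map preserves input order, so the manifest lists blocks in order.
        keys = list(pool.map(store.put, blocks))
    # Assumed simplification: the manifest is a newline-joined list of block hashes.
    manifest = "\n".join(keys).encode()
    return store.put(manifest)


store = ObjectStore()
root = commit(store, [b"row-1", b"row-2", b"row-3"])

# Reading back: follow the root hash to the manifest, then to each block.
block_keys = store.get(root).decode().split("\n")
restored = [store.get(k) for k in block_keys]
```

Because block keys depend only on content, parallel writers never conflict on block writes; in this sketch the only point that would need coordination is publishing the new root hash.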

#throughput

@rohanpaul_ai: Amazon unveiled “Resilient Network Graphs,” (RNG) a data center network that reduces hardware needs by 69% and raises t…

X AI KOLs Following · 2026-05-30 Cached

Amazon unveiled 'Resilient Network Graphs' (RNG), a data center network design that reduces hardware needs by 69% and increases throughput by 33%, now default for most AWS workloads after quiet deployment since last year.

#throughput

Qwen 3.6 benchmarks on 2x RTX PRO 6000

Reddit r/LocalLLaMA · 2026-05-25

Benchmarks for the Qwen 3.6 27B and 35B models on dual RTX PRO 6000 GPUs using vLLM, showing generation throughput of up to 3,500 tokens per second.

#throughput

Can a 5090 with qwen3.6 achieve > 3,000 tok/s ? bring your pitchforks (open-dllm)

Reddit r/LocalLLaMA · 2026-05-16

Open-dLLM adapts Qwen3.6 to use diffusion-based generation, achieving over 3,000 tok/s on an RTX 5090 for short sequences, with code released on GitHub.

#throughput

@QuixiAI: @Kimi_Moonshot K2.6 running on my mi300x, 56 tps (single request). I will run a throughput test

X AI KOLs Following · 2026-04-21 Cached

Kimi K2.6 runs at 56 tokens per second for a single request on an MI300X; the user plans to follow up with a throughput test.

#throughput

@sanbuphy: K2.6 successfully downloaded and deployed the Qwen3.5-0.8B model locally on a Mac, using the niche Zig language to implement and optimize inference, demonstrating the new model’s generalization ability. After 4,000+ tool calls and 12+ hours of continuous operation, K2.6 iterated 14 times…

X AI KOLs Timeline · 2026-04-21 Cached

K2.6 downloaded and deployed the Qwen3.5-0.8B model locally on a Mac, implementing and optimizing inference in Zig, a niche language, which demonstrates the new model's generalization ability. Over 4,000+ tool calls and 12+ hours of continuous operation, K2.6 iterated 14 times, raising throughput from ~15 tokens/s to ~193 tokens/s and ultimately achieving 20% faster inference than LM Studio.
