power-efficiency

Tiny LLM Benchmark: Jetson Orin Nano Super 8GB - Four Power Modes × Eight Models

Reddit r/LocalLLaMA · 2d ago

A deep benchmark of eight tiny LLMs (135M to 1B parameters) on a $250 Jetson Orin Nano Super across four power modes finds the 25W mode to be Pareto-optimal, with SmolLM2-135M reaching 165.1 tok/s and the best efficiency.

Small comparison on full compute performance (Anima) of 5090 (600, 475 and 400W) vs 6000 PRO MaxQ (325W), and 6000 PRO WS/SE (600W).

Reddit r/LocalLLaMA · 2026-05-26

A user benchmarks RTX 5090 and RTX 6000 PRO GPUs for AI diffusion tasks, comparing performance at different power limits and showing tradeoffs between speed and power consumption.

Finding the 4x 3090 Sweet Spot

Reddit r/LocalLLaMA · 2026-05-15

A user shares power limit testing on a 4x RTX 3090 setup running Qwen3.6-27B with vLLM, finding 220W as the sweet spot for peak efficiency with minimal throughput loss.

[Benchmark] 5090RTX: Prompt Parsing, Token Generation and Power Level

Reddit r/LocalLLaMA · 2026-05-14

A user benchmarks the Nvidia RTX 5090 for LLM inference with llama.cpp, measuring prompt processing and token generation at various power levels. Prompt processing proves more sensitive to power limits than token generation, and the post notes differences from the RTX 4090.

Dual DGX Spark (Asus GX10) MiniMax M2.7 results

Reddit r/LocalLLaMA · 2026-04-21

A user benchmarks dual Asus GX10 (DGX Spark) units running MiniMax-M2.7-AWQ-4bit, achieving 30–40 tokens/s while drawing only ~100 W each, replacing noisy multi-GPU rigs.
