LLM planner - pick a rig for your use-case/model/budget, or pick models for your rig. 60+ builds, 50+ models, 130+ cited t/s sources, 150+ reviewer YouTube videos, idle+active watts, multi-region prices, regular updates.

Reddit r/LocalLLaMA Tools

Summary

A comprehensive web tool and public dataset that helps users choose the right hardware for running LLMs, featuring 60+ builds, 50+ models, performance benchmarks, and reviewer videos, with two-way matching between models and hardware.

TL;DR: Sourced internet info into an LLM model/hardware choice guide. Two directions:

* "What rig should I buy for use-case / model / budget?"
* "I have a 3090 / M3 Max / DGX Spark / Strix Halo / R9700. What runs well on it?"

Plus a side-by-side compare mode for rigs and LLMs. Tokens/sec numbers cite a source; every build links the actual reviewer YouTube videos.

Why I built it: I needed to pick what to buy: 5090, Spark or Strix Halo. Ended up with the Spark made by ASUS. I kept building the site until I didn't need to leave it and go google something.

What's actually in it:

* 60+ specific build configs of all sorts, with a plug-and-play on/off switch and a datacentre on/off switch
* Decode tok/s + prompt-processing tok/s at Q2/Q4/Q5/Q8 per model
* Processing time to the first token for a 100K prompt
* Idle + active power draw in watts
* Used + new prices, multi-region
* 150+ reviewer YouTube videos linked across the builds (so you can watch the review and form your own opinion)
* 130 cited sources across leaderboards, model cards, llama.cpp benchmark threads, Tom's Hardware, and this sub
* Reverse mode: paste hardware -> see open-weights models that fit, ranked across chat/coding/agents/reasoning, with the closed-frontier four (Gemini 3.1 Pro, GPT-5.5, Sonnet 4.6, Opus 4.7) shown as a ceiling reference
* Data is updated at least once a week.

What it does NOT do:

* Give a link to the cheapest price in your region.
* Give the absolute best t/s you can get from your hardware. Mileage may vary based on quant, software, patches and updates.

Link: [https://llmrequirements.com](https://llmrequirements.com)

All the data is exported into a public repo: [https://github.com/Trenin-Labs/LlmRequirements](https://github.com/Trenin-Labs/LlmRequirements)

There's a link on the website to submit a benchmark or report inaccuracies via GitHub issues on this public repo.
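For anyone who wants a feel for the arithmetic behind two of the numbers above (whether a model fits on a rig, and how long a 100K prompt takes to reach the first token), here is a rough Python sketch. It is not taken from the site or its repo: the function names, the bits-per-weight table and the 2 GB overhead are all assumptions made up for illustration, and real quants and runtimes will differ.

```python
# Hypothetical helpers for illustration only; not the site's actual logic.

# ASSUMPTION: rough average bits per weight for each quant level.
# Real quant variants differ, so treat these as ballpark values.
BITS_PER_WEIGHT = {"Q2": 2.6, "Q4": 4.5, "Q5": 5.5, "Q8": 8.5}


def weights_gb(params_billions: float, quant: str) -> float:
    """Estimated size of the weights alone, in GB (10^9 bytes)."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8


def fits(params_billions: float, quant: str, memory_gb: float,
         overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in memory.

    overhead_gb stands in for KV cache and runtime buffers; the 2 GB
    default is a placeholder, not a measured figure.
    """
    return weights_gb(params_billions, quant) + overhead_gb <= memory_gb


def time_to_first_token_s(prompt_tokens: int, pp_tok_per_s: float) -> float:
    """Seconds spent processing the prompt before the first output token."""
    return prompt_tokens / pp_tok_per_s


if __name__ == "__main__":
    # A 70B model at Q4 against a 24 GB card, under the assumptions above.
    print(weights_gb(70, "Q4"), fits(70, "Q4", 24))
    # A 100K-token prompt at an assumed 500 tok/s prompt-processing speed.
    print(time_to_first_token_s(100_000, 500))
```

The point of the sketch is only that "fits" is mostly a memory-size question while time to first token is prompt length divided by prompt-processing speed; the cited benchmark numbers are what you should actually trust.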