Built a tool that tells you exactly which LLMs fit on your GPU. Feedback wanted.
Summary
A tool that estimates which LLMs fit in a user's GPU memory and ranks them by performance, taking memory constraints and quantization levels into account.
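As a rough illustration of the kind of estimate described above, here is a minimal sketch in Python. It is not the tool's actual method: the bits-per-weight table, the 20% overhead factor, the `Model` fields, and the model names and scores are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Assumed bits per weight for each quantization level, highest precision first.
BITS_PER_WEIGHT = {"fp16": 16.0, "int8": 8.0, "int4": 4.0}

# Assumed 20% headroom on top of the weights (KV cache, activations, runtime).
OVERHEAD = 1.2


@dataclass
class Model:
    name: str               # hypothetical model name
    params_billions: float  # parameter count, in billions
    score: float            # hypothetical benchmark score, higher is better


def estimate_vram_gb(params_billions: float, quant: str) -> float:
    """Estimate memory in GB: parameters x bytes per weight x overhead."""
    bytes_per_weight = BITS_PER_WEIGHT[quant] / 8
    return params_billions * bytes_per_weight * OVERHEAD


def best_quant(model: Model, vram_gb: float):
    """Return the highest-precision quantization that fits, or None."""
    for quant in BITS_PER_WEIGHT:  # dicts preserve insertion order
        if estimate_vram_gb(model.params_billions, quant) <= vram_gb:
            return quant
    return None


def rank_models(models, vram_gb: float):
    """Keep models that fit at some quantization, sorted by score descending."""
    fitting = []
    for m in models:
        quant = best_quant(m, vram_gb)
        if quant is not None:
            fitting.append((m.name, quant, estimate_vram_gb(m.params_billions, quant), m.score))
    return sorted(fitting, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    catalog = [
        Model("small-7b", 7, 60.0),
        Model("mid-13b", 13, 65.0),
        Model("big-70b", 70, 80.0),
    ]
    for name, quant, gb, score in rank_models(catalog, vram_gb=24.0):
        print(f"{name}: {quant}, ~{gb:.1f} GB, score {score}")
```

With 24 GB of memory and these made-up numbers, the 70B model is dropped even at 4-bit, the 13B model fits only at 8-bit, and the 7B model fits at fp16. A real estimator would also need to account for context length and the specific quantization format.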
Cached at: 06/12/26, 11:02 PM
Similar Articles
@oliviscusAI: Someone just built a tool that tells you exactly which LLMs will run on your hardware. it scans your ram, cpu, and gpu,…
A new tool has been released that scans a user's hardware specifications (RAM, CPU, GPU) to determine which Large Language Models can run locally, ranking them by performance metrics.
@Sumanth_077: Stop guessing which models fit in your VRAM! llmfit is a CLI tool that auto-detects your hardware and ranks 206 models …
llmfit is an open-source CLI tool that detects your hardware and ranks over 200 LLMs by which ones will actually run on your system, automatically choosing the best quantization that fits.
Show HN: Find the best local LLM for your hardware, ranked by benchmarks
whichllm is an open-source Python tool that auto-detects your GPU/CPU/RAM and ranks the best local LLMs from HuggingFace that fit your system, using real benchmarks rather than size heuristics.
@tom_doerr: Runs 70B LLMs on single 4GB GPU https://github.com/lyogavin/airllm
AirLLM is an open-source tool that optimizes inference memory usage, enabling 70B LLMs to run on a single 4GB GPU without quantization, and supports 405B models on 8GB VRAM.
We have built the first-of-its-kind interactive blog for matching open-source LLMs to GPUs.
AgentSwarms launched an interactive, gamified blog that helps users match open-source LLMs to the right GPU by calculating VRAM requirements based on model size and quantization, turning infrastructure planning into an engaging experience.