Built a tool that tells you exactly which LLMs fit on your GPU. Feedback wanted.

Reddit r/LocalLLaMA 06/12/26, 09:03 PM Tools

llm-compatibility gpu-memory model-rankings quantization local-llm hardware-compatibility

Summary

A tool that estimates which LLMs fit in a user's GPU memory and ranks the models that fit by performance, taking memory constraints and quantization levels into account.

I built [llmjob.com/rankings.html](https://llmjob.com/rankings.html): pick your GPU and it shows which open-weight models actually fit, ranked by quality and context. No more guessing whether a model will fit in your VRAM. I'm looking for feedback on which details are actually useful.

Original Article

Cached at: 06/12/26, 11:02 PM

# LLMJob: What to run on your hardware

Source: [https://llmjob.com/rankings.html](https://llmjob.com/rankings.html)

Pick your hardware. Models are ranked by [Artificial Analysis](https://artificialanalysis.ai/) Intelligence Index v4.0; only configs that actually fit your memory at full native context are shown. Fit is **estimated** from memory; nothing has been benchmarked on the network yet. Scores are full-precision (API-measured).

| # | Model | AA Score | % Frontier | Quant | Context | VRAM |
|---|-------|----------|------------|-------|---------|------|

Ranked by AA score adjusted for estimated quantization loss, at the best quant that fits at full native context. Q2/Q3 rows lose meaningful quality. Models that don't fit aren't shown. Click a row to see every quant that fits.

## Proprietary frontier (API-only)

For context: the closed models you can't run locally. The % frontier column above is relative to Claude Fable 5 at 64.9.

| # | Model | Lab | AA Score | % Frontier |
|---|-------|-----|----------|------------|
| 1 | Claude Fable 5 | Anthropic | 64.9 | 100% |
| 2 | Claude Opus 4.8 | Anthropic | 61.4 | 95% |
| 3 | GPT-5.5 (xhigh) | OpenAI | 60.2 | 93% |
| 4 | GPT-5.5 (high) | OpenAI | 58.9 | 91% |
| 5 | Claude Opus 4.7 | Anthropic | 57.3 | 88% |
| 6 | Gemini 3.1 Pro Preview | Google | 57.2 | 88% |
| 7 | GPT-5.4 (xhigh) | OpenAI | 56.8 | 88% |
| 8 | GPT-5.5 (medium) | OpenAI | 56.7 | 87% |
| 9 | Gemini 3.5 Flash (high) | Google | 55.3 | 85% |
| 10 | Gemini 3.5 Flash (medium) | Google | 54.8 | 84% |

Scores: [Artificial Analysis](https://artificialanalysis.ai/) Intelligence Index v4.0 (10 evals incl. GDPval-AA, Terminal-Bench Hard, SciCode, GPQA Diamond, Humanity's Last Exam). VRAM figures include weights + KV cache at the listed context. Last updated 2026-06-11.
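The page says its VRAM figures cover weights plus KV cache at the listed context, and that % frontier is relative to a score of 64.9. The sketch below shows one common way such an estimate could be computed; it is not the site's actual method, which the page does not describe. `ModelSpec`, the example 8B model, and the flat bits-per-weight values in `QUANTS` are all hypothetical. Real quant formats carry extra overhead, and the page's "estimated quantization loss" adjustment is not modeled here.

```python
from dataclasses import dataclass

FRONTIER_SCORE = 64.9  # reference score quoted on the page (Claude Fable 5)

# Assumed flat bits-per-weight values; real quant formats differ slightly.
QUANTS = {"FP16": 16, "Q8": 8, "Q4": 4}


@dataclass
class ModelSpec:
    """Hypothetical model description used only for this illustration."""
    params_billion: float  # total parameters, in billions
    n_layers: int          # number of transformer layers
    n_kv_heads: int        # key/value heads per layer
    head_dim: int          # dimension of each head
    context_tokens: int    # full native context length


def weights_gib(spec: ModelSpec, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GiB."""
    return spec.params_billion * 1e9 * bits_per_weight / 8 / 2**30


def kv_cache_gib(spec: ModelSpec, bytes_per_value: int = 2) -> float:
    """Memory for the KV cache at full context, in GiB (assumes a 16-bit cache)."""
    # Factor 2 covers keys and values, stored for every layer and token.
    values = 2 * spec.n_layers * spec.n_kv_heads * spec.head_dim * spec.context_tokens
    return values * bytes_per_value / 2**30


def estimated_vram_gib(spec: ModelSpec, bits_per_weight: float) -> float:
    """Weights plus KV cache; ignores activations and runtime overhead."""
    return weights_gib(spec, bits_per_weight) + kv_cache_gib(spec)


def best_fitting_quant(spec: ModelSpec, vram_gib: float, quants=QUANTS):
    """Return the highest-precision quant whose estimate fits, else None."""
    for name, bits in sorted(quants.items(), key=lambda kv: kv[1], reverse=True):
        if estimated_vram_gib(spec, bits) <= vram_gib:
            return name
    return None


def percent_frontier(score: float) -> int:
    """Score as a rounded percentage of the frontier reference score."""
    return round(100 * score / FRONTIER_SCORE)


# Hypothetical 8B model with an 8192-token native context.
example = ModelSpec(params_billion=8, n_layers=32, n_kv_heads=8,
                    head_dim=128, context_tokens=8192)
print(best_fitting_quant(example, vram_gib=12))
print(percent_frontier(61.4))
```

Trying quants from highest to lowest precision mirrors the page's "best quant that fits at full native context" rule. A model with no fitting quant returns `None`, matching the page's choice to hide models that don't fit.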

Similar Articles

@oliviscusAI: Someone just built a tool that tells you exactly which LLMs will run on your hardware. it scans your ram, cpu, and gpu,…

@Sumanth_077: Stop guessing which models fit in your VRAM! llmfit is a CLI tool that auto-detects your hardware and ranks 206 models …

Show HN: Find the best local LLM for your hardware, ranked by benchmarks

@tom_doerr: Runs 70B LLMs on single 4GB GPU https://github.com/lyogavin/airllm

We have built the first-of-its-kind interactive blog for matching open-source LLMs to GPUs.
