Why can't people just run Gemini and Claude Code using their own GPUs?

Reddit r/artificial 05/23/26, 10:53 AM News

gpu compute-cost ai-limitations gemini claude local-inference

Summary

A commentary questioning why users cannot run Gemini and Claude Code locally on their own GPUs, implying compute cost constraints are limiting access to these AI models.

It looks like Gemini and Claude Code have been either heavily downgraded or limited, due to a lack of compute or its high cost. Why can't people and engineers run the AIs on their own GPUs that are sitting idle in their PCs?


Similar Articles

Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?

Hacker News Top

A Hacker News discussion explores whether developers can replace cloud AI models like Claude with local models for daily coding. Participants share experiences, noting that local models (e.g., Qwen, Gemma) are viable for hobbyists but still lag behind top cloud models for professional use.

You don't need a GPU to run gemma-4-26B-A4B

Reddit r/LocalLLaMA

The author demonstrates that the Gemma-4-26B-A4B model runs efficiently on a CPU-only system using Koboldcpp, achieving 7 tokens per second on an old desktop, suggesting that powerful GPUs may not be necessary for local LLM inference.

What impedes apps using AI to make the user’s device the server running a local LLM?

Reddit r/singularity

A user reflects on why more apps don’t run local LLMs directly on phones, noting Gemma 2-4B models already work offline and could eliminate server costs while maintaining near-GPT-4o quality.

The GPUless Revolution: How Efficient AI Models Are Democratizing Artificial Intelligence

Reddit r/AI_Agents

A quiet revolution is making powerful AI models runnable on consumer hardware without expensive GPUs, thanks to breakthroughs in quantization and optimized implementations like llama.cpp's Gemma4 MTP support, democratizing access for hobbyists, small businesses, and edge computing.

@GergelyOrosz: A few days ago, Steve posted about how AI usage at Google is surprisingly low, in good part because Gemini is ju…

X AI KOLs Following

Internal resistance and policy restrictions limit Google's adoption of its own Gemini model, with employees preferring disallowed tools like Claude Code.