Why can't people just run Gemini and Claude Code using their own GPUs?

Reddit r/artificial News

Summary

A commentary questioning why users cannot run Gemini and Claude Code locally on their own GPUs, implying compute cost constraints are limiting access to these AI models.

It looks like Gemini and Claude Code have been either heavily downgraded or limited, due to a lack of compute or its high cost. Why can't people and engineers run these AIs on their own GPUs that are sitting idle in their PCs?

Similar Articles

Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?

Hacker News Top

A Hacker News discussion explores whether developers can replace cloud AI models like Claude with local models for daily coding. Participants share experiences, noting that local models (e.g., Qwen, Gemma) are viable for hobbyists but still lag behind top cloud models for professional use.

You don't need a GPU to run gemma-4-26B-A4B

Reddit r/LocalLLaMA

The author demonstrates that the Gemma-4-26B-A4B model runs efficiently on a CPU-only system using KoboldCpp, achieving 7 tokens per second on an old desktop, suggesting that powerful GPUs may not be necessary for local LLM inference.