@andrewchen: finding the main downside with experimenting with local AI models is that you end up buying one GPU, then another, then…
Summary
Andrew Chen describes how experimenting with local AI models led him to buy one GPU after another. He now runs Qwen3.6 27B dense at 100 tok/s on a 5090 eGPU and says it feels similar to Sonnet 4.6: fast and highly usable.
Cached at: 05/19/26, 02:42 AM
finding the main downside with experimenting with local AI models is that you end up buying one GPU, then another, then another, then another…
But I’m running qwen3.6 27b dense at 100 tok/s now on a 5090 eGPU! It feels like sonnet 4.6? Fast and highly usable
I figure the GPUs I have will now increase in value over the next few years so it’ll all be worth it
Similar Articles
@leopardracer: https://x.com/leopardracer/status/2055341758523883631
A user shares their experience setting up a dual-GPU local AI lab with RTX 4080 Super and 5060 Ti, running Qwen 3.6 models via llama.cpp and llama-swap to reduce API costs and enable unrestricted experimentation.
@TheAhmadOsman: Gentle reminder that all you need to start with Local AI is: - 2x RTX 3090s (pick up for $700-$900 on r/hardwareswap) -…
A reminder that two RTX 3090s are enough to run open-source models like Qwen 3.6 27B or Gemma 4 31B as powerful local AI agents, comparable to Opus 4.5, using tools like Claude Code and self-hosted SearXNG.
@guohao_li: yes, it is definitely time to seriously consider buying more GPUs and start building our own local ai stack. i’m curiou…
A researcher argues that it is time to buy more GPUs and build a local AI stack, citing Qwen 3.5 27B and GLM 5.2 as models that remove the threat of a permanent underclass.
@DeRonin_: My current local AI setup: - 2x DGX Spark linked (256gb) > GLM 5.2 @ 2bit, reasoning + agent loops - Mac Studio M3 Ultr…
A user describes their fully local AI stack using multiple hardware devices running Chinese models like GLM, Qwen, and Kimi, claiming 87% cost savings compared to frontier models like GPT-5.5 and Opus 4.8, while noting plans to self-host video generation.
@davis7: @0xSero helped me setup local models properly and I uh, had no idea these things had gotten this good Are they frontier…
The author highlights the impressive capabilities of the open-source Qwen 3.6-27B model running locally on an RTX 5090, noting its strong performance on programming tasks and comparing it favorably to commercial models, despite the complexity of local deployment.