Tag
A user describes a fully local AI stack spread across multiple hardware devices and running Chinese models such as GLM, Qwen, and Kimi. They claim 87% cost savings compared with frontier models such as GPT-5.5 and Opus 4.8, and note plans to self-host video generation as well.
A user discusses running a Q6-quantized version of the Gemma 4 31B model for local inference on a dual 9060 XT GPU configuration.
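As a rough sanity check on whether a Q6 quant of a 31B model fits on two cards, the weight footprint can be estimated from parameter count and bits per weight. This is only a sketch: the post summary says "Q6" without naming the exact format, so the 6.5625 bits per weight used here assumes llama.cpp's Q6_K, and the 16 GB per card assumes the larger-memory variant of the 9060 XT. The function name and constants are illustrative, and the estimate ignores KV cache, activations, and runtime overhead.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# "31B" comes from the post summary; everything else below is an assumption.
N_PARAMS = 31e9
BITS_PER_WEIGHT = 6.5625   # assumption: llama.cpp Q6_K
VRAM_PER_GPU_GIB = 16      # assumption: 16 GB card variant, treated as GiB
N_GPUS = 2

weights_gib = weight_memory_gib(N_PARAMS, BITS_PER_WEIGHT)
total_vram_gib = VRAM_PER_GPU_GIB * N_GPUS
headroom_gib = total_vram_gib - weights_gib

print(f"weights:  {weights_gib:.1f} GiB")
print(f"VRAM:     {total_vram_gib} GiB across {N_GPUS} GPUs")
print(f"headroom: {headroom_gib:.1f} GiB for KV cache and overhead")
```

Under these assumptions the weights alone take up most of the combined memory, so the usable context length would depend on how much of the remaining headroom the KV cache needs.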