Tag
A guide on optimizing VRAM usage on an AMD 7900XTX to run a 27B Qwen model with Q6K quantization and 131k context by compiling llama.cpp with OpenBLAS and CUDA_FA_ALL_QUANTS, and using kvcache quantization at q5_0/q4_0.
MachinaCheck is a multi-agent AI system built on AMD MI300X hardware that automates CNC manufacturability analysis for STEP files using Qwen 2.5 7B models.