The GitHub repository club-3090 provides recipes and configs for serving large language models locally on RTX 3090 GPUs. It supports multiple engines and quantization methods such as Dflash and TurboQuant, including newly unlocked Q5 quants.