Tag
This post presents benchmark results and tuning parameters for running DiffusionGemma 26B A4B GGUF models on an RTX 5090 GPU, showing a speedup of up to 44% from optimized temperature settings and quantization choices.
A discussion comparing the Gemma4 12b and 26a4b variants on creative tasks such as writing and chatting.
Super Gemma 4 26B Uncensored GGUF v2 is a community fine-tuned model that offers uncensored responses with zero refusals, improved speed, and fixed tool-calling, and is optimized for local inference with llama.cpp and vLLM.