248k-context

#248k-context

@leopardracer: GEMMA 4 26B ON AN RTX 4060 WITH A 248K TOKEN CONTEXT WINDOW 20 tokens per second and a context window so large you can …

X AI KOLs Timeline ↗ · 3d ago Cached

Gemma 4 26B runs on an RTX 4060 with 248K token context at 20 tokens per second using llama.cpp and Q4_K_XL quantization, enabling local processing of entire codebases on consumer hardware.

0 favorites 0 likes

248k-context

@leopardracer: GEMMA 4 26B ON AN RTX 4060 WITH A 248K TOKEN CONTEXT WINDOW 20 tokens per second and a context window so large you can …

Submit Feedback