Tag
Improves prefill speeds for k-quants and refactors matrix multiplication for Q4/Q5/Q8 and k-quants in llama.cpp's WebGPU backend.