Tag
Pull request adds optimized q1_0 dot-product kernels for x86 and generic CPUs to ggml-cpu, improving quantized LLM inference speed.
Research on optimizing 2D graphics rendering on CPUs using sparse strips to improve performance and reduce memory overhead.