This pull request adds a fast Walsh-Hadamard transform (FWHT) implementation to the CUDA backend of llama.cpp, a popular open-source LLM inference engine. The change is intended to speed up operations that use this transform on NVIDIA GPUs.
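For readers unfamiliar with the transform, here is a minimal CPU reference sketch of an in-place, unnormalized FWHT in natural (Hadamard) ordering. It is not the kernel from this pull request: the function name `fwht`, the use of `std::vector<float>`, and the lack of normalization are all assumptions made for illustration. It only shows the butterfly structure that brings the cost down from O(n^2) for a dense Hadamard matrix multiply to O(n log n).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper, not part of llama.cpp.
// In-place, unnormalized fast Walsh-Hadamard transform.
// Assumes data.size() is a nonzero power of two.
void fwht(std::vector<float>& data) {
    const std::size_t n = data.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    // Each pass doubles the butterfly span h: 1, 2, 4, ..., n/2.
    for (std::size_t h = 1; h < n; h *= 2) {
        // Walk the array in blocks of 2*h elements.
        for (std::size_t i = 0; i < n; i += 2 * h) {
            // Combine element j with its partner j + h.
            for (std::size_t j = i; j < i + h; ++j) {
                const float a = data[j];
                const float b = data[j + h];
                data[j]     = a + b;  // sum goes to the lower half
                data[j + h] = a - b;  // difference goes to the upper half
            }
        }
    }
}
```

Calling `fwht` on `{1, 0, 1, 0, 0, 1, 1, 0}` yields `{4, 2, 0, -2, 0, 2, 0, 2}`, and applying it a second time returns the input scaled by n = 8, since the unnormalized transform is its own inverse up to that factor. Within one pass the butterflies are independent of each other, which is what makes the algorithm a natural fit for parallel GPU execution; how the actual CUDA implementation distributes that work is not covered by this sketch.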