Tag
MoonshotAI released FlashKDA, open-source CUTLASS kernels for Kimi Delta Attention that deliver up to 2.22x speedup over Triton on H20 GPUs.