@charles_irl: ^That's a sample of CuTe DSL, which is used in, among others, the FlashAttention-4 kernel. Below is the sample CuTe ker…


Summary

A tweet showcasing a sample CuTe DSL kernel that uses layouts to express transposition. CuTe DSL is used in, among others, the FlashAttention-4 kernel.

^That's a sample of CuTe DSL, which is used in, among others, the FlashAttention-4 kernel. Below is the sample CuTe kernel, with a cute trick: using layouts to express transposition. https://modal.com/notebooks/modal-labs/examples/nb-owEUD0kdSVeL4KeEX5sjh1…
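To make the "layouts express transposition" idea concrete, here is a minimal plain-Python sketch. It does not use the CuTe DSL API and is not the kernel from the linked notebook. The `Layout` class, its `transposed` method, and the `materialize` helper are hypothetical names invented for illustration. The only assumption carried over from CuTe is the general idea that a layout is a (shape, stride) pair mapping a logical coordinate to an offset in a flat buffer.

```python
class Layout:
    """Hypothetical stand-in for a CuTe-style layout (not the real API).

    Maps a logical coordinate to an offset in a flat buffer via
    offset = sum(coord[i] * stride[i]).
    """

    def __init__(self, shape, stride):
        self.shape = tuple(shape)
        self.stride = tuple(stride)

    def __call__(self, *coord):
        # Inner product of the coordinate with the strides.
        return sum(c * s for c, s in zip(coord, self.stride))

    def transposed(self):
        # Transposition touches only metadata: reverse shape and stride.
        return Layout(self.shape[::-1], self.stride[::-1])


def materialize(layout, buf):
    """Read a 2D view of `buf` through `layout` into nested lists."""
    rows, cols = layout.shape
    return [[buf[layout(i, j)] for j in range(cols)] for i in range(rows)]


data = list(range(6))               # flat buffer, never modified
row_major = Layout((2, 3), (3, 1))  # 2x3 matrix stored row-major
view_t = row_major.transposed()     # 3x2 view with strides (1, 3)

original = materialize(row_major, data)
transposed = materialize(view_t, data)

print(original)    # [[0, 1, 2], [3, 4, 5]]
print(transposed)  # [[0, 3], [1, 4], [2, 5]]
```

The buffer `data` is never copied or rearranged. Only the coordinate-to-offset mapping changes, which is the sense in which a layout can "express" a transpose.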
