Tag
A daily GPU programming thread highlights a talk by Charles_irl that reverse-engineers FlashAttention4 code before the paper release, praising the Modal team's deep code dissection and inferences about the forward pass.