flash-attention-2

Anyone using Flash Attention 2 (ai-bond) on their V100's? How is the performance?

Reddit r/LocalLLaMA · 2026-05-29

A user benchmarks a V100-compatible port of Flash Attention 2, reporting 3x-17x speedups and up to 94% memory reduction over default PyTorch attention.
