random-fourier-features


Flexformer: Flexible Linear Transformer with Learnable Attention Kernel

arXiv cs.LG · yesterday

Flexformer proposes a flexible linear Transformer with fully learnable attention kernels using random Fourier features, achieving linear complexity while matching or exceeding softmax attention performance on language modeling and sequence classification tasks.
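To make the linear-complexity claim concrete, here is a minimal NumPy sketch of kernelized attention with random Fourier features. It is not Flexformer's implementation: the summary does not say how the kernel is parameterized, so this sketch assumes a fixed Gaussian kernel with frequencies `W` drawn once from a standard normal, whereas a learnable variant would presumably train those parameters. The function names `rff_features`, `linear_attention`, and `quadratic_attention` are hypothetical and chosen only for illustration.

```python
import numpy as np

def rff_features(x, W):
    """Map x of shape (n, d) to random Fourier features of shape (n, 2m).

    With the columns of W drawn from N(0, I), the inner product
    phi(x) . phi(y) approximates the Gaussian kernel exp(-||x - y||^2 / 2).
    """
    proj = x @ W                      # (n, m)
    m = W.shape[1]
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(m)

def linear_attention(Q, K, V, W, eps=1e-6):
    """Kernelized attention in O(n) time with respect to sequence length."""
    phi_q = rff_features(Q, W)        # (n, 2m)
    phi_k = rff_features(K, W)        # (n, 2m)
    kv = phi_k.T @ V                  # (2m, d_v): keys and values summarized once
    z = phi_k.sum(axis=0)             # (2m,): normalizer summary
    num = phi_q @ kv                  # (n, d_v)
    den = phi_q @ z                   # (n,)
    return num / (den[:, None] + eps)

def quadratic_attention(Q, K, V, W, eps=1e-6):
    """Same attention, but forming the full (n, n) score matrix for comparison."""
    phi_q = rff_features(Q, W)
    phi_k = rff_features(K, W)
    A = phi_q @ phi_k.T               # (n, n) approximate kernel matrix
    return (A @ V) / (A.sum(axis=1, keepdims=True) + eps)
```

The two functions return the same result; the linear version only reorders the matrix products so that the n-by-n score matrix is never formed. Note that cosine and sine features can yield negative kernel estimates, so the normalizer is only well behaved when the approximated kernel values stay clearly positive, for example with small-norm queries and keys.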
