Tag: efficient-attention

Cards List
#efficient-attention

Training-Inference Consistent Segmented Execution for Long-Context LLMs

arXiv cs.CL · 12h ago

This paper proposes a training-inference consistent segmented execution framework for long-context LLMs. It addresses the mismatch between full-context training and the restricted regimes used at inference, achieving comparable performance at significantly reduced memory usage.
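The segmented regime can be pictured with a short sketch. Everything below is an illustrative assumption rather than the paper's algorithm: the segment length, window size, and use of PyTorch's scaled_dot_product_attention are placeholders for whatever the framework actually prescribes.

```python
import torch
import torch.nn.functional as F

def segmented_attention(q, k, v, seg_len=512, window=2):
    """q, k, v: (batch, heads, seq, dim). Each segment's queries attend
    causally to keys/values from the current segment plus the previous
    window - 1 segments, bounding memory regardless of sequence length."""
    B, H, T, D = q.shape
    out = torch.empty_like(q)
    for start in range(0, T, seg_len):
        end = min(start + seg_len, T)
        # Keep only the KV entries inside the retained context window.
        ctx_start = max(0, start - (window - 1) * seg_len)
        k_ctx, v_ctx = k[:, :, ctx_start:end], v[:, :, ctx_start:end]
        # Causal mask expressed in absolute positions (True = may attend).
        q_pos = torch.arange(start, end, device=q.device)
        k_pos = torch.arange(ctx_start, end, device=q.device)
        mask = q_pos[:, None] >= k_pos[None, :]
        out[:, :, start:end] = F.scaled_dot_product_attention(
            q[:, :, start:end], k_ctx, v_ctx, attn_mask=mask)
    return out
```

Using one routine like this in both the training forward pass and inference is what would make the two regimes consistent.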

#efficient-attention

@omarsar0: Cool idea from Nous Research. What if you could speed up long-context pretraining with a subquadratic wrapper that you …

X AI KOLs Following · yesterday

Nous Research introduces Lighthouse Attention, a training-only subquadratic wrapper for scaled dot-product attention that accelerates long-context pretraining and can be removed before deployment to preserve vanilla inference efficiency.
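Since the thread is truncated, the mechanism below is only a guess at the general shape of a "removable wrapper": a hypothetical sketch in which pretraining uses a cheap restricted attention pattern (plain sliding-window attention as a stand-in) while deployment drops the wrapper and runs vanilla causal SDPA on the same weights. It is not Lighthouse Attention itself.

```python
import torch
import torch.nn.functional as F

def attention(q, k, v, train_wrapper=False, window=256):
    """q, k, v: (batch, heads, seq, dim)."""
    if train_wrapper:
        # Cheap restricted pattern during pretraining: each query sees only
        # the last `window` keys. A dense mask is used here for brevity; the
        # subquadratic cost only materializes with a block-sparse kernel.
        T, S = q.shape[-2], k.shape[-2]
        pos_q = torch.arange(T, device=q.device)[:, None]
        pos_k = torch.arange(S, device=q.device)[None, :]
        mask = (pos_q >= pos_k) & (pos_q - pos_k < window)
        return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
    # Deployment path: wrapper removed, vanilla causal SDPA on the same weights.
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)
```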

#efficient-attention

Toeplitz MLP Mixers are Low Complexity, Information-Rich Sequence Models

arXiv cs.LG · 2d ago

This paper introduces the Toeplitz MLP Mixer (TMM), an architecture that replaces attention with Toeplitz matrix multiplication, achieving lower computational complexity than attention while remaining information-rich and efficient to train.
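The core trick is that multiplying by a Toeplitz matrix is a 1-D convolution, so token mixing costs O(T log T) via FFT instead of attention's O(T^2). The sketch below shows only that mixing step under assumed shapes; the full TMM block structure is not described in the summary.

```python
import torch

def toeplitz_mix(x, kernel):
    """x: (batch, seq, dim); kernel: (2*seq - 1,) Toeplitz coefficients
    t_{-(T-1)}, ..., t_0, ..., t_{T-1}, so that the output satisfies
    y[i] = sum_j t_{i-j} * x[j] (a Toeplitz matrix-vector product)."""
    B, T, D = x.shape
    n = 2 * T - 1  # FFT length; wrap-around never touches the slice we keep
    # Realize the Toeplitz multiply as a convolution in the frequency domain.
    k_f = torch.fft.rfft(kernel, n=n)
    x_f = torch.fft.rfft(x, n=n, dim=1)
    y = torch.fft.irfft(x_f * k_f[None, :, None], n=n, dim=1)
    # Indices T-1 .. 2T-2 of the convolution are exactly the T outputs.
    return y[:, T - 1:2 * T - 1]

# Example: mix a batch of 2 sequences of length 128 with a shared kernel.
# y = toeplitz_mix(torch.randn(2, 128, 64), torch.randn(2 * 128 - 1))
```

Here one kernel is shared across channels for brevity; a per-channel (or learned) kernel is the natural generalization.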
