This article covers a new paper on Elastic Attention Cores for Vision Transformers, which proposes a core-periphery block-sparse attention structure that improves scalability and accuracy over dense self-attention baselines such as DINOv3.
This paper proposes a meta-control architecture using temporal self-attention for adaptive control of Euler-Lagrange systems with unobservable memory states. It demonstrates improved tracking performance over baseline methods on a 2-DOF manipulator while identifying failure modes in long-memory regimes.
Lecture notes from an Efficient AI course covering Transformer and LLM fundamentals, including multi-head attention, positional encoding, KV caching, and the connection between model architecture and inference efficiency. The content explains how design choices in transformers affect memory, latency, and hardware efficiency.
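As a rough illustration of the KV-cache idea mentioned above (not code from the lecture notes themselves), the sketch below shows a toy single-head decoding loop in which keys and values of past tokens are stored once and reused, trading extra memory for avoided recomputation at each step. All names (`KVCache`, `attention`, the projection matrices) are hypothetical and chosen for this example only.

```python
import numpy as np

def attention(q, K, V):
    # Scaled dot-product attention for one query over all cached keys/values.
    scores = q @ K.T / np.sqrt(q.shape[-1])   # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V                        # (d,)

class KVCache:
    """Toy KV cache: past keys/values are appended once and reused,
    so each decoding step only projects the newest token."""
    def __init__(self):
        self.K, self.V = [], []

    def append(self, k, v):
        self.K.append(k)
        self.V.append(v)

    def as_arrays(self):
        return np.stack(self.K), np.stack(self.V)

# Toy autoregressive decoding loop.
d = 8
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
cache = KVCache()
x = rng.normal(size=d)                 # embedding of the first token
for step in range(4):
    q, k, v = Wq @ x, Wk @ x, Wv @ x   # project only the current token
    cache.append(k, v)
    K, V = cache.as_arrays()
    out = attention(q, K, V)           # attend over every cached position
    x = out                            # feed the output back as the next "token"
```

The cache grows linearly with sequence length, which is why KV-cache memory, rather than raw compute, often dominates inference cost for long contexts.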