lecture-notes

#lecture-notes

@ickma2311: Efficient AI Lecture 12: Transformer and LLM This lecture is not only about how LLMs work. It also explains the buildin…

X AI KOLs Timeline ↗ · 18h ago Cached

Lecture notes from an Efficient AI course covering Transformer and LLM fundamentals, including multi-head attention, positional encoding, KV cache, and the connection between model architecture and inference efficiency. The content explains how design choices in transformers affect memory, latency, and hardware efficiency.

0 favorites 0 likes

lecture-notes

@ickma2311: Efficient AI Lecture 12: Transformer and LLM This lecture is not only about how LLMs work. It also explains the buildin…

Submit Feedback