Sebastian Raschka reviews recent innovations in LLM architectures focused on long-context efficiency, including KV sharing, compressed convolutional attention, and layer-wise attention budgeting from models like Gemma 4, ZAYA1, Laguna XS.2, and DeepSeek V4.
This paper proposes a two-dimensional classification framework for AI agent design patterns, organized along a cognitive-function axis and an execution-topology axis; it identifies 27 named patterns and derives empirical laws from cross-domain analysis.