Tag
A tweet criticizes current LLM architectures for wasteful recomputation caused by order-dependent context, and proposes encoding each context unit separately so that caching is order-invariant and both caching and generation become more efficient.
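
The caching argument can be illustrated with a toy sketch. It is not the tweet's method, whose details are not given here, and it does not model attention or positional encoding. It only contrasts two cache-key schemes: one keyed on the whole preceding prefix (order-dependent) and one keyed on each context unit alone (order-invariant). The names `PrefixCache`, `UnitCache`, `encode_context`, and the placeholder `enc(...)` strings are hypothetical and stand in for real model state.

```python
import hashlib
from typing import Dict, List


def _key(text: str) -> str:
    """Stable cache key for a piece of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class PrefixCache:
    """Order-dependent cache: an entry is reusable only if every
    preceding unit is identical and in the same order."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}
        self.encode_calls = 0

    def encode_context(self, units: List[str]) -> List[str]:
        out = []
        prefix = ""
        for unit in units:
            # The key chains in everything seen so far.
            prefix = _key(prefix + "\x00" + unit)
            if prefix not in self.store:
                self.encode_calls += 1  # stands in for an expensive encode
                self.store[prefix] = f"enc({unit})"
            out.append(self.store[prefix])
        return out


class UnitCache:
    """Order-invariant cache: each unit is keyed on its own content,
    under the assumption that units can be encoded independently."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}
        self.encode_calls = 0

    def encode_context(self, units: List[str]) -> List[str]:
        out = []
        for unit in units:
            k = _key(unit)
            if k not in self.store:
                self.encode_calls += 1  # stands in for an expensive encode
                self.store[k] = f"enc({unit})"
            out.append(self.store[k])
        return out


def count_encodes(cache, orderings: List[List[str]]) -> int:
    """Run several orderings of the same units and return total encode calls."""
    for units in orderings:
        cache.encode_context(units)
    return cache.encode_calls
```

Under these assumptions, presenting the same three units in a second, different order forces the prefix-keyed cache to re-encode all of them, while the unit-keyed cache reuses every entry. Whether a real model can consume independently encoded units without loss of quality is a separate question that this sketch does not address.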