This article analyzes Jan Chorowski's BDH architecture proposal, which explores embedding LLM memory directly into network weights using sparse high-dimensional key-query spaces as an alternative to traditional KV caches.