post-transformer-models


The interesting BDH question: What if LLM memory lived in the network weights instead of the ever-growing KV cache?

Reddit r/singularity · 2026-05-11

This article analyzes Jan Chorowski's BDH architecture proposal, which explores embedding LLM memory directly into network weights using sparse high-dimensional key-query spaces as an alternative to traditional KV caches.
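To make the contrast in the title concrete, here is a minimal sketch of the general idea, not of BDH itself. It compares a KV cache, which stores one key/value pair per token, with a fixed-size associative memory that is written by outer-product (Hebbian-style) updates to a weight matrix and addressed with sparse keys. The names `sparse_code`, `KVCache`, `FastWeightMemory`, and `demo`, the top-k sparsification, and all dimensions are assumptions for illustration; the actual BDH update rule may differ substantially.

```python
import random


def sparse_code(x, k):
    """Keep the k largest positive entries of x and zero the rest.

    Illustrative stand-in for a sparse high-dimensional key or query.
    """
    top = sorted(range(len(x)), key=lambda i: x[i], reverse=True)[:k]
    keep = {i for i in top if x[i] > 0}
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


class KVCache:
    """Baseline: stores every (key, value) pair, so memory grows per token."""

    def __init__(self):
        self.keys, self.values = [], []

    def write(self, key, value):
        self.keys.append(key)
        self.values.append(value)

    def size(self):
        # Total number of stored floats.
        return sum(len(k) for k in self.keys) + sum(len(v) for v in self.values)


class FastWeightMemory:
    """Fixed-size d_key x d_val matrix updated by outer products (hypothetical)."""

    def __init__(self, d_key, d_val):
        self.W = [[0.0] * d_val for _ in range(d_key)]

    def write(self, key, value):
        # W += outer(key, value); only rows for active key units change.
        for i, ki in enumerate(key):
            if ki != 0.0:
                row = self.W[i]
                for j, vj in enumerate(value):
                    row[j] += ki * vj

    def read(self, query):
        # out = query^T W; only active query units contribute.
        out = [0.0] * len(self.W[0])
        for i, qi in enumerate(query):
            if qi != 0.0:
                for j, wij in enumerate(self.W[i]):
                    out[j] += qi * wij
        return out

    def size(self):
        return len(self.W) * len(self.W[0])


def demo(n_tokens, d_key=16, d_val=4, k=3, seed=0):
    """Write n_tokens random pairs to both memories and return their sizes."""
    rng = random.Random(seed)
    kv, fw = KVCache(), FastWeightMemory(d_key, d_val)
    for _ in range(n_tokens):
        key = sparse_code([rng.gauss(0, 1) for _ in range(d_key)], k)
        value = [rng.gauss(0, 1) for _ in range(d_val)]
        kv.write(key, value)
        fw.write(key, value)
    return kv.size(), fw.size()


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, demo(n))
```

In this sketch the cache grows linearly with the number of tokens, while the weight matrix stays at `d_key * d_val` entries no matter how long the sequence gets. The cost is that writes with overlapping keys interfere with each other, which is one reason sparse, high-dimensional keys are attractive: fewer active units per key means fewer collisions.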

