memory-hierarchy

#memory-hierarchy

@che_shr_cat: 1/ We have been treating GPU memory all wrong. What if the GPU didn't need to store your model at all? MegaTrain enable…

X AI KOLs Timeline · yesterday

MegaTrain enables full-precision training of 100B+ LLMs on a single GPU by treating VRAM as a transient, stateless cache, inverting the memory hierarchy.

OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation

Hugging Face Daily Papers · 2026-06-16

OPD-Evolver proposes a self-evolving agent framework using slow-fast co-evolution and on-policy self-distillation to enhance memory management and policy learning, outperforming existing methods like ReasoningBank and Skill0 across multi-domain benchmarks.

Should worker agents write memory directly? A curator-agent pattern I am testing

Reddit r/AI_Agents · 2026-05-24

The author describes a pattern where worker agents emit structured memory events instead of writing directly to shared memory, using a Memory Curator to validate, deduplicate, and route them to appropriate scopes, aiming to prevent memory pollution in multi-agent systems. They compare this approach to existing frameworks and solicit community feedback.
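
As a rough illustration of the validate, deduplicate, and route steps described above, here is a minimal Python sketch. It is not the author's implementation: the names `MemoryEvent`, `MemoryCurator`, and `ALLOWED_SCOPES`, the three scope labels, and the exact-match deduplication rule are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Hypothetical scope names; the original post does not list its scopes.
ALLOWED_SCOPES = ("task", "project", "global")


@dataclass(frozen=True)
class MemoryEvent:
    """Structured event a worker emits instead of writing to shared memory."""
    worker_id: str
    scope: str
    content: str


@dataclass
class MemoryCurator:
    """Single writer to shared memory: validates, deduplicates, routes."""
    stores: Dict[str, List[str]] = field(
        default_factory=lambda: {scope: [] for scope in ALLOWED_SCOPES}
    )
    seen: Set[Tuple[str, str]] = field(default_factory=set)

    def submit(self, event: MemoryEvent) -> str:
        # Validate: collapse whitespace, reject unknown scopes and empty content.
        text = " ".join(event.content.split())
        if event.scope not in ALLOWED_SCOPES or not text:
            return "rejected"
        # Deduplicate: case-insensitive exact match within the same scope
        # (an assumed rule; a real curator might compare embeddings instead).
        key = (event.scope, text.lower())
        if key in self.seen:
            return "duplicate"
        # Route: append to the store that matches the event's scope.
        self.seen.add(key)
        self.stores[event.scope].append(text)
        return "stored"


curator = MemoryCurator()
first = curator.submit(MemoryEvent("w1", "project", "Use  Postgres 16"))
second = curator.submit(MemoryEvent("w2", "project", "use postgres 16"))
```

In this sketch workers never touch `stores` directly, so every write passes through the same checks, which is the property the pattern relies on to limit memory pollution.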

KV Cache Is Becoming the Memory Hierarchy of Inference

Hacker News Top · 2026-05-17

The article discusses how the KV cache is evolving into a memory hierarchy for LLM inference, optimizing memory management during decoding.

TTKV: Temporal-Tiered KV Cache for Long-Context LLM Inference

arXiv cs.CL · 2026-04-23

TTKV introduces a temporal-tiered KV cache that mimics human memory to cut 128K-context LLM inference latency by 76% and double throughput while reducing cross-tier traffic by 5.94×.
