Tag
Introduces Janus, a plug-in memory controller for LLMs that accepts or rejects each candidate memory update using a Memory Momentum Trigger and a compact hybrid evaluation set, improving average accuracy by 2.7 to 4.6 points across multiple datasets.
Proposes MGUP, a momentum-gradient alignment update policy that selectively updates parameters within each layer during stochastic optimization. It integrates with optimizers such as AdamW, Lion, and Muon, comes with theoretical convergence guarantees, and reports superior performance on large-scale model training tasks.
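
To make the idea of a selective intra-layer update concrete, here is a minimal sketch of one possible selection rule: a coordinate is updated only when its momentum and its current gradient share a sign. This rule, the function name `masked_momentum_step`, and the parameters `lr` and `beta` are assumptions made for illustration. The summary above does not state MGUP's actual alignment criterion, and the sketch uses plain momentum SGD rather than AdamW, Lion, or Muon.

```python
def masked_momentum_step(params, grads, momentum, lr=0.1, beta=0.9):
    """One momentum-SGD step that updates only sign-aligned coordinates.

    Hypothetical illustration: the sign-agreement test below is an assumed
    selection rule, not the rule defined by MGUP.
    """
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Exponential moving average of gradients (the momentum buffer).
        m = beta * m + (1.0 - beta) * g
        # Treat the coordinate as aligned when momentum and gradient agree in sign.
        aligned = m * g > 0.0
        # Aligned coordinates take the step; the rest keep their old value.
        new_params.append(p - lr * m if aligned else p)
        # Momentum is refreshed for every coordinate, aligned or not.
        new_momentum.append(m)
    return new_params, new_momentum


# Two coordinates of one "layer": the first is aligned, the second is not.
params, momentum = masked_momentum_step(
    params=[1.0, 1.0], grads=[1.0, -1.0], momentum=[1.0, 1.0]
)
print(params, momentum)
```

In this toy call only the first coordinate moves, because the second coordinate's gradient points against its momentum. Keeping the momentum buffer up to date for skipped coordinates is a design choice of the sketch, not something the summary specifies.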