arc-agi

Tag · Cards List
#arc-agi

Useful Memories Become Faulty When Continuously Updated by LLMs

arXiv cs.AI · 3h ago

This paper shows that continuously consolidating past experiences into textual memory using LLMs degrades memory utility over time, and that preserving raw episodic trajectories outperforms forced consolidation, with implications for robust agentic memory systems.


Seed IQ-ARC AGI 3: Special behind-the-scenes look at Seed IQ on ARC-AGI 3 games! 14/14 games with a perfect 100% score across all.

Reddit r/ArtificialInteligence · 14h ago

Seed IQ achieves a perfect 14/14 score on ARC-AGI-3 games using an active inference, physics-driven multi-agent autonomous control engine, as shown in a behind-the-scenes video walkthrough.


Useful Memories Become Faulty When Continuously Updated by LLMs

Hugging Face Daily Papers · yesterday

A study finds that continuously updating consolidated memories in LLM-based agentic systems degrades performance, and that retaining raw episodic trajectories is more reliable. Experiments on ARC-AGI show that even GPT-5.4 fails more often after consolidation.


@dylan_works_: Wrote up something fun I’ve been poking at: when LLM agents repeatedly rewrite their own experiences into textual “less…

X AI KOLs Timeline · 4d ago

This research blog post demonstrates that repeatedly rewriting LLM agent experiences into textual "lessons" often degrades performance rather than improving it. The author finds that retaining episodic memories outperforms abstract consolidation across benchmarks such as ARC-AGI and ALFWorld.


11.67% ARC-AGI-2 Local Eval on a Single 4090: The TOPAS Recursive Architecture

Reddit r/LocalLLaMA · 6d ago

The authors present TOPAS, a recursive AI architecture achieving 11.67% on ARC-AGI-2 using a single RTX 4090, aiming to demonstrate that architectural efficiency can outweigh raw compute power.
