DAIR AI's weekly roundup highlights top research papers including HeavySkill, which improves model performance via internalized parallel reasoning, and Sakana AI's Conductor, which uses RL to optimize agent orchestration. It also covers Meta FAIR's work on self-improving pretraining.
This paper presents empirical measurements of information density in web pages from the perspective of LLM agents, using a curated benchmark of 100 URLs across five categories. It finds that structural extraction reduces token count by an average of 71.5% while preserving answer quality, and reveals an undocumented compression layer in Claude Code.
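The paper's exact extraction pipeline isn't reproduced here, but the idea of shrinking a page's token footprint by keeping visible text and dropping markup, scripts, and styles can be sketched with the standard library alone (the sample page and the whitespace token proxy are illustrative assumptions, not the benchmark's tokenizer):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text while skipping <script> and <style> content."""
    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def rough_tokens(text):
    return len(text.split())  # crude whitespace proxy for LLM tokens

# toy page standing in for a real URL from the benchmark
page = """<html><head><style>.nav { color: red; }</style></head>
<body><div class="nav menu wide"><a href="/home">Home</a></div>
<h1>Q3 results</h1><p>Revenue grew 12% year over year.</p>
<script>analytics.track('view');</script></body></html>"""

parser = TextExtractor()
parser.feed(page)
extracted = " ".join(parser.chunks)

raw_tokens, clean_tokens = rough_tokens(page), rough_tokens(extracted)
print(extracted)                  # Home Q3 results Revenue grew 12% year over year.
print(clean_tokens < raw_tokens)  # True: markup stripped away
```

Real pages carry far more boilerplate (navigation, trackers, inline CSS) than this toy example, which is why structural extraction can cut token counts so sharply without touching the answer-bearing text.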
Researchers propose applying the "global ignition" consciousness mechanism from cognitive science to long-context engineering, introducing the MiA-Signature method that uses submodular selection of high-level concepts to cover the activation space. Applied to RAG and agentic systems, it delivers consistent performance improvements across multiple long-context tasks.
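MiA-Signature's internals aren't detailed here, but the selection step it relies on, greedy maximization of a submodular coverage objective, is standard and can be sketched as follows (the concept names and "activation units" are hypothetical toy data):

```python
def greedy_max_coverage(candidates, k):
    """Pick up to k concepts whose covered activation units maximize total coverage.

    Coverage is monotone submodular, so this greedy loop is the classic
    (1 - 1/e)-approximation for the selection step.
    """
    covered, chosen = set(), []
    for _ in range(k):
        best, gain = None, 0
        for name, units in candidates.items():
            if name in chosen:
                continue
            g = len(units - covered)  # marginal coverage gain
            if g > gain:
                best, gain = name, g
        if best is None:  # no remaining concept adds coverage
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered

# toy "activation units" per high-level concept (hypothetical data)
concepts = {
    "finance": {1, 2, 3},
    "dates":   {3, 4},
    "people":  {5, 6, 7},
    "places":  {6, 8},
}
picks, covered = greedy_max_coverage(concepts, k=2)
print(picks)  # ['finance', 'people'] — largest marginal gains first
```

The diminishing-returns property is what makes greedy selection safe here: once "finance" is chosen, "dates" is worth only one new unit, so the second pick goes to the disjoint "people" cluster.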
Karpathy's autoresearch repository has sparked a trend of using agents to train AI models and build state-of-the-art agentic systems, while also highlighting current limitations in LLM-driven hypothesis generation.

EvoTest introduces J-TTL, a benchmark for measuring agent test-time learning capabilities, and proposes an evolutionary framework where an Actor Agent plays games while an Evolver Agent iteratively improves the system's prompts, memory, and hyperparameters without fine-tuning. The method demonstrates superior performance compared to reflection and memory-based baselines on complex text-based games.
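The Actor/Evolver split described above can be sketched as a simple accept-if-better evolutionary loop over the agent's configuration; this is an illustration under stated assumptions (the toy `play_episode` objective and `temperature` knob are invented stand-ins, not EvoTest's actual components):

```python
import random

def evolve_agent(play_episode, mutate_config, base_config, generations=5):
    """Test-time learning in the spirit of EvoTest's Actor/Evolver loop:
    the Actor plays with the current config; the Evolver mutates prompts,
    memory, or hyperparameters and keeps only changes that raise the score."""
    best_cfg, best_score = base_config, play_episode(base_config)
    for _ in range(generations):
        candidate = mutate_config(best_cfg)
        score = play_episode(candidate)
        if score > best_score:  # no gradient updates — config-level evolution only
            best_cfg, best_score = candidate, score
    return best_cfg, best_score

# toy stand-ins for illustration (hypothetical)
def play_episode(cfg):
    return -abs(cfg["temperature"] - 0.3)  # pretend 0.3 is the ideal setting

def mutate_config(cfg):
    return {**cfg, "temperature": cfg["temperature"] + random.uniform(-0.2, 0.2)}

random.seed(0)
cfg, score = evolve_agent(play_episode, mutate_config,
                          {"temperature": 1.0}, generations=20)
print(score >= -0.7)  # True: the evolver never accepts a regression
```

Because the Evolver only ever keeps improvements, performance is monotone across generations, which is what lets the system adapt within a single session without fine-tuning any weights.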
AgentV-RL introduces an Agentic Verifier framework that enhances reward modeling through bidirectional verification, pairing tool-augmented forward and backward agents, and achieves a 25.2% improvement over state-of-the-art outcome reward models (ORMs). The approach addresses error propagation and grounding issues in verifiers for complex reasoning tasks through multi-turn deliberative processes combined with reinforcement learning.
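The forward/backward split can be illustrated on a toy domain (solving a linear equation); the function names and the domain are assumptions for illustration, not the paper's implementation:

```python
def forward_agent(a, b, c):
    """Forward direction: independently re-derive x from a*x + b = c."""
    return (c - b) / a

def backward_agent(a, b, c, x):
    """Backward direction: substitute the candidate x back into the equation."""
    return a * x + b == c

def agentic_verify(a, b, c, candidate_x):
    """Reward only answers that both re-derive and back-substitute correctly.

    Disagreement between the two directions flags the kind of error
    propagation a single one-pass verifier would miss."""
    forward_ok = abs(forward_agent(a, b, c) - candidate_x) < 1e-9
    backward_ok = backward_agent(a, b, c, candidate_x)
    return 1.0 if forward_ok and backward_ok else 0.0

print(agentic_verify(3, 2, 11, 3))  # 1.0: 3*3 + 2 == 11 checks out both ways
print(agentic_verify(3, 2, 11, 4))  # 0.0: rejected in both directions
```

In the full system both agents are tool-augmented LLMs deliberating over multiple turns, but the core reward logic is the same: an answer earns credit only when independent forward derivation and backward consistency checking agree.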
This paper analyzes Claude Code's architecture as an agentic coding tool, identifying five human values and thirteen design principles that inform its implementation, including safety systems, context management, and extensibility mechanisms. The study compares Claude Code with OpenClaw to demonstrate how different deployment contexts lead to different architectural solutions for common AI agent design challenges.
Netomi shares lessons from scaling agentic AI systems in enterprise environments, leveraging GPT-4.1 and GPT-5.2 within a governed execution layer to handle complex, multi-step workflows for Fortune 500 clients like United Airlines and DraftKings. The company demonstrates how proper prompting patterns, concurrency design, and contextual reasoning enable reliable AI agent deployment at production scale.