Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design

Hugging Face Daily Papers 05/15/26, 12:00 AM Papers

ai-agents neural-architecture-search foundation-models transformer mamba self-improvement

Summary

This paper introduces AIRA-Compose and AIRA-Design, dual frameworks using AI agents to autonomously discover neural architectures that outperform standard Transformers and scale efficiently.

Toward recursive self-improvement, we investigate LLM agents autonomously designing foundation models beyond standard Transformers. We introduce a dual-framework approach: AIRA-Compose for high-level architecture search, and AIRA-Design for low-level mechanistic implementation. AIRA-Compose uses 11 agents to explore fundamental computational primitives under a 24-hour budget. Agents evaluate million-parameter candidates, extrapolating top designs to 350M, 1B, and 3B scales. This yields 14 architectures across two families: AIRAformers (Transformer-based) and AIRAhybrids (Transformer-Mamba). Pre-trained at 1B scale, these consistently outperform Llama 3.2 and Composer-found baselines. On downstream tasks, AIRAformer-D and AIRAhybrid-D improve accuracy by 2.4% and 3.8% over Llama 3.2. Furthermore, AIRA-Compose finds models with highly efficient scaling frontiers: AIRAformer-C scales 54% and 71% faster than Llama 3.2 and Composer's best Transformer, while AIRAhybrid-C outscales Nemotron-2 by 23% and Composer's best hybrid by 37%. AIRA-Design tasks 20 agents with writing novel attention mechanisms for long-range dependencies and high-performing training scripts. On the Long Range Arena benchmark, agent-designed architectures reach within 2.3% and 2.6% of human state-of-the-art on document matching and text classification. On the Autoresearch benchmark, Greedy Opus 4.5 achieves 0.968 validation bits-per-byte under a fixed time budget, surpassing the published minimum. Together, these frameworks show AI agents can autonomously discover architectures and algorithmic optimizations matching or surpassing hand-designed baselines. This establishes a powerful paradigm for discovering next-generation foundation models, marking a clear step toward recursive self-improvement.
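
The Autoresearch result above is reported in validation bits-per-byte. As a reminder of what that unit measures, here is a minimal sketch of the standard conversion from a language model's summed negative log-likelihood (in nats) to bits per UTF-8 byte. The function names are hypothetical, and the benchmark's own evaluation code may differ in details such as tokenization or how bytes are counted.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (nats) into bits per byte.

    Dividing by ln(2) converts nats to bits; dividing by the byte count
    normalizes by the raw text length rather than by the token count.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)


def bpb_from_mean_token_loss(mean_loss_nats: float, num_tokens: int, text: str) -> float:
    """Hypothetical helper: start from a mean per-token cross-entropy loss.

    Assumes `mean_loss_nats` was averaged over exactly `num_tokens` tokens
    covering `text`, and that bytes are counted in UTF-8.
    """
    total_bytes = len(text.encode("utf-8"))
    return bits_per_byte(mean_loss_nats * num_tokens, total_bytes)
```

Because the metric normalizes by bytes instead of tokens, it stays comparable across models that use different tokenizers, which is presumably why a fixed-budget benchmark would report it.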

Original Article

Cached at: 05/18/26, 02:23 AM

Source: https://huggingface.co/papers/2605.15871

Abstract

AI agents autonomously design foundation models exceeding standard Transformers through dual frameworks that optimize both architectural search and mechanistic implementation, achieving superior performance and efficiency.

Toward recursive self-improvement, we investigate LLM agents autonomously designing foundation models beyond standard Transformers. We introduce a dual-framework approach: AIRA-Compose for high-level architecture search, and AIRA-Design for low-level mechanistic implementation. AIRA-Compose uses 11 agents to explore fundamental computational primitives under a 24-hour budget. Agents evaluate million-parameter candidates, extrapolating top designs to 350M, 1B, and 3B scales. This yields 14 architectures across two families: AIRAformers (Transformer-based) and AIRAhybrids (Transformer-Mamba). Pre-trained at 1B scale, these consistently outperform Llama 3.2 and Composer-found baselines. On downstream tasks, AIRAformer-D and AIRAhybrid-D improve accuracy by 2.4% and 3.8% over Llama 3.2. Furthermore, AIRA-Compose finds models with highly efficient scaling frontiers: AIRAformer-C scales 54% and 71% faster than Llama 3.2 and Composer’s best Transformer, while AIRAhybrid-C outscales Nemotron-2 by 23% and Composer’s best hybrid by 37%. AIRA-Design tasks 20 agents with writing novel attention mechanisms for long-range dependencies and high-performing training scripts. On the Long Range Arena benchmark, agent-designed architectures reach within 2.3% and 2.6% of human state-of-the-art on document matching and text classification. On the Autoresearch benchmark, Greedy Opus 4.5 achieves 0.968 validation bits-per-byte under a fixed time budget, surpassing the published minimum. Together, these frameworks show AI agents can autonomously discover architectures and algorithmic optimizations matching or surpassing hand-designed baselines. This establishes a powerful paradigm for discovering next-generation foundation models, marking a clear step toward recursive self-improvement.
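
The abstract does not say how top designs are extrapolated from million-parameter candidates to the 350M, 1B, and 3B scales. One common way to do this kind of extrapolation is to fit a power law, loss ≈ a · N^(-b), to small-scale results and evaluate it at a larger parameter count N. The sketch below illustrates that generic technique only; it is an assumption, not the paper's procedure, and the function names and the synthetic numbers in the usage lines are made up for illustration.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of loss = a * size**(-b) in log-log space.

    Returns (a, b). A pure power law ignores any irreducible loss floor,
    so this is a simplification of the scaling-law forms used in practice.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_loss(a, b, size):
    """Evaluate the fitted power law at a new parameter count."""
    return a * size ** (-b)


# Synthetic, illustrative data generated from a known power law (a=10, b=0.1).
small_sizes = [1e6, 2e6, 4e6, 8e6]
small_losses = [10.0 * n ** -0.1 for n in small_sizes]

a, b = fit_power_law(small_sizes, small_losses)
loss_at_1b = predict_loss(a, b, 1e9)  # extrapolated value at 1B parameters
```

Under this kind of fit, comparing the exponent b across candidate architectures gives one rough notion of which design improves faster with scale, though the paper's own definition of "scales faster" may differ.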

Get this paper in your agent:

hf papers read 2605.15871

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

@Kangwook_Lee: https://x.com/Kangwook_Lee/status/2052925157606568217

@anyscalecompute: Most agent frameworks solve orchestration and leave infrastructure completely unresolved. New blog: production-ready AI…

Neurodata Without Boredom: Benchmarking Agentic AI for Data Reuse

@dair_ai: https://x.com/dair_ai/status/2053495521243799717

After using AI agents for a few months, these are my biggest observations
