Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application
Summary
A comprehensive survey on agentic environment engineering for LLMs, covering environment modeling, synthesis, evaluation, and application, with a focus on agent-environment co-evolution.
Source: https://huggingface.co/papers/2606.12191
Abstract
Large language model agents require specialized environments for training and evaluation, which can be categorized by their engineering lifecycle stages and evolved through various paradigms including neural and symbolic approaches.
Environments serve as interactive systems for large language model (LLM) based agents across diverse scenarios and play a crucial role in driving the continual evolution of model capabilities. Despite this importance, existing work lacks a systematic categorization and in-depth analysis. This paper systematically studies current research on agentic environments from the perspective of the environment engineering lifecycle, covering their modeling, synthesis, evaluation, and application. First, the paper introduces representative environments in terms of eight attributes and eight domains, providing detailed analyses of their development paths and highlighting their core capabilities. Second, for automated environment synthesis, it introduces two paradigms, symbolic synthesis and neural synthesis, and presents the environment evaluation methods used in each paradigm. Third, it discusses the corresponding environment applications from the perspective of agent-environment co-evolution. Specifically, the paper characterizes the primary pathways for agent evolution in dynamic environments from four complementary perspectives: memory-centric experience evolution, orchestration-centric workflow evolution, trajectory-centric offline evolution, and exploration-centric online evolution. It also identifies three paradigms of environment evolution: neural-driven, difficulty-driven, and scaling-driven approaches. Finally, several promising future directions are discussed, including Environment-as-a-Service, Multi-agent Environments, and Neural-Symbolic Environments.
Similar Articles
EnvSimBench: A Benchmark for Evaluating and Improving LLM-Based Environment Simulation
This paper introduces EnvSimBench, a benchmark for evaluating Large Language Models' ability to simulate environments for agent training. It identifies a 'state change cliff' in current LLMs and proposes a constraint-driven pipeline to reduce hallucinations and costs.
EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis
EnvScaler is an automated framework for scaling tool-interactive environments for LLM agents through programmatic synthesis, creating 191 diverse environments and 7K scenarios to improve agent performance on multi-turn, multi-tool interactions.
Greener Than Humans? Environmental Attitudes in Large Language Models
This paper develops a benchmark for evaluating environmental attitudes in 31 LLMs, finding they often exhibit progressive environmental views and contextual sensitivity, highlighting issues of steerability and normative reliability in sustainability applications.
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
EnvFactory automates the creation of executable tool environments and natural multi-turn trajectories for training LLMs with agentic reinforcement learning, achieving superior performance on benchmarks like BFCLv3 and MCP-Atlas with fewer environments than prior work.
Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems
This survey paper provides a unified review of LLM-based multi-agent systems, focusing on collaboration, failure attribution, and self-evolution through the LIFE framework, identifying open challenges and proposing a cross-stage research agenda.