Beyond Agent Architecture: Execution Assumptions and Reproducibility in LLM-Based Trading Systems

arXiv cs.AI Papers

Summary

This paper reviews and audits execution realism in LLM-based trading research, proposing clearer reporting standards for reproducibility and evaluation comparability.

arXiv:2606.08285v1. Abstract: Large language models (LLMs) and agentic systems are increasingly proposed for financial trading, yet their reported performance remains difficult to compare because studies vary in data provenance, temporal split discipline, execution timing, turnover treatment, and transaction-cost modeling. This article presents a targeted topical review and reproducibility audit of execution realism in LLM-based trading research. A coded evidence matrix covering 30 trade-relevant primary studies is used to assess point-in-time controls, split transparency, held-out evaluation, cost and turnover treatment, execution semantics, universe definition, and artifact release. Across the audited sample, architecture reporting is generally clearer than the evaluation assumptions needed to judge whether a trading result is economically interpretable or reproducible. A 10-equity worked example is included only as a methodological scaffold to illustrate how explicit friction and timing choices can materially compress active-strategy results. The main conclusion is that the next useful step for LLM trading research is not only better agent design, but also clearer reporting standards for execution realism, reproducibility, and evaluation comparability.

Cached at: 06/09/26, 08:55 AM

# Beyond Agent Architecture: Execution Assumptions and Reproducibility in LLM-Based Trading Systems
Source: [https://arxiv.org/abs/2606.08285](https://arxiv.org/abs/2606.08285)
[View PDF](https://arxiv.org/pdf/2606.08285)

> Abstract: Large language models (LLMs) and agentic systems are increasingly proposed for financial trading, yet their reported performance remains difficult to compare because studies vary in data provenance, temporal split discipline, execution timing, turnover treatment, and transaction-cost modeling. This article presents a targeted topical review and reproducibility audit of execution realism in LLM-based trading research. A coded evidence matrix covering 30 trade-relevant primary studies is used to assess point-in-time controls, split transparency, held-out evaluation, cost and turnover treatment, execution semantics, universe definition, and artifact release. Across the audited sample, architecture reporting is generally clearer than the evaluation assumptions needed to judge whether a trading result is economically interpretable or reproducible. A 10-equity worked example is included only as a methodological scaffold to illustrate how explicit friction and timing choices can materially compress active-strategy results. The main conclusion is that the next useful step for LLM trading research is not only better agent design, but also clearer reporting standards for execution realism, reproducibility, and evaluation comparability.
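
The abstract's point that friction and timing choices can compress active-strategy results can be illustrated with a small sketch. The code below is not the paper's 10-equity worked example. It is a hypothetical single-asset toy in which the function names (`net_returns`, `cumulative_return`), the per-unit-turnover cost model, the one-period execution lag, and all numbers are assumptions chosen only to make the mechanism visible.

```python
from typing import List, Sequence


def net_returns(
    asset_returns: Sequence[float],
    target_positions: Sequence[float],
    cost_bps: float = 0.0,
    lag: int = 0,
) -> List[float]:
    """Per-period strategy returns under explicit cost and timing assumptions.

    target_positions[t] is the position the strategy wants during period t.
    With lag=k, the position actually held in period t is target_positions[t-k]
    (flat before any target is executable). Costs are charged as
    cost_bps per unit of turnover, an assumed linear cost model.
    """
    if len(asset_returns) != len(target_positions):
        raise ValueError("asset_returns and target_positions must align")
    if lag < 0:
        raise ValueError("lag must be non-negative")

    cost_rate = cost_bps / 10_000.0
    previous_position = 0.0
    results: List[float] = []
    for t, period_return in enumerate(asset_returns):
        held = target_positions[t - lag] if t >= lag else 0.0
        turnover = abs(held - previous_position)
        results.append(held * period_return - turnover * cost_rate)
        previous_position = held
    return results


def cumulative_return(returns: Sequence[float]) -> float:
    """Compound a sequence of simple per-period returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


# Synthetic data (not from the paper): an alternating market and a
# deliberately unrealistic "oracle" that always picks the right side.
toy_returns = [0.01, -0.02, 0.015, -0.01, 0.02, -0.005]
oracle_targets = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

frictionless = cumulative_return(net_returns(toy_returns, oracle_targets))
with_costs = cumulative_return(
    net_returns(toy_returns, oracle_targets, cost_bps=10)
)
with_costs_and_lag = cumulative_return(
    net_returns(toy_returns, oracle_targets, cost_bps=10, lag=1)
)

print(f"frictionless:        {frictionless:.4%}")
print(f"10 bps costs:        {with_costs:.4%}")
print(f"10 bps costs, lag 1: {with_costs_and_lag:.4%}")
```

In this toy setup the same signal produces three different headline numbers depending only on the cost and timing assumptions, which is why a study that leaves those assumptions unstated is hard to compare with one that reports them.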

## Submission history

From: Junyi Yao [[view email](https://arxiv.org/show-email/c529af5c/2606.08285)] **[v1]** Sat, 6 Jun 2026 18:14:29 UTC (84 KB)

Similar Articles

Agentic Trading: When LLM Agents Meet Financial Markets

arXiv cs.AI

This paper presents a systematic survey and evidence map of 77 studies on LLM-based trading agents, finding that architectural experimentation is expanding rapidly but evaluation protocols, execution semantics, and reproducibility remain critical bottlenecks.

TradingAgents: Multi-Agents LLM Financial Trading Framework

Papers with Code Trending

This paper introduces TradingAgents, a multi-agent LLM framework that simulates real-world trading firms to improve stock trading performance. It utilizes specialized agents for analysis and risk management, demonstrating superior results in cumulative returns and Sharpe ratio compared to baselines.

Representation Signatures and Risk-Feedback Alignment in LLM Trading Agents

arXiv cs.LG

This paper investigates the behavioral alignment and representation dynamics of LLM agents in financial trading, introducing the TradeArena testbed and finding measurable pre-failure signatures in planning embeddings that can predict drawdowns with high accuracy across multiple frontier models and stress conditions.

QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading

Papers with Code Trending

QuantAgent is a multi-agent LLM framework designed specifically for high-frequency trading, using four specialized agents (Indicator, Pattern, Trend, Risk) to make rapid, risk-aware decisions based on short-horizon signals. In zero-shot evaluations across ten financial instruments including Bitcoin and Nasdaq futures, it outperforms existing neural and rule-based baselines in predictive accuracy and cumulative return.