Counterfactual Optimization of Baseball Pitch Sequences and Estimation of Its Impact on Season-Level Statistics
Summary
This paper trains a Transformer-based model on MLB Statcast data to counterfactually optimize baseball pitch sequences. Its results suggest that optimizing both the final pitch and the preceding setup pitch may improve season-level statistics, including gains of more than 1.0 in K/9.
Cached at: 06/17/26, 05:37 AM
# Counterfactual Optimization of Baseball Pitch Sequences and Estimation of Its Impact on Season-Level Statistics

Source: [https://arxiv.org/abs/2606.17345](https://arxiv.org/abs/2606.17345) [View PDF](https://arxiv.org/pdf/2606.17345)

> Abstract: Although pitch sequencing is a central topic in baseball analytics, previous studies have primarily focused on optimizing the final pitch within a single plate appearance, leaving the role of preceding setup pitches and their impact on long-term season-level performance insufficiently examined. To address these issues, this study conducted counterfactual analyses using MLB Statcast data. A Transformer-based machine-learning model was trained to predict whether a target pitch would result in an in-play outcome or swing-out. Counterfactual pitch sequences were then generated by replacing either the final pitch or the preceding setup pitch with alternative pitch types and locations while keeping the surrounding contextual information fixed. Optimal counterfactual selections were defined as those that minimized the predicted in-play probability, and their expected effects on pitchers' seasonal statistics were estimated using regression models linking model outputs to season statistics. The results suggest that the optimization of both final and setup pitches may substantially influence season-level performance, including improvements of more than 1.0 in K/9. The analyses also provided several practical insights, including velocity-band-specific effective locations, the importance of pitch commands, and the expansion of pitch-selection options through middle-velocity pitches. These findings quantitatively support the strategic importance of pitch sequencing in baseball.

## Submission history

From: Ryota Takamido
**[v1]** Mon, 15 Jun 2026 22:47:06 UTC (2,531 KB)
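
The counterfactual search described in the abstract (swap one pitch for alternative types and locations, hold the rest of the sequence fixed, keep the candidate with the lowest predicted in-play probability) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the `Pitch` encoding, the `PITCH_TYPES` and `ZONES` candidate grids, and especially `toy_in_play_prob`, a hand-written stand-in that is not the paper's Transformer and has no empirical basis.

```python
from itertools import product
from typing import Callable, List, Tuple

# Hypothetical encoding: (pitch_type, zone). Not taken from the paper.
Pitch = Tuple[str, int]

PITCH_TYPES = ["FF", "SL", "CH"]   # assumed candidate pitch types
ZONES = list(range(1, 10))         # assumed 3x3 strike-zone grid, 5 = center


def toy_in_play_prob(seq: List[Pitch]) -> float:
    """Toy stand-in for a trained model; the numbers are made up."""
    ptype, zone = seq[-1]
    prob = {"FF": 0.40, "SL": 0.30, "CH": 0.35}[ptype]
    if zone == 5:
        prob += 0.15               # pretend center-cut pitches get hit more
    if len(seq) >= 2 and seq[-2][0] != ptype:
        prob -= 0.05               # pretend a change of pitch type helps
    return min(max(prob, 0.0), 1.0)


def best_counterfactual(
    seq: List[Pitch],
    index: int,
    predict: Callable[[List[Pitch]], float],
) -> Tuple[List[Pitch], float]:
    """Replace the pitch at `index` with every candidate (type, zone),
    keep all other pitches fixed, and return the sequence that
    minimizes the predicted in-play probability."""
    best_seq, best_p = list(seq), predict(seq)
    for ptype, zone in product(PITCH_TYPES, ZONES):
        candidate = list(seq)
        candidate[index] = (ptype, zone)
        p = predict(candidate)
        if p < best_p:
            best_seq, best_p = candidate, p
    return best_seq, best_p


observed = [("FF", 5), ("FF", 5)]  # setup pitch, then final pitch
final_seq, final_p = best_counterfactual(observed, -1, toy_in_play_prob)
setup_seq, setup_p = best_counterfactual(observed, 0, toy_in_play_prob)
```

With this toy predictor, `index=-1` optimizes the final pitch and `index=0` optimizes the setup pitch while the final pitch stays as thrown. The further step the abstract mentions, regressing season statistics on model outputs, is not shown here.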
Similar Articles
Conditional Attribute Estimation with Autoregressive Sequence Models
This paper introduces Conditional Attribute Transformers, a method for jointly estimating next-token probability and attribute values conditionally, enabling credit assignment, counterfactual analysis, and steerable generation in a single forward pass.
How Faithful Is Trajectory-Based Data Attribution? Error Sources, Remedies, and Practical Guidelines
This paper provides the first systematic analysis of error sources in trajectory-based data attribution methods, identifies optimizer mismatch as the dominant error, proposes AdamW-influence to address it, and offers practical guidelines for data selection via a K-step look-ahead framework.
SkillOpt treats markdown skill files as trainable parameters with proper optimization machinery
A new paper formalizes skill optimization for agents by treating markdown skill files as trainable parameters, using bounded edits validated against holdout sets. The approach transfers well between models and improves performance on procedural benchmarks.
Plan Before You Trade: Inference-Time Optimization for RL Trading Agents
FPILOT is a plugin inference-time optimization framework for RL trading agents that leverages price forecasts without retraining, yielding consistent improvements in returns and risk-adjusted metrics on the TradeMaster DJ30 benchmark.
@Yif_Yang: Introducing SkillOpt — an optimizer for agent skills. Instead of finetuning model weights, we treat a natural-language …
Introducing SkillOpt, an optimizer that treats natural-language skills as trainable external parameters instead of finetuning model weights. It uses bounded edits and validation gating to enable stable, controllable skill updates, achieving best or tied-best results across 52 settings on 6 benchmarks with 7 models.