Machine Learning and the Random Walk Puzzle: Forecasting the CAD/USD Exchange Rate with Expanding Window Evaluation and SHAP Interpretability

arXiv cs.LG Papers

Summary

This paper examines whether ML models can beat the random walk benchmark in forecasting the monthly USD/CAD exchange rate. Only linear regression outperforms the naive model with statistical significance under the Diebold-Mariano test, and SHAP analysis shows that short-term lags dominate predictions.
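To make the lag and SHAP terminology concrete, here is a minimal sketch in plain Python. It builds lag1, lag2, and a rolling-mean feature from a short series, then computes Shapley values for a linear model using the closed form that holds when features are treated as independent: the contribution of feature j is w_j (x_j - mean(x_j)). The toy series, the weights, the intercept, the 3-month window, and the function names are all hypothetical and not taken from the paper. Lags and rolling means are strongly correlated in practice, so the independence assumption is only an approximation here.

```python
def make_features(series, window=3):
    """Build (lag1, lag2, rolling mean of the previous `window` values) rows.

    Every feature for time t uses only values before t, so no future leaks in.
    Requires window >= 2 so that lag2 is available.
    """
    rows, targets = [], []
    for t in range(window, len(series)):
        lag1, lag2 = series[t - 1], series[t - 2]
        roll = sum(series[t - window:t]) / window
        rows.append((lag1, lag2, roll))
        targets.append(series[t])
    return rows, targets


def predict(weights, intercept, row):
    """Linear model: intercept + sum_j w_j * x_j."""
    return intercept + sum(w * x for w, x in zip(weights, row))


def linear_shap(weights, row, background):
    """Shapley values of a linear model under the feature-independence assumption."""
    means = [sum(col) / len(col) for col in zip(*background)]
    return [w * (x - m) for w, x, m in zip(weights, row, means)]


# Hypothetical monthly rates and hypothetical fitted coefficients.
series = [1.30, 1.31, 1.29, 1.32, 1.33, 1.31, 1.34]
X, y = make_features(series, window=3)
weights, intercept = (0.8, 0.1, 0.1), 0.0

phi = linear_shap(weights, X[-1], X)
for name, value in zip(("lag1", "lag2", "roll3"), phi):
    print(f"{name}: {value:+.5f}")
```

The contributions sum to the gap between this row's prediction and the average prediction over the background rows, which is the additivity property that SHAP plots rely on.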

arXiv:2606.15058v1 (announce type: new)

Abstract: This study examines whether machine learning (ML) models can outperform the naive random walk benchmark in forecasting the monthly USD/CAD exchange rate. Using daily data from the Bank of Canada spanning January 2017 to May 2026, resampled into 113 monthly observations, five ML models are evaluated: linear regression, random forest, gradient boosting, XGBoost, and AdaBoost. These models are benchmarked against the naive random walk model and exponential smoothing with Holt-Winters seasonality (ETS). All models are evaluated using an expanding-window framework to maintain strict out-of-sample integrity, and forecast-accuracy differences are assessed using the Diebold-Mariano (DM) test. Structural break detection identifies four significant breakpoints in the series, corresponding to the escalation of the US-China trade war in 2018, the COVID-19 economic recovery in 2020, the peak of the Bank of Canada rate-hiking cycle in 2022, and the start of the Bank of Canada rate-cutting cycle in 2024. SHAP, or Shapley Additive Explanations, analysis is applied to interpret the drivers of the best-performing ML model. The results show that the naive random walk model remains a formidable benchmark. Linear regression is the only model that statistically outperforms the naive random walk model, with a DM statistic of 3.0585 and a p value of 0.0071, whereas the ML ensemble models show only marginal differences. Random Forest with an expanding-window framework achieves the lowest MAPE of 1.17 percent among all models except the random walk. SHAP analysis confirms that short-term lags, particularly lag1 and lag2, and recent rolling means dominate predictions, consistent with the near-random-walk behavior of exchange rates.
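The expanding-window evaluation mentioned in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The series is synthetic (a simulated random walk of 113 points), the model is a simple lag-1 regression standing in for the paper's richer feature set, and `min_train=60` is an arbitrary choice. At each step the model is refit on all data up to, but not including, the month being forecast, and the naive random walk forecast is simply the last observed value.

```python
import random


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def fit_ols(x, y):
    """Closed-form simple linear regression y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def expanding_window_forecasts(series, min_train):
    """One-step-ahead forecasts where the fit at step t only sees series[:t]."""
    actual, naive, linreg = [], [], []
    for t in range(min_train, len(series)):
        history = series[:t]                        # grows by one point each step
        a, b = fit_ols(history[:-1], history[1:])   # regress y_s on y_{s-1}
        actual.append(series[t])
        naive.append(history[-1])                   # random walk: tomorrow = today
        linreg.append(a + b * history[-1])
    return actual, naive, linreg


# Synthetic stand-in for the monthly rate (assumption: a pure random walk).
random.seed(0)
rates = [1.30]
for _ in range(112):
    rates.append(rates[-1] + random.gauss(0.0, 0.01))

actual, naive, linreg = expanding_window_forecasts(rates, min_train=60)
print(f"naive random walk MAPE: {mape(actual, naive):.2f}%")
print(f"lag-1 regression MAPE:  {mape(actual, linreg):.2f}%")
```

Because the training slice never includes the target month, each forecast is strictly out of sample, which is the property the expanding-window design is meant to guarantee.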


# Machine Learning and the Random Walk Puzzle: Forecasting the CAD/USD Exchange Rate with Expanding Window Evaluation and SHAP Interpretability
Source: [https://arxiv.org/abs/2606.15058](https://arxiv.org/abs/2606.15058)
[View PDF](https://arxiv.org/pdf/2606.15058)

> Abstract: This study examines whether machine learning (ML) models can outperform the naive random walk benchmark in forecasting the monthly USD/CAD exchange rate. Using daily data from the Bank of Canada spanning January 2017 to May 2026, resampled into 113 monthly observations, five ML models are evaluated: linear regression, random forest, gradient boosting, XGBoost, and AdaBoost. These models are benchmarked against the naive random walk model and exponential smoothing with Holt-Winters seasonality (ETS). All models are evaluated using an expanding-window framework to maintain strict out-of-sample integrity, and forecast-accuracy differences are assessed using the Diebold-Mariano (DM) test. Structural break detection identifies four significant breakpoints in the series, corresponding to the escalation of the US-China trade war in 2018, the COVID-19 economic recovery in 2020, the peak of the Bank of Canada rate-hiking cycle in 2022, and the start of the Bank of Canada rate-cutting cycle in 2024. SHAP, or Shapley Additive Explanations, analysis is applied to interpret the drivers of the best-performing ML model. The results show that the naive random walk model remains a formidable benchmark. Linear regression is the only model that statistically outperforms the naive random walk model, with a DM statistic of 3.0585 and a p value of 0.0071, whereas the ML ensemble models show only marginal differences. Random Forest with an expanding-window framework achieves the lowest MAPE of 1.17 percent among all models except the random walk. SHAP analysis confirms that short-term lags, particularly lag1 and lag2, and recent rolling means dominate predictions, consistent with the near-random-walk behavior of exchange rates.
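For readers unfamiliar with the Diebold-Mariano test, the sketch below shows its basic one-step-ahead form with squared-error loss. Several choices here are assumptions rather than details from the paper: the sign convention (a positive statistic means the candidate model has lower loss than the benchmark), the standard normal approximation for the p-value, and the toy numbers. The paper may use a small-sample correction or a different reference distribution, so this sketch is not expected to reproduce its reported p-value.

```python
import math


def diebold_mariano(actual, benchmark, model):
    """Basic DM test for one-step-ahead forecasts with squared-error loss.

    Returns (dm_statistic, two_sided_p_value). The loss differential is
    benchmark loss minus model loss, so dm > 0 favors the model.
    Assumes the differentials are not all identical (non-zero variance).
    """
    d = [(a - b) ** 2 - (a - m) ** 2 for a, b, m in zip(actual, benchmark, model)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)   # sample variance
    dm = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|dm|)) = erfc(|dm| / sqrt(2)).
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))
    return dm, p_value


# Toy forecasts (hypothetical numbers, for illustration only).
actual = [1.0, 2.0, 3.0, 4.0]
benchmark = [1.2, 2.3, 2.9, 4.4]
model = [1.1, 2.1, 3.0, 4.1]

dm, p = diebold_mariano(actual, benchmark, model)
print(f"DM statistic: {dm:.4f}, p-value: {p:.4f}")
```

For multi-step horizons the variance term would normally include autocovariances of the loss differential; the version above omits them because it covers only the one-step case.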

## Submission history

From: Edmund Agyemang \[[view email](https://arxiv.org/show-email/e13c952d/2606.15058)\] **\[v1\]** Sat, 13 Jun 2026 02:13:18 UTC (1,675 KB)
