When Tabular Foundation Models Meet Strategic Tabular Data: A Prior Alignment Approach

arXiv cs.AI Papers

Summary

This paper studies whether tabular foundation models based on pretrained prior-data fitted networks (PFNs) can generalize to strategic tabular data where individuals modify features after deployment. It proposes Strategic Prior-data Fitted Network (SPN), an inference-time framework that aligns PFN predictions with the post-manipulation distribution without retraining.

arXiv:2605.19662v1 (announce type: new)

Abstract: Tabular foundation models based on pretrained prior-data fitted networks (PFNs) have shown strong generalization on diverse tabular tasks, but they are typically designed for non-strategic settings where data distributions are independent of deployed classifiers. In many real-world decision scenarios, however, individuals may strategically modify their features after deployment to obtain favorable outcomes, inducing a post-deployment distribution shift. This paper studies whether PFN-style tabular foundation models can generalize to such strategic tabular data. We show that strategic manipulation creates a mismatch between the non-strategic prior learned during pretraining and the post-manipulation strategic prior, which leads to systematic prediction bias. To address this issue, we propose Strategic Prior-data Fitted Network (SPN), an inference-time strategy-aware framework that adapts tabular foundation models to strategic environments without retraining. SPN constructs strategic in-context examples to approximate post-manipulation inputs and aligns PFN predictions with the induced strategic distribution. Experiments on real-world and synthetic tabular datasets show that SPN consistently improves robustness and predictive performance under strategic manipulation compared with both tabular foundation models and classical tabular methods.
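
To make the idea of "strategic in-context examples" concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a linear surrogate classifier (w, b), an L2 manipulation cost, and a fixed per-individual budget; none of these are specified in the abstract. The function name best_response and all parameters are hypothetical. Each row that is currently rejected moves just across the decision boundary if the move fits within the budget, which is one common way to model strategic feature manipulation.

```python
import numpy as np


def best_response(X, w, b, budget):
    """Simulate strategic manipulation against a linear rule sign(X @ w + b).

    Assumption (not from the paper): individuals pay an L2 cost and only
    move when reaching the boundary costs no more than `budget`.
    Returns the manipulated features and a boolean mask of rows that moved.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    norm = np.linalg.norm(w)
    if norm == 0:
        raise ValueError("w must be non-zero")

    scores = X @ w + b
    # L2 distance to the boundary for rows on the negative side.
    dist = -scores / norm
    moved = (scores < 0) & (dist <= budget)

    X_new = X.copy()
    # Step along the normal direction, slightly past the boundary.
    X_new[moved] += (dist[moved] + 1e-9)[:, None] * (w / norm)
    return X_new, moved


# Hypothetical context set: three individuals, two features.
X_ctx = np.array([[0.0, 0.0], [-5.0, 0.0], [2.0, 0.0]])
y_ctx = np.array([0, 0, 1])  # true labels are assumed unchanged by manipulation

X_ctx_strategic, moved = best_response(X_ctx, w=[1.0, 0.0], b=-1.0, budget=2.0)
```

Under these assumptions, the pair (X_ctx_strategic, y_ctx) would then be supplied as the in-context examples to a PFN-style model in place of the original (X_ctx, y_ctx), so that the context resembles post-manipulation inputs. How SPN actually constructs its examples and aligns predictions is described in the paper itself.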

Similar Articles

TabPFN-3: Technical Report

arXiv cs.LG

TabPFN-3 is a new foundation model for tabular data, pretrained on synthetic data, that scales to 1M training rows while reducing training and inference time, achieving state-of-the-art performance on tabular prediction, time series, and relational data.

PriorLabs/TabPFN

GitHub Trending (daily)

TabPFN, from PriorLabs, is a foundation model designed specifically for tabular data.