temporal-feedback

Learn to Match: Two-Sided Matching with Temporally Extended Feedback

arXiv cs.LG · 5d ago

This paper introduces a framework for two-sided matching with temporally extended feedback, formulating it as a partially observable Markov game with costly screening, noisy observations, and evolving latent profiles. The authors present Learn2Match, a multi-agent reinforcement learning benchmark, and show that independent PPO outperforms bandit baselines in social welfare but incurs higher information-friction loss.
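To make the ingredients named in the summary concrete, here is a minimal toy sketch of a two-sided market with costly screening, noisy observations, and drifting latent profiles. It is not the Learn2Match API or the paper's model: the class `ToyMatchingEnv`, its parameters (`screen_cost`, `obs_noise`, `drift`), the scalar profiles, and the distance-based compatibility function are all assumptions made up for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyMatchingEnv:
    """Hypothetical toy environment; every name and formula here is an assumption."""
    n_per_side: int = 3
    screen_cost: float = 0.1  # assumed price of one screening action
    obs_noise: float = 0.5    # assumed std of Gaussian noise on screening signals
    drift: float = 0.05       # assumed std of per-step drift in latent profiles
    seed: int = 0
    rng: random.Random = field(init=False)
    left: list = field(init=False)
    right: list = field(init=False)

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        # Latent profiles are hidden scalars, one per agent on each side.
        self.left = [self.rng.uniform(-1, 1) for _ in range(self.n_per_side)]
        self.right = [self.rng.uniform(-1, 1) for _ in range(self.n_per_side)]

    def true_value(self, i, j):
        # Assumed compatibility: higher when the two latent profiles are close.
        return 1.0 - abs(self.left[i] - self.right[j])

    def screen(self, i, j):
        # Costly screening: pay screen_cost to receive a noisy signal of the value.
        signal = self.true_value(i, j) + self.rng.gauss(0.0, self.obs_noise)
        return signal, -self.screen_cost

    def step(self, matching):
        # Social welfare is taken here as the sum of true values over matched pairs.
        welfare = sum(self.true_value(i, j) for i, j in matching)
        # Latent profiles then drift, so earlier screening signals go stale.
        self.left = [x + self.rng.gauss(0.0, self.drift) for x in self.left]
        self.right = [x + self.rng.gauss(0.0, self.drift) for x in self.right]
        return welfare
```

In this sketch an agent would call `env.screen(i, j)` to buy a noisy estimate before committing, then pass a list of `(i, j)` pairs to `env.step(...)`. Because the profiles drift after every step, information gathered early loses value over time, which is the kind of trade-off between screening cost and match quality that the summary describes.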
