plackett-luce

#plackett-luce

Distributionally Robust Listwise Preference Optimization

arXiv cs.AI · 14h ago

This paper proposes a distributionally robust listwise preference optimization method for LLM alignment that handles ranking-label uncertainty, with a tractable objective and strong convergence guarantees.
