plackett-luce

Distributionally Robust Listwise Preference Optimization

arXiv cs.AI · 14h ago

This paper proposes a distributionally robust listwise preference optimization method for LLM alignment that handles ranking-label uncertainty, with a tractable objective and strong convergence guarantees.
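For readers unfamiliar with the tag, the following is a minimal sketch of the standard Plackett-Luce negative log-likelihood that listwise preference objectives are commonly built on. It is not the paper's robust objective, which this summary does not specify; the function name `plackett_luce_nll` and its arguments are illustrative assumptions.

```python
import math


def plackett_luce_nll(scores, ranking):
    """Negative log-likelihood of a full ranking under the Plackett-Luce model.

    scores  : list of real-valued scores, one per candidate (hypothetical
              stand-in for per-response rewards or log-ratios).
    ranking : list of candidate indices ordered from most to least preferred.
    """
    nll = 0.0
    for k, item in enumerate(ranking):
        # Candidates still "in the race" at position k, including the one chosen.
        tail = [scores[j] for j in ranking[k:]]
        # Numerically stable log-sum-exp over the remaining candidates.
        m = max(tail)
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        # Log-probability that `item` is picked first among the remaining ones.
        nll -= scores[item] - log_norm
    return nll


if __name__ == "__main__":
    # Three candidates with equal scores: every ordering is equally likely.
    print(plackett_luce_nll([0.0, 0.0, 0.0], [0, 1, 2]))
```

With equal scores, each of the 3! orderings has probability 1/6, so the value equals log 6. A distributionally robust variant would presumably replace the single observed ranking with a worst case over an uncertainty set of rankings, but that construction is not described here and is not shown.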
