Tag
This paper proposes a distributionally robust listwise preference optimization method for LLM alignment that handles ranking-label uncertainty, with a tractable objective and strong convergence guarantees.