direct-preference-optimization

#direct-preference-optimization

Spurious Correlation Learning in Preference Optimization: Mechanisms, Consequences, and Mitigation via Tie Training

arXiv cs.LG · 17h ago

This paper analyzes spurious correlation learning in preference optimization methods like DPO, identifying mechanisms such as mean spurious bias and causal-spurious leakage. It proposes 'tie training' using equal-utility preference pairs as a mitigation strategy to reduce reliance on spurious features without degrading causal learning.
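As background for the summary above: DPO's implicit reward for a response is the policy-vs-reference log-ratio, and the standard loss maximizes the margin between chosen and rejected responses. A minimal sketch of that loss, plus one *plausible* reading of "tie training" (pushing the implicit reward margin of an equal-utility pair toward zero — the paper's exact tie objective is not given in this summary, so `tie_loss` is an assumption for illustration):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one preference pair.

    logp_*     : policy log-prob of the chosen (w) / rejected (l) response
    ref_logp_* : reference-model log-probs of the same responses
    """
    # Implicit reward margin: beta * difference of log-ratios.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log sigmoid(margin): small when the chosen response is clearly preferred.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

def tie_loss(logp_a, logp_b, ref_logp_a, ref_logp_b, beta=0.1):
    """Hypothetical tie objective (assumption, not the paper's formula):
    for an equal-utility pair, penalize any nonzero implicit reward margin,
    so neither response's spurious features are rewarded over the other's."""
    margin = beta * ((logp_a - ref_logp_a) - (logp_b - ref_logp_b))
    return margin ** 2
```

With a zero margin, `dpo_loss` equals log 2 (the chance-level value), and it decreases as the chosen response's log-ratio pulls ahead of the rejected one's.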


$\xi$-DPO: Direct Preference Optimization via Ratio Reward Margin

arXiv cs.LG · 17h ago

This paper introduces $\xi$-DPO, a novel preference optimization method that reformulates the objective to minimize distance to optimal ratio reward margins, addressing hyperparameter tuning challenges in SimPO. Experimental results show that $\xi$-DPO outperforms existing methods on open benchmarks.
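For context on the SimPO baseline the summary mentions: SimPO scores each response by a length-normalized log-probability and requires a fixed target reward margin `gamma`, a hyperparameter whose tuning $\xi$-DPO reportedly addresses. A minimal sketch of the SimPO loss (the $\xi$-DPO objective itself is not specified in this summary, so it is not shown):

```python
import math

def simpo_loss(logp_w, len_w, logp_l, len_l, beta=2.0, gamma=0.5):
    """SimPO loss for one preference pair.

    logp_* : summed log-prob of the chosen (w) / rejected (l) response
    len_*  : response length in tokens (used for length normalization)
    gamma  : fixed target reward margin -- the hyperparameter in question
    """
    # Length-normalized implicit rewards (no reference model needed).
    reward_w = beta * logp_w / len_w
    reward_l = beta * logp_l / len_l
    # Penalize pairs whose reward gap falls short of the target margin gamma.
    margin = reward_w - reward_l - gamma
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Note that with `gamma > 0`, even a pair with equal rewards incurs a loss above log 2, which is why the choice of `gamma` matters in practice.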


Confidence-Aware Alignment Makes Reasoning LLMs More Reliable

arXiv cs.AI · 2d ago

This paper introduces CASPO, a framework for aligning token-level confidence with step-wise logical correctness in large reasoning models using iterative Direct Preference Optimization. It also proposes Confidence-aware Thought (CaT) for dynamically pruning uncertain reasoning branches during inference to improve reliability and efficiency.
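To make the pruning idea concrete: one simple way to "dynamically prune uncertain reasoning branches" is to drop candidate branches whose average step confidence falls below a threshold. The summary does not give CaT's actual rule, so the threshold test and data layout below are purely illustrative assumptions:

```python
def prune_branches(branches, tau=0.7):
    """Hypothetical confidence-aware pruning (illustration only, not CaT's
    actual rule): keep a candidate reasoning branch only if the mean of its
    per-step confidences reaches the threshold tau.

    branches : list of (branch_text, [step confidences in 0..1]) tuples
    """
    kept = []
    for text, confs in branches:
        mean_conf = sum(confs) / len(confs)
        if mean_conf >= tau:
            kept.append(text)  # confident branch survives to further decoding
    return kept
```

The efficiency gain comes from never spending inference tokens extending branches that are already low-confidence.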
