direct-preference-optimization

#direct-preference-optimization

Emergent Alignment

arXiv cs.AI · 2026-06-20

This paper introduces Emergent Alignment, a self-supervised method that endows LLMs with a conscience step to review their own outputs and uses Direct Preference Optimization to steer away from unethical behavior, enabling online alignment without external judges.

Direct Preference Optimization for Chatbot Fine-Tuning: An Empirical Study

arXiv cs.CL · 2026-06-12

This paper presents an empirical study of Direct Preference Optimization (DPO) for fine-tuning a large language model, showing that DPO simplifies the training pipeline and achieves competitive performance while addressing training instability.

Direct Preference Optimization Beyond Chatbots

Hugging Face Blog · 2026-06-03

This post applies Direct Preference Optimization (DPO) to OCR tasks, beyond its usual chatbot setting, and reports a significant reduction in text degeneration across multiple model families, averaging 59.4%.

Curriculum Learning for Safety Alignment

arXiv cs.LG · 2026-05-27

This paper proposes Staged-Competence, a curriculum learning framework for DPO-based safety alignment that organizes preference data by difficulty, improving robustness and data efficiency while preserving general capabilities.

Direct Preference Optimization for English-Mandarin Code-Switching Speech Recognition in Audio LLMs

arXiv cs.CL · 2026-05-26

This paper applies Direct Preference Optimization (DPO) to align Audio LLMs for transcribing English-Mandarin code-switching speech, achieving up to 89.6% MER reduction in-distribution and 20% out-of-distribution. It identifies three failure modes (language omission, translation instead of transcription, and hallucination) and shows that preference-based alignment effectively elicits correct code-switching behavior from multilingual Audio LLMs.

Token-weighted Direct Preference Optimization with Attention

arXiv cs.CL · 2026-05-22

This paper proposes AttentionPO, a token-weighted Direct Preference Optimization method that uses the LLM's own attention to estimate token weights, improving alignment performance on AlpacaEval, MT-Bench, and ArenaHard without requiring a separate reward model.

Spurious Correlation Learning in Preference Optimization: Mechanisms, Consequences, and Mitigation via Tie Training

arXiv cs.LG · 2026-05-13

This paper analyzes spurious correlation learning in preference optimization methods like DPO, identifying mechanisms such as mean spurious bias and causal-spurious leakage. It proposes 'tie training' using equal-utility preference pairs as a mitigation strategy to reduce reliance on spurious features without degrading causal learning.

$\xi$-DPO: Direct Preference Optimization via Ratio Reward Margin

arXiv cs.LG · 2026-05-13

This paper introduces $\xi$-DPO, a preference optimization method that reformulates the objective to minimize the distance to optimal ratio reward margins, addressing the hyperparameter tuning challenges of SimPO. The reported experiments show $\xi$-DPO outperforming existing methods on open benchmarks.

DocAtlas: Multilingual Document Understanding Across 80+ Languages

Hugging Face Daily Papers · 2026-05-12

DocAtlas is a framework that creates high-fidelity OCR datasets and benchmarks across 82 languages, using differential rendering and synthetic generation. It demonstrates that Direct Preference Optimization improves multilingual model adaptation without degrading base-language performance.

Confidence-Aware Alignment Makes Reasoning LLMs More Reliable

arXiv cs.AI · 2026-05-11

This paper introduces CASPO, a framework for aligning token-level confidence with step-wise logical correctness in large reasoning models using iterative Direct Preference Optimization. It also proposes Confidence-aware Thought (CaT) for dynamically pruning uncertain reasoning branches during inference to improve reliability and efficiency.
