ambiguous-labels

SHALA-LLM: Smartly Handling Ambiguous Labels in Aligning LLMs

arXiv cs.LG · 3d ago

SHALA-LLM is a reinforcement learning framework that enables LLMs to learn directly from annotator distributions and dynamically prioritize highly ambiguous samples during alignment, improving agreement with human label distributions and classification performance.
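To make the two ideas in the summary concrete, here is a minimal sketch of one plausible reading: turn raw annotator votes into a label distribution, then weight samples by how ambiguous that distribution is. This is not the paper's method. The function names, the use of normalized Shannon entropy as the ambiguity score, and the proportional weighting are all assumptions made for illustration; the summary does not specify how SHALA-LLM measures ambiguity or how it uses it in the reinforcement learning objective.

```python
import math
from collections import Counter


def label_distribution(votes, labels):
    """Convert one sample's annotator votes into a probability distribution.

    `votes` is a non-empty list of labels chosen by annotators;
    `labels` fixes the order of the output vector.
    """
    if not votes:
        raise ValueError("each sample needs at least one annotator vote")
    counts = Counter(votes)
    return [counts[label] / len(votes) for label in labels]


def normalized_entropy(dist):
    """Shannon entropy scaled to [0, 1]: 0 = full agreement, 1 = uniform split.

    Assumption: entropy stands in for "ambiguity"; the source does not
    say which measure SHALA-LLM uses.
    """
    if len(dist) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in dist if p > 0)
    return entropy / math.log(len(dist))


def ambiguity_weights(all_votes, labels):
    """Hypothetical prioritization: weights proportional to ambiguity.

    Falls back to uniform weights when every sample has full agreement.
    """
    scores = [normalized_entropy(label_distribution(v, labels)) for v in all_votes]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [score / total for score in scores]


if __name__ == "__main__":
    labels = ["positive", "negative"]
    votes_per_sample = [
        ["positive", "positive", "positive", "positive"],  # unanimous
        ["positive", "positive", "negative", "negative"],  # evenly split
    ]
    print(ambiguity_weights(votes_per_sample, labels))
```

Under these assumptions the unanimous sample receives no weight and the evenly split sample receives all of it. A real training setup would more likely blend such a score with a baseline weight so that unambiguous samples are still seen.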
