preference-aggregation

Hidden Consensus: Preference-Validity Compression in Human Feedback

arXiv cs.CL · yesterday

This paper argues that standard RLHF, by scalarizing human preferences, collapses multiple valid interpretations into a single target and so mis-measures alignment in culturally plural societies. Analyzing a Malaysian dataset, the authors find that 79% of prompts have multiple majority-supported responses, which single-winner aggregation discards.

What Do People Actually Want From AI? Mapping Preference Plurality

arXiv cs.CL · 3d ago

This paper analyzes 1,500 open-ended responses from 75 countries and finds that people hold diverse, often conflicting preferences for AI. Truthfulness is the only widely demanded value (49%), yet respondents define it in incompatible ways. The paper argues that current RLHF methods flatten these plural preferences into universal reward models, perpetuating epistemic violence.

Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing

arXiv cs.LG · 2026-05-26

This paper proposes a truthful online preference-aggregation mechanism for LLM fine-tuning in mobile crowdsourcing. The mechanism addresses strategic misreporting by workers and achieves sublinear regret.
