Tag
This paper introduces a Pareto-guided teacher alignment method for fair personalized text generation, aiming to balance multiple objectives in language model outputs.
This paper proposes PAFO, a Pareto fairness optimization framework that mitigates personalized reward bias in LLM reward models, improving accuracy for minority user groups without harming majority groups.