Pareto-Guided Teacher Alignment for Fair Personalized Text Generation

arXiv cs.CL Papers

Summary

This paper introduces a Pareto-guided teacher alignment method for fair personalized text generation, aiming to balance multiple objectives in language model outputs.

arXiv:2606.10126v1 Announce Type: new Abstract: Personalized persuasive text generation can improve relevance and engagement, but demographic conditioning may also introduce unequal framing across groups. We study fairness mitigation in personalized generation as a constrained multi-objective alignment problem: reduce demographic disparities while preserving personalization fidelity. We propose a Pareto-guided teacher alignment framework that combines revision-based candidate generation, pair-aware feasibility gating, Pareto-style candidate selection, and optional preference optimization through supervised fine-tuning and direct preference optimization. We evaluate the framework on climate change and vaccination persuasion tasks using a controlled context-rich demographic grid with matched gender and age pairs and a unified five-audit evaluation suite spanning persuasion bias, formality disparity, emotional framing disparity, lexical association disparity, and personalization fidelity. Across both domains and cross-family transfer settings, no single alignment strategy dominates all objectives simultaneously. Instead, methods occupy different regions of a fairness-personalization Pareto frontier: some achieve stronger disparity reductions, while others better preserve personalization or demographic stability. Our results show that fairness mitigation effects are objective-dependent and transfer inconsistently across domains and model families, motivating bounded-regression, multi-audit model selection over single-metric optimization for fairness-sensitive personalized generation.
Original Article
View Cached Full Text

Cached at: 06/10/26, 06:10 AM

# Pareto-Guided Teacher Alignment for Fair Personalized Text Generation
Source: [https://arxiv.org/abs/2606.10126](https://arxiv.org/abs/2606.10126)
Bibliographic Tools

## Bibliographic and Citation Tools

Bibliographic Explorer Toggle

Code, Data, Media

## Code, Data and Media Associated with this Article

Demos

## Demos

Related Papers

## Recommenders and Search Tools

About arXivLabs

## arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website\.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy\. arXiv is committed to these values and only works with partners that adhere to them\.

Have an idea for a project that will add value for arXiv's community?[**Learn more about arXivLabs**](https://info.arxiv.org/labs/index.html)\.

Similar Articles

Our approach to alignment research

OpenAI Blog

OpenAI outlines their alignment research approach, highlighting reinforcement learning from human feedback (RLHF) as their primary technique for aligning deployed language models like InstructGPT. They discuss achieving significant preference over 100x larger models while using minimal compute, but acknowledge current limitations and propose a long-term strategy of using AI systems to accelerate alignment research beyond what humans can achieve alone.