Tag
Proposes a multi-objective reinforcement learning framework combining semantic embeddings with Pareto-DQN to balance engagement, diversity, and fairness in recommendations, mitigating filter bubbles.
Proposes a multimodal framework for fair Mild Cognitive Impairment detection from speech, using unlearning via gradient reversal to reduce demographic bias and improve performance across subgroups.
Introduces P²CE, a model-agnostic algorithm for generating plausible Pareto-optimal counterfactual explanations that balances feasibility, plausibility, and computational efficiency using an isolation forest outlier detector and SHAP values.
A new benchmark called StylisticBias systematically evaluates attribute-level social bias in multimodal large language models, finding that a small set of visual cues like fashion style drive most biases.
This paper systematically compares equitable tokenizers for multilingual LLMs across 11 Southeast Asian languages, finding that Parity-aware BPE achieves the best efficiency-equity trade-off and that cross-lingual fairness and tokenization efficiency are not fundamentally at odds.
This paper proposes an α-Fair Individual Solvent Premium (α-FISP) framework for insurance pricing that balances actuarial fairness and solidarity fairness while ensuring solvency, using constrained optimization to yield a continuum of pricing solutions.
Polar is a 4,026-instance multiple-choice benchmark for evaluating political bias in LLMs across U.S. and South Korean political contexts, measuring bias through option-level likelihoods. Experiments on 38 LLMs show systematic bias patterns varying by political context, issue category, and presentation language.
Introduces Face-Fairness (FF), a plug-and-play framework for bias mitigation in deepfake detection, featuring Face-Feature Tuning (FFT) as the first demographic label-free fairness method that improves group accuracy and reduces performance gaps across demographics.
Investigates how self-supervised speech recognition models encode speaker group information (gender, age, dialect, ethnicity, native speaker status) across layers, and how finetuning for tasks like ASR or speaker identification affects this encoding.
This paper introduces a Pareto-guided teacher alignment method for fair personalized text generation, aiming to balance multiple objectives in language model outputs.
This paper proposes PAFO, a Pareto fairness optimization framework to mitigate personalized reward bias in reward models for LLMs, improving accuracy for minority user groups without harming majority groups.
This paper introduces AI-MASLD, a stress-audit framework for medical LLMs that reveals how benchmark accuracy can hide serious safety failures, and demonstrates that open-weight models can match or exceed proprietary ones on safety dimensions.
Aquifer is an MCP runtime that provides bounded queues, fairness controls, and dynamic pacing to handle rate limits and traffic spikes in AI agent systems. It also introduces the Aqueduct Protocol for dynamic flow state communication.
The paper proposes treating fairness as a symmetry operation in machine learning classifiers, implementing loss-based regularization to enforce invariance under swapping of sensitive attributes while holding merit features fixed. The framework achieves over 90% bias reduction with minimal accuracy loss and requires no causal graph knowledge.
This large-scale study of 3.4 million job applicants across 156 employers reveals that algorithmic monocultures in hiring algorithms from a single vendor cause racial disparities and systemic rejections, with 25.87% of Black applicants and 14.74% of Asian applicants adversely impacted.
Researchers from the University of Amsterdam propose a tabular reinforcement learning approach to the Metro Network Expansion Problem, showing it achieves comparable performance to Deep RL while reducing training episodes by 18x and carbon emissions by 12x on average. The method also incorporates social equity criteria and is evaluated on real-world metro networks in Xi'an and Amsterdam.
This paper investigates the impact of demographic bias (sex and age) on skin lesion classification using ResNet models, finding that sex biases stem from data imbalances while age biases consistently favor younger groups, and evaluating multi-task and adversarial learning mitigation strategies.
This paper investigates how LLMs produce different outcomes based on conversational context, finding that topic, rather than explicit user demographics, is the primary driver of disparities in high-stakes scenarios like salary advice.
This paper presents a multi-domain red teaming framework for evaluating safety, robustness, and fairness of medical LLMs across 690 clinically grounded scenarios. Results show that high aggregate accuracy can mask critical failures, and hybrid evaluation with clinician oversight is necessary for credible safety assessment.
Introduces TrustLDM, a comprehensive benchmark for evaluating safety, privacy, and fairness of Language Diffusion Models, revealing that their alignment degrades with malicious post contexts. Proposes an automatic evaluation framework, TrustLDM-Auto, to identify vulnerable configurations.