COFT: Counterfactual-Conformal Decoding for Fair Chain-of-Thought Reasoning in Large Language Models
Summary
COFT (Chain of Fair Thought) is a training-free decoding method that combines token-level fairness control with split-conformal calibration to reduce bias in the chain-of-thought reasoning of large language models. It reduces standard bias metrics by 30-55% (median 38%) at a computational overhead of at most 11%.
# COFT: Counterfactual-Conformal Decoding for Fair Chain-of-Thought Reasoning in Large Language Models

Source: [https://arxiv.org/abs/2605.30641](https://arxiv.org/abs/2605.30641) [View PDF](https://arxiv.org/pdf/2605.30641)

> Abstract: Large language models (LLMs) can reveal and amplify societal biases during chain-of-thought (CoT) generation. We present COFT (Chain of Fair Thought), a training-free decoding method that applies token-level fairness control at decode time, with distribution-free marginal validity guarantees (under exchangeability) for any frozen causal language model. COFT operates in three stages. First, it creates a masked counterfactual prompt by replacing sensitive spans with neutral tokens. Second, it compares the factual and masked logit distributions through lightweight logit fusion to attenuate attribute-driven biases. Third, it uses dual-branch split-conformal calibration to certify per-step candidate token sets at a user-chosen risk level. We evaluate COFT across six models and multiple bias benchmarks. Our method reduces standard bias metrics by 30-55% (median 38%) while preserving task utility and language quality. Reasoning accuracies remain unchanged within run-to-run noise margins. The computational overhead is modest, equivalent to one additional cached forward pass (<=11%). COFT offers a clear, auditable path to safer CoT generation with significant bias reduction, negligible utility loss, and no requirement for retraining, auxiliary classifiers, or weight access.

## Submission history

From: Arya Fayyazi
**[v1]** Thu, 28 May 2026 22:52:15 UTC (2,107 KB)
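
The three stages named in the abstract can be illustrated with a small, dependency-free Python sketch. The abstract does not give the paper's formulas, so the choices here are assumptions made for illustration: the helper names (`mask_sensitive`, `fuse_logits`, `conformal_threshold`, `candidate_set`), the `[MASK]` neutral token, the convex-interpolation fusion rule with weight `alpha`, and the nonconformity score `1 - p(token)`. The threshold follows the textbook split-conformal quantile rule on a single branch. The paper's dual-branch construction is not reproduced.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_sensitive(prompt, sensitive_spans, neutral="[MASK]"):
    """Stage 1 (illustrative): build a counterfactual prompt by replacing
    each sensitive span with a neutral token. Plain substring replacement
    is an assumption; a real system would operate on token spans."""
    for span in sensitive_spans:
        prompt = prompt.replace(span, neutral)
    return prompt


def fuse_logits(factual, masked, alpha=0.5):
    """Stage 2 (assumed rule): interpolate factual and masked logits.
    alpha=0 keeps the factual logits; alpha=1 uses only the masked branch."""
    return [(1 - alpha) * f + alpha * m for f, m in zip(factual, masked)]


def conformal_threshold(cal_scores, risk):
    """Stage 3: textbook split-conformal quantile, i.e. the
    ceil((n + 1) * (1 - risk))-th smallest calibration score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - risk))
    if k > n:
        return float("inf")  # too few calibration points: keep every token
    return sorted(cal_scores)[k - 1]


def candidate_set(probs, threshold):
    """Keep token ids whose nonconformity score 1 - p is within threshold."""
    return [i for i, p in enumerate(probs) if 1 - p <= threshold]


if __name__ == "__main__":
    # Toy three-token vocabulary with made-up logits for both prompts.
    factual_logits = [2.0, 0.5, 0.1]
    masked_logits = [1.0, 1.0, 0.1]
    fused = fuse_logits(factual_logits, masked_logits, alpha=0.5)
    probs = softmax(fused)
    # Made-up calibration scores (1 - p of the reference token per example).
    tau = conformal_threshold([0.7, 0.1, 0.3, 0.2, 0.6, 0.5, 0.4], risk=0.25)
    print(mask_sensitive("The nurse said she was tired.", ["she"]))
    print(candidate_set(probs, tau))
```

In this sketch the decoder would sample only from the token ids returned by `candidate_set`. Under exchangeability of calibration and test scores, the quantile rule gives the set a marginal coverage of at least `1 - risk` for the reference token, which is the kind of guarantee the abstract refers to.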
Similar Articles
Long-Context Reasoning Through Proxy-Based Chain-of-Thought Tuning
Proposes ProxyCoT, a training framework that improves long-context reasoning in large language models by first obtaining chain-of-thought reasoning traces on short proxy contexts (via reinforcement learning or distillation) and then grounding them in full long contexts through supervised fine-tuning. Experiments show consistent improvements over baselines with reduced computational cost.
Thinking Before Constraining: A Unified Decoding Framework for Large Language Models
A new hybrid decoding framework called In-Writing is proposed, which delays constraint application until after a trigger token, combining free-form reasoning with structured generation for improved accuracy in classification and reasoning tasks.
Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding
CoRD is a collaborative multi-teacher decoding framework that synthesizes reasoning trajectories through predictive perplexity scoring and beam search, enabling efficient distillation of large reasoning models with high-quality outputs and generalized performance.
Many-Shot CoT-ICL: Making In-Context Learning Truly Learn
This paper investigates many-shot chain-of-thought in-context learning for reasoning tasks, finding that standard scaling rules do not transfer and proposing Curvilinear Demonstration Selection (CDS) to improve demonstration ordering, with gains of up to 5.42 percentage points.
Less Languages, Less Tokens: An Efficient Unified Logic Cross-lingual Chain-of-Thought Reasoning Framework
UL-XCoT introduces a unified logic space to prune low-quality multilingual reasoning paths, cutting token cost by more than 50% while improving accuracy and robustness on low-resource languages.