Tag
The paper identifies a failure mode where predictors collapse to a point on unidentified counterfactual couplings and proposes a framework using a positive semidefinite coupling kernel to bound counterfactuals, showing that prediction cannot represent uncertainty over cross-world couplings and that enforcing kernel constraints yields tractable bounds.
Introduces CICL, a decision-aware context layer that selects and compresses evidence for tool-using LLM agents by treating context as a decision-time intervention, using counterfactual-inspired scoring and typed memory cards under a token budget. Experiments on SWE-bench and RepoBench show concrete gains in retrieval accuracy and action criticality.
This paper introduces the Causal Sensitivity Score (CSS), an interventional metric that evaluates whether clinical LLMs and agents appropriately update their recommendations when patient inputs change along clinically meaningful dimensions. It reveals hidden capability profiles not captured by standard coverage-based metrics, exposing safety blind spots and structural responsiveness deficits.
COFT is a training-free decoding method that applies token-level fairness control and conformal calibration to reduce bias in chain-of-thought reasoning of large language models, achieving 30-55% bias reduction with minimal computational overhead.
Introduces CAFE, a benchmark for evaluating whether promptable segmentation models truly understand concepts by using counterfactual attribute manipulation, revealing that accurate mask prediction does not guarantee faithful semantic grounding.
CiPO is a novel framework for machine unlearning in Large Reasoning Models that uses iterative preference optimization with counterfactual reasoning traces to selectively remove unwanted knowledge while preserving reasoning abilities. The method addresses the challenge of unlearning in models that rely on chain-of-thought reasoning by generating logically valid alternative reasoning paths during training.