counterfactual

TokenMem: Faithful Knowledge Injection for Frozen LLMs

arXiv cs.AI · 6h ago

TokenMem injects knowledge into frozen LLMs through a dedicated cross-attention channel and trains a thin gating adapter with a two-phase curriculum. Given counterfactual knowledge, it achieves 69-70% knowledge compliance (KC), compared with 20-52% for vanilla RAG.

Concept-based Visual Counterfactual Explanations with Diffusion Models

arXiv cs.AI · 6h ago

Introduces C-VCE, a diffusion framework that builds an interpretable concept bottleneck layer into the generative model, enabling human-guided visual counterfactual explanations without relying on external noise-robust classifiers.

Resist and Update: Counterfactual Report Coordinates for Incentive-Compatible LLMs

arXiv cs.AI · 2026-07-15

This paper introduces a method for ensuring LLMs report their true beliefs by using counterfactual report coordinates that resist pressure but remain responsive to genuine evidence. The approach achieves high performance on a benchmark, demonstrating a causal certificate for internal incentive compatibility.

Counterfactual Residual Data Augmentation for Regression

arXiv cs.LG · 2026-06-30

Proposes Counterfactual Residual Data Augmentation (CRDA) for tabular regression, leveraging residual invariance under feature perturbations to generate realistic training samples, achieving significant MSE reduction on benchmarks.

Counterfactual Optimization of Baseball Pitch Sequences and Estimation of Its Impact on Season-Level Statistics

arXiv cs.LG · 2026-06-17

This paper uses a Transformer-based model on MLB Statcast data to counterfactually optimize baseball pitch sequences, finding that optimizing both final and setup pitches can improve season-level statistics like K/9 by over 1.0.
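For readers unfamiliar with the statistic, K/9 is strikeouts per nine innings pitched. The sketch below shows only that standard formula; the function name and the pitcher totals are hypothetical and are not taken from the paper, and the paper's optimization model is not reproduced here.

```python
def strikeouts_per_nine(strikeouts: int, innings_pitched: float) -> float:
    """Return K/9: strikeouts scaled to a nine-inning game."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9.0 * strikeouts / innings_pitched


# Hypothetical season totals, chosen only to illustrate a 1.0 change in K/9.
baseline = strikeouts_per_nine(180, 180.0)
improved = strikeouts_per_nine(200, 180.0)
print(f"K/9 change: {improved - baseline:.1f}")
```

With these made-up totals, a K/9 gain of 1.0 over 180 innings corresponds to 20 additional strikeouts.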

A Definition of Good Explanations and the Challenges Explaining LLM Outputs

arXiv cs.AI · 2026-06-16

This paper proposes a definition of good explanations based on counterfactuals and prior beliefs, and discusses the inherent difficulties in explaining LLM outputs under this definition.

WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds

arXiv cs.AI · 2026-06-10

The paper identifies a failure mode in which predictors collapse to a single point on unidentified counterfactual couplings, and it proposes a framework that bounds counterfactuals with a positive semidefinite coupling kernel. It shows that prediction alone cannot represent uncertainty over cross-world couplings and that enforcing kernel constraints yields tractable bounds.

Decision-Aware Memory Cards: Counterfactual-Inspired Context Selection and Compression for Tool-Using LLM Agents

arXiv cs.AI · 2026-06-09

Introduces CICL, a decision-aware context layer that selects and compresses evidence for tool-using LLM agents by treating context as a decision-time intervention, using counterfactual-inspired scoring and typed memory cards under a token budget. Experiments on SWE-bench and RepoBench show concrete gains in retrieval accuracy and action criticality.

Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents

arXiv cs.LG · 2026-06-01

This paper introduces the Causal Sensitivity Score (CSS), an interventional metric that evaluates whether clinical LLMs and agents appropriately update their recommendations when patient inputs change along clinically meaningful dimensions. It reveals hidden capability profiles not captured by standard coverage-based metrics, exposing safety blind spots and structural responsiveness deficits.

COFT: Counterfactual-Conformal Decoding for Fair Chain-of-Thought Reasoning in Large Language Models

arXiv cs.CL · 2026-06-01

COFT is a training-free decoding method that applies token-level fairness control and conformal calibration to reduce bias in the chain-of-thought reasoning of large language models, achieving a 30-55% bias reduction with minimal computational overhead.

From Pixels to Concepts: Do Segmentation Models Understand What They Segment?

Hugging Face Daily Papers · 2026-05-10

Introduces CAFE, a benchmark for evaluating whether promptable segmentation models truly understand concepts by using counterfactual attribute manipulation, revealing that accurate mask prediction does not guarantee faithful semantic grounding.

CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization

arXiv cs.CL · 2026-04-20

CiPO is a novel framework for machine unlearning in Large Reasoning Models that uses iterative preference optimization with counterfactual reasoning traces to selectively remove unwanted knowledge while preserving reasoning abilities. The method addresses the challenge of unlearning in models that rely on chain-of-thought reasoning by generating logically valid alternative reasoning paths during training.
