counterfactual

#counterfactual

WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds

arXiv cs.AI · yesterday

The paper identifies a failure mode in which predictors collapse to a single point on unidentified counterfactual couplings. It proposes a framework that uses a positive semidefinite coupling kernel to bound counterfactuals, showing that prediction alone cannot represent uncertainty over cross-world couplings and that enforcing kernel constraints yields tractable bounds.

Decision-Aware Memory Cards: Counterfactual-Inspired Context Selection and Compression for Tool-Using LLM Agents

arXiv cs.AI · 2d ago

Introduces CICL, a decision-aware context layer that selects and compresses evidence for tool-using LLM agents by treating context as a decision-time intervention, using counterfactual-inspired scoring and typed memory cards under a token budget. Experiments on SWE-bench and RepoBench show concrete gains in retrieval accuracy and action criticality.

Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents

arXiv cs.LG · 2026-06-01

This paper introduces the Causal Sensitivity Score (CSS), an interventional metric that evaluates whether clinical LLMs and agents appropriately update their recommendations when patient inputs change along clinically meaningful dimensions. It reveals hidden capability profiles not captured by standard coverage-based metrics, exposing safety blind spots and structural responsiveness deficits.

COFT: Counterfactual-Conformal Decoding for Fair Chain-of-Thought Reasoning in Large Language Models

arXiv cs.CL · 2026-06-01

COFT is a training-free decoding method that applies token-level fairness control and conformal calibration to reduce bias in chain-of-thought reasoning of large language models, achieving 30-55% bias reduction with minimal computational overhead.

From Pixels to Concepts: Do Segmentation Models Understand What They Segment?

Hugging Face Daily Papers · 2026-05-10

Introduces CAFE, a benchmark for evaluating whether promptable segmentation models truly understand concepts by using counterfactual attribute manipulation, revealing that accurate mask prediction does not guarantee faithful semantic grounding.

CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization

arXiv cs.CL · 2026-04-20

CiPO is a novel framework for machine unlearning in Large Reasoning Models that uses iterative preference optimization with counterfactual reasoning traces to selectively remove unwanted knowledge while preserving reasoning abilities. The method addresses the challenge of unlearning in models that rely on chain-of-thought reasoning by generating logically valid alternative reasoning paths during training.
