causal-reasoning

#causal-reasoning

Causal-Audit: Explicit and Auditable Graph-based Reasoning via Target-Aware Causal Chain Construction

arXiv cs.AI ↗ · 4d ago Cached

Proposes Causal-Audit, a framework for explicit and auditable causal reasoning in LLMs using target-aware causal graph construction and path-level evidence aggregation, outperforming existing methods on benchmarks.


CausalDS: Benchmarking Causal Reasoning in Data-Science Agents

arXiv cs.AI ↗ · 2026-07-10 Cached

Introduces CausalDS, a benchmark for evaluating causal reasoning in LLM-based data science agents, using synthetic structural causal models and natural language stories to test associational, interventional, and counterfactual reasoning along with tool use and abstention.
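The three query types named in this summary (associational, interventional, counterfactual) can be illustrated on a toy structural causal model. The sketch below is not taken from CausalDS; the graph (Z -> X, Z -> Y, X -> Y), the coefficients, and the function names are all assumptions chosen only to show how the three answers differ when a confounder is present.

```python
import random

def sample(rng, do_x=None):
    """Draw one unit from a hypothetical SCM: Z -> X, Z -> Y, X -> Y."""
    z = rng.gauss(0, 1)                      # confounder
    u_x = rng.gauss(0, 1)                    # exogenous noise for X
    u_y = rng.gauss(0, 1)                    # exogenous noise for Y
    x = z + u_x if do_x is None else do_x    # do(X=x) removes the Z -> X edge
    y = 2.0 * x + 3.0 * z + u_y              # true causal effect of X on Y is 2
    return x, y

def slope(pairs):
    """Ordinary least-squares slope of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var = sum((x - mx) ** 2 for x, _ in pairs)
    return cov / var

def mean_y(rng, do_x, n):
    """Monte Carlo estimate of E[Y | do(X = do_x)]."""
    return sum(sample(rng, do_x)[1] for _ in range(n)) / n

def counterfactual_y(x_obs, y_obs, z_obs, x_new):
    """Abduction, action, prediction for one observed unit (Z assumed observed)."""
    u_y = y_obs - 2.0 * x_obs - 3.0 * z_obs  # abduction: recover the unit's noise
    return 2.0 * x_new + 3.0 * z_obs + u_y   # action and prediction

rng = random.Random(0)
n = 20000

# Associational: regression slope, biased upward by the confounder Z
# (analytically 2 + 3 * Cov(X, Z) / Var(X) = 3.5 in this model).
assoc = slope([sample(rng) for _ in range(n)])

# Interventional: average effect of setting X from 0 to 1 (analytically 2).
ate = mean_y(rng, 1.0, n) - mean_y(rng, 0.0, n)

# Counterfactual: what Y would have been for this unit had X been 0.
cf = counterfactual_y(x_obs=1.0, y_obs=6.0, z_obs=1.0, x_new=0.0)

print(f"association slope ~ {assoc:.2f}, ATE ~ {ate:.2f}, counterfactual Y = {cf}")
```

An agent that reports the regression slope when asked an interventional question would overstate the effect here, which is the kind of confusion such a benchmark is meant to expose.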


Are Text-to-Image Models Inductivist Turkeys? A Counterfactual Benchmark for Causal Reasoning

Hugging Face Daily Papers ↗ · 2026-06-23 Cached

This paper introduces CF-World, a counterfactual benchmark to evaluate whether text-to-image models rely on causal reasoning or mere pattern matching. Experiments show all models degrade sharply in counterfactual settings, suggesting their understanding is limited to tightly coupled visual-textual patterns rather than genuine causal reasoning.


Causal Discovery in the Era of Agents

Hugging Face Daily Papers ↗ · 2026-06-22 Cached

This paper argues that language model agents should assist causal discovery workflows by providing contextual support and explanations rather than generating causal conclusions themselves, and introduces the causal-learn+ platform to demonstrate this principle.


Vernier: Probing Representational Misalignment Behind Lexical Gaps in Causal Reasoning

arXiv cs.CL ↗ · 2026-06-16 Cached

This paper investigates why instruction-tuned language models give different answers to causal reasoning questions when variable names are replaced with placeholders, finding that the issue stems from representational misalignment rather than information loss. The authors introduce Vernier, a method using paired-view weight updates and mechanism inspection to reveal that answer-relevant content is still present in the placeholder view but misaligned.


Causal Object-Centric Models for Planning with Monte Carlo Tree Search

arXiv cs.AI ↗ · 2026-06-15 Cached

COMET is a model-based reinforcement learning algorithm that combines a frozen object-centric encoder with a transformer-based world model and Monte Carlo Tree Search, using causal attention to focus on task-relevant objects, achieving higher scores on visual RL benchmarks.


WISE: A Long-Horizon Agent in Minecraft with Why-Which Reasoning

arXiv cs.AI ↗ · 2026-06-12 Cached

WISE proposes a long-horizon agent framework for Minecraft that enhances low-level controllers with a Causal Event Graph for episodic memory, enabling robust recall under viewpoint changes and opportunistic task reordering via causal reasoning. It also features a multi-scale progressive exploration strategy and demonstrates improved success and efficiency on long-horizon sparse tasks.


Trivium: Temporal Regret as a First-Class Objective for Causal-Memory Controllers

arXiv cs.AI ↗ · 2026-06-04 Cached

This paper proposes 'Trivium,' a framework that introduces long-horizon temporal regret and epistemic regret as first-class objectives alongside outcome regret for causal-memory controllers in agentic LLM systems. The authors prove that outcome-only learning cannot distinguish causal from spurious structure without an intervention channel, while their approach achieves O(log E) temporal regret on CausalBench-Seq experiments versus linear growth for baselines.


Discrete-WAM: Unified Discrete Vision-Action Token Editing for World-Policy Learning

Hugging Face Daily Papers ↗ · 2026-06-04 Cached

Introduces Discrete-WAM, a unified discrete latent vision-action world policy that enables compositional causal reasoning and counterfactual reasoning in autonomous driving through aligned discrete tokens and a shared discrete diffusion framework.


PropLLM: Propagation-Aware Scene Reconstruction for Network Fault Diagnosis

arXiv cs.AI ↗ · 2026-06-02 Cached

PropLLM integrates hop-by-hop scene reconstruction with LLMs for network fault diagnosis. It uses a dual-layer knowledge graph and a temporal causal propagation attention mechanism to trace back along propagation paths, improving accuracy and reducing hallucinations.


Evaluating Bivariate Causal Statements Based on Mutual Compatibility

arXiv cs.AI ↗ · 2026-06-02 Cached

This paper introduces compatibility and incompatibility scores for evaluating collections of bivariate causal statements without relying on faithfulness, and demonstrates their applicability by analyzing causal claims from large language models.


BEAMS: Benchmarking and Evaluating AI for Modeling and Simulation

arXiv cs.AI ↗ · 2026-05-29 Cached

The BEAMS Initiative presents a benchmark suite for evaluating AI tools in modeling and simulation, focusing on human-centered and responsible AI practices. Tests reveal variability across LLM-based engines, with better performance in qualitative tasks than causal reasoning.


SVI-Bench: A Dynamic Microworld for Strategic Video Intelligence

Hugging Face Daily Papers ↗ · 2026-05-29 Cached

Introduces SVI-Bench, a large-scale benchmark for strategic video intelligence using team sports, designed to evaluate models on dynamic scene understanding, causal reasoning, strategic simulation, and agentic synthesis. The benchmark reveals a capability cliff where models perform well on perceptual tasks but sharply degrade on higher-level strategic reasoning.


Why We Need World Models for AGI: Where LLMs Fail and How World Models May Outperform

arXiv cs.AI ↗ · 2026-05-26 Cached

This paper argues that large language models struggle with causal reasoning and long-horizon planning due to a mismatch between sequence prediction and reasoning over latent environment dynamics, and introduces the Latent Dynamics Inference perspective along with the Flux environment to study these limitations.


Why scaling alone will not give us rational AI

Reddit r/ArtificialInteligence ↗ · 2026-05-18

This article argues that fundamental architectural limitations, not scaling deficits, prevent current LLMs from achieving true rationality—the ability to recognize and switch frames—citing empirical failures like the reversal curse and frame-transfer issues, and suggests that scaling alone may not bridge this gap.


Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning

arXiv cs.AI ↗ · 2026-05-18 Cached

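This paper proposes and evaluates a class of world models for causal reasoning called event-graph substrates, which answer counterfactual queries by deterministic replay over typed RDF event logs. They outperform baseline models on multiple benchmarks while guaranteeing inspectability and replay consistency.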
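The idea of answering a counterfactual query by deterministically replaying an edited event log can be sketched in a few lines. This is a toy illustration only: it uses plain Python records rather than RDF, and the `Event` type, the account-balance domain, and the `replay` and `counterfactual` helpers are hypothetical names that do not come from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str     # hypothetical event type: "deposit" or "withdraw"
    amount: int

def replay(log):
    """Fold the log into a final balance; the same log always yields the same result."""
    balance = 0
    for event in log:
        if event.kind == "deposit":
            balance += event.amount
        elif event.kind == "withdraw" and event.amount <= balance:
            balance -= event.amount   # withdrawals that would overdraw are rejected
    return balance

def counterfactual(log, index, replacement=None):
    """Replay the log with the event at `index` replaced, or dropped if None."""
    edited = list(log[:index])
    if replacement is not None:
        edited.append(replacement)
    edited.extend(log[index + 1:])
    return replay(edited)

log = [Event("deposit", 50), Event("withdraw", 30), Event("withdraw", 40)]

factual = replay(log)                    # the second withdrawal is rejected
without_first = counterfactual(log, 1)   # "what if the first withdrawal never happened?"

print(factual, without_first)
```

Because the replay function is pure, every counterfactual answer can be re-derived and inspected step by step from the edited log, which is the property the summary calls replay consistency.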


@berryxia: Folks, Google's latest paper has completely flipped the fundamental logic of time series forecasting. All previous models were stuck on historical data: whatever the curve does, predict that. Nexus says: prediction needs not just history, but "event context". The real reasons behind the numbers—policies, sudden events, macro trends, local shocks...

X AI KOLs Timeline ↗ · 2026-05-18 Cached

Google's new paper Nexus proposes transforming time series forecasting from statistical extrapolation to multi-agent reasoning, improving prediction accuracy via event context, achieving an 86.6% reduction in MAPE on the Zillow dataset.
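For readers unfamiliar with the metric, MAPE (mean absolute percentage error) and a relative reduction in it can be computed as below. The numbers are made up for illustration and are not from the Nexus paper or the Zillow dataset; the function names are likewise assumptions.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def relative_reduction(baseline, improved):
    """Percentage by which `improved` lowers the `baseline` error."""
    return 100.0 * (baseline - improved) / baseline

# Illustrative values only.
actual = [100.0, 200.0, 400.0]
baseline_forecast = [110.0, 180.0, 440.0]   # each forecast is off by 10%
improved_forecast = [101.0, 198.0, 404.0]   # each forecast is off by 1%

baseline_mape = mape(actual, baseline_forecast)
improved_mape = mape(actual, improved_forecast)
reduction = relative_reduction(baseline_mape, improved_mape)

print(f"baseline {baseline_mape:.1f}%, improved {improved_mape:.1f}%, reduction {reduction:.1f}%")
```

Note that a reduction figure of this kind is relative to the baseline error, so it says nothing about the absolute error level without the baseline MAPE alongside it.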


ReplaySCM: A Benchmark for Executable Causal Mechanism Induction from Interventions

arXiv cs.LG ↗ · 2026-05-12 Cached

This article introduces ReplaySCM, a benchmark designed to evaluate language models' ability to induce executable causal mechanisms from interventional evidence, focusing on semantic replay behavior rather than syntactic matches.


On Semantic Loss Fine-Tuning Approach for Preventing Model Collapse in Causal Reasoning

arXiv cs.LG ↗ · 2026-05-08 Cached

This paper identifies a critical 'model collapse' issue in standard fine-tuning for causal reasoning and proposes a semantic loss function with graph-based logical constraints to prevent it.
