counterfactual-learning

#counterfactual-learning

An Introduction to Causal Reinforcement Learning

arXiv cs.AI · 18h ago

This paper introduces causal reinforcement learning (CRL), unifying causal inference and reinforcement learning under a structural causal model framework, and explores novel learning settings such as generalized policy learning and counterfactual learning.

#counterfactual-learning

@Ankur_Samanta_: New work on credit assignment in multi-step reasoning RL post-training Introducing Self-Reset Policy Optimization (SRPO…

X AI KOLs Timeline · 2d ago

Self-Reset Policy Optimization (SRPO) addresses credit assignment in multi-step reasoning RL post-training by localizing the first wrong reasoning step and learning from counterfactual continuations without external supervision.
