reasoning-optimization

Thoughts-as-Planning: Latent World Models for Chain-of-Thoughts Optimization via Reinforcement Planning

arXiv cs.CL · 2026-05-29

Introduces Thoughts-as-Planning, a framework that treats chain-of-thought optimization as a sequential decision-making problem, using latent world models and reinforcement learning. The authors report that it outperforms existing methods in efficiency and generalization.
