process-supervision


Internalizing Outcome Supervision into Process Supervision: A New Paradigm for Reinforcement Learning for Reasoning

arXiv cs.LG · 2d ago

Introduces IOP, a framework that internalizes outcome supervision into process supervision for reasoning reinforcement learning, enabling fine-grained credit assignment without external annotations.


ATTNPO: Attention-Guided Process Supervision for Efficient Reasoning

arXiv cs.CL · 2026-04-20

ATTNPO is an attention-guided process supervision framework that reduces overthinking in large reasoning models by using intrinsic attention signals for step-level credit assignment. It reports improved performance with shorter reasoning lengths across 9 benchmarks.


Improving mathematical reasoning with process supervision

OpenAI Blog · 2023-05-31

OpenAI demonstrates that process supervision, which rewards each intermediate reasoning step rather than only the final answer, improves mathematical reasoning while reducing alignment costs. The approach produces more interpretable, human-aligned reasoning without sacrificing model performance.
