training-data-quality

Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces

arXiv cs.AI · 2026-05-29

This paper investigates answer-correct long chain-of-thought (CoT) training traces in which reasoning that continues after the conclusion has been reached reduces the trace's training utility. It proposes a diagnostic method, HarmfulContinuationCut (HCC), to detect such harmful continuations.
