PolitNuggets is a multilingual benchmark that evaluates how well Large Reasoning Models, run within agentic frameworks, discover and synthesize long-tail political facts when constructing biographies for 400 global elites. The benchmark introduces evaluation protocols such as FactNet and reveals that current systems struggle with fine-grained details and efficiency.
This paper investigates safety failures in Large Reasoning Models where harmful content appears in reasoning traces despite safe final answers, proposing an adaptive multi-principle steering method to mitigate these risks.
CiPO is a framework for machine unlearning in Large Reasoning Models that uses iterative preference optimization with counterfactual reasoning traces to selectively remove unwanted knowledge while preserving reasoning abilities. To address the challenge of unlearning in models that rely on chain-of-thought reasoning, the method generates logically valid alternative reasoning paths during training.