#reasoning-models

Tag · Cards List

BitCal-TTS: Bit-Calibrated Test-Time Scaling for Quantized Reasoning Models

arXiv cs.AI · yesterday

This paper introduces BitCal-TTS, a runtime controller that improves accuracy and reduces premature halting in quantized reasoning models by calibrating confidence signals during test-time scaling.
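The card's mechanism — rescaling a raw confidence signal before deciding whether to halt sampling — can be illustrated with a minimal sketch. This is not the paper's actual BitCal-TTS method: the temperature-scaling calibration, the `temperature` value, and the `threshold` are all assumptions chosen for illustration.

```python
# Hypothetical sketch: confidence-calibrated halting during test-time scaling.
# Quantized models can emit overconfident raw scores; deflating them before
# the halting check avoids stopping prematurely.
import math

def calibrate(raw_confidence: float, temperature: float = 2.0) -> float:
    """Temperature-scale a raw confidence in (0, 1) via its logit."""
    logit = math.log(raw_confidence / (1.0 - raw_confidence))
    return 1.0 / (1.0 + math.exp(-logit / temperature))

def should_halt(raw_confidence: float, threshold: float = 0.9) -> bool:
    """Stop sampling further reasoning attempts only if the calibrated
    confidence clears the bar."""
    return calibrate(raw_confidence) >= threshold

# An overconfident raw score of 0.95 deflates to about 0.81, below the
# 0.9 bar, so sampling continues instead of halting prematurely.
print(should_halt(0.95))  # False
```

Any monotone recalibration would serve the same role; temperature scaling is used here only because it is a standard, compact choice.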

Evaluation Awareness in Language Models Has Limited Effect on Behaviour

arXiv cs.CL · yesterday

This paper investigates whether verbalized evaluation awareness (VEA) in large reasoning models causally affects their behavior on safety, alignment, moral reasoning, and political opinion benchmarks. The authors find that VEA has limited behavioral impact, with near-zero effects from injecting VEA and small shifts from removing it, suggesting that high VEA rates should not be taken as strong evidence of strategic behavior or alignment tampering.

@dbreunig: Reasoning models are great at understanding nuance and natural language. This nuance hasn't trickled down to retrieval …

X AI KOLs Following · 3d ago

A tweet highlights that while reasoning models excel at nuance and natural language understanding, this capability hasn't translated to retrieval systems, pointing to a key bottleneck in AI.

When Safety Fails Before the Answer: Benchmarking Harmful Behavior Detection in Reasoning Chains

arXiv cs.CL · 2026-04-22

Researchers introduce HarmThoughts, a benchmark with 56,931 annotated sentences from 1,018 reasoning traces to evaluate harmful behavior emergence step-by-step, revealing that current detectors miss nuanced unsafe reasoning transitions.

TEMPO: Scaling Test-time Training for Large Reasoning Models

Hugging Face Daily Papers · 2026-04-21

TEMPO introduces a test-time training framework that alternates policy refinement with critic recalibration to prevent diversity collapse and sustain performance gains in large reasoning models, boosting AIME 2024 scores for Qwen3-14B from 42.3% to 65.8%.
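The alternating schedule the summary describes can be sketched as a simple control loop. Everything below is a stub: the phase functions, round count, and log format are assumptions standing in for TEMPO's actual policy-update and critic-refit steps, which the card does not detail.

```python
# Hypothetical control-flow sketch of test-time training that interleaves
# policy refinement with critic recalibration. Refreshing the critic after
# every policy change is the mechanism the summary credits with preventing
# diversity collapse.
def test_time_train(num_rounds: int = 3) -> list:
    log = []

    def refine_policy(i: int) -> None:
        log.append(f"policy:{i}")   # stand-in for a policy-update step

    def recalibrate_critic(i: int) -> None:
        log.append(f"critic:{i}")   # stand-in for re-fitting the critic

    for i in range(num_rounds):
        refine_policy(i)
        recalibrate_critic(i)
    return log

print(test_time_train(2))  # ['policy:0', 'critic:0', 'policy:1', 'critic:1']
```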

When to Trust Tools? Adaptive Tool Trust Calibration For Tool-Integrated Math Reasoning

arXiv cs.CL · 2026-04-20

This paper introduces Adaptive Tool Trust Calibration (ATTC), a framework that improves tool-integrated reasoning models by enabling them to adaptively decide when to trust or ignore tool results based on code confidence scores. The approach addresses the "Tool Ignored" problem where models incorrectly dismiss correct tool outputs, achieving 4.1-7.5% performance improvements across multiple models and datasets.
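The decision rule the summary describes — keep or discard a tool's result based on a confidence score attached to the tool call — can be sketched as below. The threshold, field names, and fallback behavior are illustrative assumptions, not the paper's actual ATTC formulation.

```python
# Hypothetical sketch of a tool-trust gate for tool-integrated reasoning.
from dataclasses import dataclass

@dataclass
class ToolResult:
    value: str          # e.g. output of a sandboxed code execution
    confidence: float   # score in [0, 1] attached to the tool call

def resolve_answer(model_answer: str, tool: ToolResult,
                   trust_threshold: float = 0.7) -> str:
    """Prefer the tool's output when its confidence clears the threshold;
    otherwise fall back to the model's own chain-of-thought answer."""
    if tool.confidence >= trust_threshold:
        return tool.value
    return model_answer

# A high-confidence code execution overrides the model's (possibly wrong)
# answer, mitigating the "Tool Ignored" failure mode the summary names.
print(resolve_answer("41", ToolResult(value="42", confidence=0.93)))  # 42
```

A fixed threshold is the simplest possible gate; the paper's "adaptive" calibration presumably adjusts this decision per model and task, which a one-line rule cannot capture.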

ATTNPO: Attention-Guided Process Supervision for Efficient Reasoning

arXiv cs.CL · 2026-04-20

ATTNPO introduces an attention-guided process supervision framework that reduces overthinking in large reasoning models by leveraging intrinsic attention signals for step-level credit assignment, achieving improved performance with shorter reasoning lengths across 9 benchmarks.

Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners

arXiv cs.CL · 2026-04-20

This paper investigates multilingual latent reasoning in large reasoning models across 11 languages, revealing that while latent reasoning capabilities exist, they are unevenly distributed—stronger in resource-rich languages and weaker in low-resource ones. The study finds that despite surface-level differences, the internal reasoning mechanisms are largely aligned with an English-centered pathway.

Think Multilingual, Not Harder: A Data-Efficient Framework for Teaching Reasoning Models to Code-Switch

arXiv cs.CL · 2026-04-20

This paper introduces a data-efficient fine-tuning framework for teaching reasoning models to code-switch (mix languages) effectively, demonstrating that strategic code-switching can improve reasoning capabilities for lower-resource languages. The work analyzes code-switching behaviors in large language models across diverse languages, tasks, and domains, then develops interventions to promote beneficial code-switching patterns.

How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

Hugging Face Daily Papers · 2026-03-23

This paper introduces TESSY, a teacher-student cooperative framework for fine-tuning reasoning models that generates on-policy SFT data by decoupling generation into capability tokens (from teacher) and style tokens (from student), addressing catastrophic forgetting issues when using off-policy teacher data.

Reasoning models struggle to control their chains of thought, and that’s good

OpenAI Blog · 2026-03-05

OpenAI researchers study whether reasoning models can deliberately obscure their chain-of-thought to evade monitoring, finding that current models struggle to control their reasoning even when aware of monitoring. They introduce CoT-Control, an open-source evaluation suite with over 13,000 tasks to measure chain-of-thought controllability in reasoning models.

Evaluating chain-of-thought monitorability

OpenAI Blog · 2025-12-18

OpenAI researchers introduce a framework and suite of 13 evaluations to systematically measure chain-of-thought monitorability in large language models, finding that monitoring reasoning processes is substantially more effective than monitoring outputs alone, with important implications for AI safety and supervision at scale.

Introducing gpt-oss-safeguard

OpenAI Blog · 2025-10-29

OpenAI releases gpt-oss-safeguard, open-weight reasoning models for safety classification tasks available in 120B and 20B sizes under Apache 2.0 license. The models use chain-of-thought reasoning to classify content according to developer-provided policies at inference time, enabling flexible and explainable content moderation.

gpt-oss-safeguard technical report

OpenAI Blog · 2025-10-29

OpenAI releases gpt-oss-safeguard-120b and gpt-oss-safeguard-20b, open-weight reasoning models designed for policy-based content classification with full chain-of-thought reasoning. The technical report provides baseline safety evaluations and demonstrates the models' capabilities for content labeling tasks under the Apache 2.0 license.

OpenAI o3 and o4-mini System Card

OpenAI Blog · 2025-04-16

OpenAI released system cards for o3 and o4-mini models, which feature advanced reasoning capabilities combined with tool integration (web browsing, Python, image analysis, etc.) and are evaluated under OpenAI's Preparedness Framework v2 for safety in biological, cybersecurity, and AI self-improvement domains.

Accelerating engineering cycles 20% with OpenAI

OpenAI Blog · 2025-03-06

Factory launches a Command Center for software development leveraging OpenAI's o1 and o3-mini reasoning models alongside GPT-4o to accelerate engineering cycles by 20-400%, reduce context switching by 60%, and give developers 10+ additional hours per week through AI-powered code understanding and reasoning across the development lifecycle.

OpenAI o3-mini System Card

OpenAI Blog · 2025-01-31

OpenAI releases the o3-mini System Card, documenting safety evaluations and risk assessments for their advanced reasoning model trained with reinforcement learning. The model achieves state-of-the-art safety performance on certain benchmarks and is classified as Medium risk overall under OpenAI's Preparedness Framework.

Trading inference-time compute for adversarial robustness

OpenAI Blog · 2025-01-22

OpenAI presents evidence that reasoning models like o1 become more robust to adversarial attacks when given more inference-time compute to think longer. The research demonstrates that increased computation reduces attack success rates across multiple task types including mathematics, factuality, and adversarial images, though significant exceptions remain.
