The Scaling Properties of Implicit Deductive Reasoning in Transformers
Summary
This research examines how deep Transformers with bidirectional masking achieve implicit deductive reasoning comparable to explicit chain-of-thought methods. The study demonstrates that algorithmically aligned models can scale reasoning capabilities across diverse graph topologies and problem widths.
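As a rough illustration of the masking scheme mentioned above, the sketch below builds a prefix-style attention mask in plain Python: positions inside the prefix attend to each other in both directions, while later positions attend causally. The function name `prefix_mask` and the boolean list-of-lists layout are assumptions made for this example. The page does not describe the paper's actual implementation.

```python
def prefix_mask(seq_len: int, prefix_len: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j.

    Hypothetical helper for illustration only:
    - every position may attend to the whole prefix (bidirectional inside the prefix),
    - positions after the prefix additionally attend causally (j <= i).
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must lie between 0 and seq_len")
    return [
        [(j < prefix_len) or (j <= i) for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the mask as rows of 1s and 0s for a 5-token sequence with a 3-token prefix.
    for row in prefix_mask(seq_len=5, prefix_len=3):
        print("".join("1" if allowed else "0" for allowed in row))
```

With `prefix_len=0` this reduces to an ordinary causal mask, and with `prefix_len=seq_len` it becomes fully bidirectional, which is one way to see the prefix mask as interpolating between the two.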
Source: https://huggingface.co/papers/2605.04330
Published on May 5 · Submitted by Enrico (https://huggingface.co/envomp) on May 8
Abstract
Deep Transformers with bidirectional masking exhibit implicit deductive reasoning capabilities comparable to explicit chain-of-thought methods across various graph structures and problem sizes.
We investigate the scaling properties of implicit deductive reasoning over Horn clauses in depth-bounded Transformers. By systematically decorrelating provability from spurious features and enforcing algorithmic alignment, we find that in sufficiently deep models with a bidirectional prefix mask, implicit reasoning approaches explicit CoT performance across graph topologies and problem widths, though CoT remains necessary for depth extrapolation.
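To make the task concrete, here is a minimal sketch of deciding provability over propositional Horn clauses by forward chaining, with a cap on the number of synchronous derivation rounds. The function name `provable_within`, the clause encoding, and the loose analogy between one round and one unit of model depth are assumptions for illustration. They are not taken from the paper.

```python
# A definite Horn clause is encoded as (body, head): if every atom in `body`
# is known, then `head` can be derived. This encoding is an assumption.
Clause = tuple[frozenset[str], str]


def provable_within(
    facts: set[str], rules: list[Clause], goal: str, max_rounds: int
) -> bool:
    """Return True if `goal` is derivable from `facts` in at most `max_rounds` rounds.

    Each round fires, in parallel, every rule whose body is already known.
    """
    known = set(facts)
    for _ in range(max_rounds):
        if goal in known:
            return True
        # Collect all heads that become derivable in this round.
        new = {head for body, head in rules if body <= known and head not in known}
        if not new:
            break  # Fixed point reached: nothing more can be derived.
        known |= new
    return goal in known


if __name__ == "__main__":
    rules: list[Clause] = [
        (frozenset({"a"}), "b"),
        (frozenset({"b"}), "c"),
        (frozenset({"a", "c"}), "d"),
    ]
    # "d" needs three rounds: a -> b, b -> c, then (a, c) -> d.
    print(provable_within({"a"}, rules, "d", max_rounds=3))
    print(provable_within({"a"}, rules, "d", max_rounds=2))
```

In this toy setting, a goal whose shortest proof needs more rounds than the budget allows is reported as unprovable. That gives an informal picture of why a fixed-depth procedure can struggle on deeper problems, the regime where the abstract says CoT remains necessary.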
Similar Articles
@machinestein: ICML 2026: Latent Reasoning in TRMs is Secretly a Policy Improvement Operator Why does recursive reasoning, especially …
The paper reveals that latent reasoning in transformer-based reasoning models (TRMs) functions as a policy improvement operator, and proposes an algorithm that enhances learning and inference efficiency by up to 18x.
The Spectral Geometry of Thought: Phase Transitions, Instruction Reversal, Token-Level Dynamics, and Perfect Correctness Prediction in How Transformers Reason
A comprehensive spectral analysis across 11 LLMs revealing that transformers exhibit phase transitions in hidden activation spaces during reasoning versus factual recall, with seven fundamental phenomena including spectral compression, instruction-tuning reversal, and perfect correctness prediction (AUC=1.0) based solely on spectral properties.
Long-Context Reasoning Through Proxy-Based Chain-of-Thought Tuning
Proposes ProxyCoT, a training framework that improves long-context reasoning in large language models by first obtaining chain-of-thought reasoning traces on short proxy contexts (via reinforcement learning or distillation) and then grounding them in full long contexts through supervised fine-tuning. Experiments show consistent improvements over baselines with reduced computational cost.
Transformers Linearly Represent Highly Structured World Models
This paper demonstrates that transformers trained on Sudoku solving traces build structured world models organized by domain constraints, and identifies a sparse, monosemantic circuit responsible for the naked-single decision rule. The work provides a fully interpretable algorithmic account of transformer reasoning on a combinatorial task.
Deep Reasoning in General Purpose Agents via Structured Meta-Cognition
This paper introduces Deep Reasoning, an inference-time approach that uses structured meta-reasoning to construct task-specific scaffolds for general-purpose agents. The proposed agent, Dolores, outperforms existing methods by distributing cognition across lower-load reasoning threads, reducing hallucinations and improving performance across multiple benchmarks.