The Scaling Properties of Implicit Deductive Reasoning in Transformers

Hugging Face Daily Papers 05/05/26, 12:00 AM Papers

Summary

This paper examines when deep Transformers with a bidirectional prefix mask can perform implicit deductive reasoning over Horn clauses at a level approaching explicit chain-of-thought (CoT). It reports that algorithmically aligned models scale across graph topologies and problem widths, while CoT remains necessary for extrapolating to greater reasoning depths.

We investigate the scaling properties of implicit deductive reasoning over Horn clauses in depth-bounded Transformers. By systematically decorrelating provability from spurious features and enforcing algorithmic alignment, we find that in sufficiently deep models with a bidirectional prefix mask, implicit reasoning approaches explicit CoT performance across graph topologies and problem widths, though CoT remains necessary for depth extrapolation.
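For readers unfamiliar with the task, deduction over Horn clauses can be illustrated with plain forward chaining. The sketch below is not the paper's code or data format; the clause encoding, the function name `forward_chain`, and the toy rule set are assumptions chosen for illustration. It records the round in which each atom is first derived, which is one common way to think about the "depth" of a proof.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical encoding: a definite Horn clause is (body atoms, head atom).
# A fact is a clause with an empty body.
Clause = Tuple[FrozenSet[str], str]


def forward_chain(clauses: Iterable[Clause]) -> Dict[str, int]:
    """Map each provable atom to the round in which it is first derived."""
    rules: List[Clause] = list(clauses)
    depth: Dict[str, int] = {}
    round_no = 0
    while True:
        round_no += 1
        # Fire every rule whose body was fully derived in earlier rounds.
        new = {
            head
            for body, head in rules
            if head not in depth and all(atom in depth for atom in body)
        }
        if not new:
            return depth  # fixed point: nothing further is provable
        for head in new:
            depth[head] = round_no


# Toy rule set (illustrative only): a.  a -> b.  b -> c.  a, c -> d.  e -> f.
example: List[Clause] = [
    (frozenset(), "a"),
    (frozenset({"a"}), "b"),
    (frozenset({"b"}), "c"),
    (frozenset({"a", "c"}), "d"),
    (frozenset({"e"}), "f"),
]
depth = forward_chain(example)
print(depth)
print("f provable:", "f" in depth)
```

In this toy example, `d` needs four rounds, while `f` is never derived because `e` is not provable. An implicit reasoner has to carry out the equivalent of these rounds inside a single forward pass, whereas a CoT model can write intermediate atoms out as tokens.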

Original Article


Cached at: 05/08/26, 02:27 PM


Source: https://huggingface.co/papers/2605.04330 Published on May 5

Submitted by Enrico (https://huggingface.co/envomp) on May 8

Abstract

Deep Transformers with bidirectional masking exhibit implicit deductive reasoning capabilities comparable to explicit chain-of-thought methods across various graph structures and problem sizes.

We investigate the scaling properties of implicit deductive reasoning over Horn clauses in depth-bounded Transformers. By systematically decorrelating provability from spurious features and enforcing algorithmic alignment, we find that in sufficiently deep models with a bidirectional prefix mask, implicit reasoning approaches explicit CoT performance across graph topologies and problem widths, though CoT remains necessary for depth extrapolation.
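The abstract does not spell out how the bidirectional prefix mask is built. A minimal sketch of the standard prefix-LM attention mask is given below, on the assumption that the paper's mask follows this common pattern; the function name `prefix_mask` and its parameters are hypothetical. Positions inside the prefix attend to the whole prefix in both directions, and positions after it attend causally.

```python
from typing import List


def prefix_mask(seq_len: int, prefix_len: int) -> List[List[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumed convention (standard prefix-LM, not confirmed by the source):
    the first `prefix_len` positions see each other bidirectionally, and
    every later position sees the prefix plus all positions up to itself.
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must lie in [0, seq_len]")
    return [
        [j < prefix_len or j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Example: a 4-token sequence whose first 2 tokens form the prefix.
mask = prefix_mask(4, 2)
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

With `prefix_len=0` the same function reduces to an ordinary causal mask, which makes the two settings easy to compare.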


Get this paper in your agent:

hf papers read 2605.04330

Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Similar Articles

@machinestein: ICML 2026: Latent Reasoning in TRMs is Secretly a Policy Improvement Operator Why does recursive reasoning, especially …

The Spectral Geometry of Thought: Phase Transitions, Instruction Reversal, Token-Level Dynamics, and Perfect Correctness Prediction in How Transformers Reason

Long-Context Reasoning Through Proxy-Based Chain-of-Thought Tuning

Transformers Linearly Represent Highly Structured World Models

Deep Reasoning in General Purpose Agents via Structured Meta-Cognition
