Geometric Latent Reasoning Induces Shorter Generations in LLMs

Hugging Face Daily Papers

Summary

Geometric Latent Reasoning (GLR) introduces a geometric path-approximation method for latent reasoning in LLMs, enabling shorter generations while maintaining accuracy across mathematical reasoning benchmarks.

Large language models solve complex problems by generating lengthy chains of explicit reasoning tokens. While effective, this makes reasoning expensive, length-sensitive, and constrained to (discrete) natural language. While latent reasoning offers a continuous alternative, determining useful structures for intermediate latent states is an open challenge. In this paper, we formulate latent reasoning as a geometric path-approximation problem within the model's pretrained token-embedding space. We introduce Geometric Latent Reasoning (GLR), which uses a lightweight transition head to predict iterative direction updates in embedding space. Using textual chain-of-thought traces as anchors, GLR learns to approximate discrete reasoning trajectories while permitting continuous deviations from exact token embeddings. Evaluations on mathematical reasoning benchmarks using Qwen3 models reveal an emergent phenomenon: geometric latent reasoning induces substantially shorter generations without an explicit length objective. By replacing early explicit reasoning with continuous latent steps, models often reach correct answers using substantially fewer total generation steps. These findings suggest that continuous trajectories act as compact intermediate reasoning states, exposing a new tradeoff between latent computation budget, output length, and accuracy.

Cached at: 06/02/26, 03:36 PM


Source: https://huggingface.co/papers/2606.02248

Abstract

Geometric Latent Reasoning formulates latent reasoning as a geometric path-approximation problem in pretrained token-embedding space, enabling continuous intermediate reasoning states that reduce generation length while maintaining accuracy.

Large language models solve complex problems by generating lengthy chains of explicit reasoning tokens. While effective, this makes reasoning expensive, length-sensitive, and constrained to (discrete) natural language. While latent reasoning offers a continuous alternative, determining useful structures for intermediate latent states is an open challenge. In this paper, we formulate latent reasoning as a geometric path-approximation problem within the model’s pretrained token-embedding space. We introduce Geometric Latent Reasoning (GLR), which uses a lightweight transition head to predict iterative direction updates in embedding space. Using textual chain-of-thought traces as anchors, GLR learns to approximate discrete reasoning trajectories while permitting continuous deviations from exact token embeddings. Evaluations on mathematical reasoning benchmarks using Qwen3 models reveal an emergent phenomenon: geometric latent reasoning induces substantially shorter generations without an explicit length objective. By replacing early explicit reasoning with continuous latent steps, models often reach correct answers using substantially fewer total generation steps. These findings suggest that continuous trajectories act as compact intermediate reasoning states, exposing a new tradeoff between latent computation budget, output length, and accuracy.
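To make the idea of "iterative direction updates in embedding space" concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the abstract does not specify the transition head's architecture, the update rule, or the training loss. The two-layer head, the unit-norm direction, the fixed `step_size`, the squared-distance `anchor_loss`, the random embedding table, and the token ids are all assumptions chosen only for illustration.

```python
import numpy as np

def init_head(dim, hidden, rng):
    """Random weights for a hypothetical two-layer transition head."""
    return {
        "w1": rng.normal(0.0, 0.1, size=(dim, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, size=(hidden, dim)),
        "b2": np.zeros(dim),
    }

def predict_direction(head, state):
    """Map the current latent state to a unit-norm direction in embedding space."""
    hidden = np.tanh(state @ head["w1"] + head["b1"])
    raw = hidden @ head["w2"] + head["b2"]
    return raw / (np.linalg.norm(raw) + 1e-8)

def rollout(head, start, num_steps, step_size):
    """Assumed update rule: z_{t+1} = z_t + step_size * d_t."""
    states = [start]
    for _ in range(num_steps):
        direction = predict_direction(head, states[-1])
        states.append(states[-1] + step_size * direction)
    return np.stack(states)

def anchor_loss(path, anchors):
    """Assumed objective: mean squared distance from each step to its anchor."""
    return float(np.mean(np.sum((path[1:] - anchors) ** 2, axis=-1)))

rng = np.random.default_rng(0)
dim, hidden, num_steps = 16, 32, 4

# Stand-ins (assumptions): a random table instead of pretrained token
# embeddings, and arbitrary ids instead of a real chain-of-thought trace.
embedding_table = rng.normal(size=(100, dim))
cot_token_ids = np.array([5, 17, 42, 8])
anchors = embedding_table[cot_token_ids]

head = init_head(dim, hidden, rng)
path = rollout(head, start=embedding_table[3], num_steps=num_steps, step_size=0.5)
loss = anchor_loss(path, anchors)
print(path.shape, loss)
```

In this sketch the latent states are free to sit between token embeddings, which mirrors the abstract's "continuous deviations from exact token embeddings"; training would adjust the head so the path passes near the anchors, but no optimizer is shown here.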

