Latent Reasoning with Normalizing Flows

Hugging Face Daily Papers 06/04/26, 12:00 AM Papers

Summary

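Proposes NF-CoT, a latent reasoning framework that models continuous thoughts in LLMs with normalizing flows. It preserves the advantages of autoregressive generation and, on code-generation benchmarks, improves pass rates over explicit-CoT and prior latent-reasoning baselines while reducing intermediate-reasoning cost.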

Large language models often improve reasoning by generating explicit chain-of-thought (CoT), demonstrating the importance of intermediate computation. However, textual CoT forces this computation through a discrete, serial, and communication-oriented token stream: each reasoning step must be verbalized before the model can proceed, even when the underlying update is semantic, uncertain, or only partially formed. Latent reasoning offers a higher-bandwidth alternative by performing intermediate computation in compact continuous states before committing to text. Yet existing latent-reasoning methods often sacrifice key advantages that make CoT effective in autoregressive language models, including native left-to-right generation, probabilistic sampling, compatibility with KV-cache decoding, and tractable likelihood estimation. We propose NF-CoT, a latent reasoning framework that preserves these advantages by modeling continuous thoughts with normalizing flows. NF-CoT instantiates a TARFlow-style normalizing flow inside the LLM backbone, defining a tractable probability model over compact continuous thoughts distilled from explicit CoT. Continuous-thought positions are generated by an NF head, while text positions are generated by the standard LM head within the same causal stream. This design provides exact likelihoods for latent thoughts, enables probabilistic left-to-right decoding with the original KV cache, and supports direct policy-gradient optimization in the latent reasoning space. On code-generation benchmarks, NF-CoT improves pass rates over explicit-CoT and prior latent-reasoning baselines while substantially reducing intermediate-reasoning cost.

Original Article

Cached at: 06/05/26, 06:07 AM

Paper page - Latent Reasoning with Normalizing Flows

Source: https://huggingface.co/papers/2606.06447

Abstract

Latent reasoning framework using normalizing flows preserves autoregressive generation advantages while enabling efficient, probabilistic intermediate computation in large language models.

Large language models often improve reasoning by generating explicit chain-of-thought (CoT), demonstrating the importance of intermediate computation. However, textual CoT forces this computation through a discrete, serial, and communication-oriented token stream: each reasoning step must be verbalized before the model can proceed, even when the underlying update is semantic, uncertain, or only partially formed. Latent reasoning offers a higher-bandwidth alternative by performing intermediate computation in compact continuous states before committing to text. Yet existing latent-reasoning methods often sacrifice key advantages that make CoT effective in autoregressive language models, including native left-to-right generation, probabilistic sampling, compatibility with KV-cache decoding, and tractable likelihood estimation. We propose NF-CoT, a latent reasoning framework that preserves these advantages by modeling continuous thoughts with normalizing flows. NF-CoT instantiates a TARFlow-style normalizing flow inside the LLM backbone, defining a tractable probability model over compact continuous thoughts distilled from explicit CoT. Continuous-thought positions are generated by an NF head, while text positions are generated by the standard LM head within the same causal stream. This design provides exact likelihoods for latent thoughts, enables probabilistic left-to-right decoding with the original KV cache, and supports direct policy-gradient optimization in the latent reasoning space. On code-generation benchmarks, NF-CoT improves pass rates over explicit-CoT and prior latent-reasoning baselines while substantially reducing intermediate-reasoning cost.
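To make the "exact likelihoods" and "probabilistic left-to-right decoding" claims concrete, here is a minimal NumPy sketch of a causal affine normalizing flow over a short sequence of continuous thought vectors. It is an illustration under stated assumptions, not the paper's implementation: the `conditioner` function, the weight matrices `W_mu` and `W_alpha`, and the mean-of-prefix context are all hypothetical stand-ins for the LLM backbone and NF head. What it shows is the general mechanism: each thought is an invertible affine transform of Gaussian noise conditioned only on earlier thoughts, so the log-likelihood follows from the change-of-variables formula and sampling proceeds one position at a time.

```python
import numpy as np

def conditioner(prefix, W_mu, W_alpha):
    """Hypothetical causal conditioner: maps earlier thoughts x_<t to (mu_t, alpha_t).

    In a real model this role would be played by the causal backbone;
    here it is just a linear map of the mean of the prefix.
    """
    D = W_mu.shape[0]
    ctx = prefix.mean(axis=0) if len(prefix) > 0 else np.zeros(D)
    mu = ctx @ W_mu
    alpha = np.tanh(ctx @ W_alpha)  # bounded log-scale keeps exp() stable
    return mu, alpha

def log_likelihood(x, W_mu, W_alpha):
    """Exact log p(x) for thoughts x of shape (T, D) via change of variables."""
    T, D = x.shape
    z = np.empty_like(x)
    total = 0.0
    for t in range(T):
        mu, alpha = conditioner(x[:t], W_mu, W_alpha)
        z[t] = (x[t] - mu) * np.exp(-alpha)  # inverse affine map
        log_base = -0.5 * (z[t] @ z[t] + D * np.log(2.0 * np.pi))
        total += log_base - alpha.sum()  # minus log|det| of the forward map
    return total, z

def sample(T, W_mu, W_alpha, rng):
    """Left-to-right sampling: each thought depends only on earlier ones."""
    D = W_mu.shape[0]
    z = rng.standard_normal((T, D))
    x = np.zeros((T, D))
    for t in range(T):
        mu, alpha = conditioner(x[:t], W_mu, W_alpha)
        x[t] = mu + np.exp(alpha) * z[t]  # forward affine map
    return x, z

# Toy usage: sample 5 three-dimensional thoughts, then score them exactly.
rng = np.random.default_rng(0)
D, T = 3, 5
W_mu = 0.1 * rng.standard_normal((D, D))
W_alpha = 0.1 * rng.standard_normal((D, D))
x, z = sample(T, W_mu, W_alpha, rng)
ll, z_rec = log_likelihood(x, W_mu, W_alpha)
print("log-likelihood of sampled thoughts:", ll)
print("noise recovered by inversion:", np.allclose(z, z_rec))
```

Because the log-probability of a sampled latent trajectory is available in closed form, a score-function (policy-gradient) estimator can in principle use it directly, which is the property the abstract points to when it mentions optimization in the latent reasoning space.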

Get this paper in your agent:

hf papers read 2606.06447

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

Why Limit the Residual Stream to Layers and Not Tokens? Persistent Memory for Continuous Latent Reasoning

ReasoningFlow: Discourse Structures for Understanding LLM Reasoning Traces

Adaptive Latent Agentic Reasoning

Tools as Continuous Flow for Evolving Agentic Reasoning

NoisyCoconut: Counterfactual Consensus via Latent Space Reasoning
