latent-reasoning

#latent-reasoning

Why Limit the Residual Stream to Layers and Not Tokens? Persistent Memory for Continuous Latent Reasoning

arXiv cs.AI · 21h ago

This paper identifies a 'concept bottleneck' in the CoCoNuT latent reasoning paradigm where hidden states are overwritten across passes, and proposes AGCLR, which adds a gated persistent memory stream to retain intermediate facts. Evaluations on GSM8K, HotpotQA, and ProsQA using GPT-2 show consistent improvements, especially on multi-hop tasks.

The strange thing about LLM reasoning research: we're now trying to remove the chain-of-thought traces

Reddit r/artificial · 4d ago

The article discusses a shift in LLM reasoning research from making reasoning explicit via chain-of-thought to exploring latent reasoning that doesn't require language traces, questioning whether visibility is necessary for effective reasoning.

MIRAGE: Mobile Agents with Implicit Reasoning and Generative World Models

arXiv cs.AI · 5d ago

MIRAGE is a framework for mobile GUI agents that replaces verbose chain-of-thought reasoning with compact continuous latent representations, incorporating a generative world model perspective to predict future screen states before acting. On AndroidWorld and AndroidControl benchmarks, it achieves competitive or superior performance while reducing generated tokens by over 75%.

Latent Reasoning with Normalizing Flows

Hugging Face Daily Papers · 6d ago

This paper proposes NF-CoT, a latent reasoning framework that uses normalizing flows to model continuous thoughts in LLMs. It preserves the advantages of autoregressive generation and achieves better code generation performance at lower cost.

Adaptive Latent Agentic Reasoning

arXiv cs.CL · 6d ago

This paper introduces Adaptive Latent Agentic Reasoning (ALAR), a dual-mode framework for LLM agents that uses compact latent reasoning for routine turns and selectively escalates to explicit chain-of-thought for harder decisions, achieving up to 84.6% token reduction while maintaining task accuracy.

LaSR: Context-Aware Speech Recognition via Latent Reasoning

arXiv cs.CL · 2026-06-02

LaSR proposes a latent reasoning training paradigm for context-aware speech recognition, aligning chain-of-thought supervision around acoustic features to improve terminology recognition without added latency, outperforming standard fine-tuning on Fun-Audio-Chat.

Geometric Latent Reasoning Induces Shorter Generations in LLMs

Hugging Face Daily Papers · 2026-06-01

Geometric Latent Reasoning (GLR) introduces a geometric path-approximation method for latent reasoning in LLMs, enabling shorter generations while maintaining accuracy across mathematical reasoning benchmarks.

Semantic Step Prediction: Multi-Step Latent Forecasting in LLM Reasoning Trajectories via Step Sampling

Reddit r/LocalLLaMA · 2026-05-31

This paper introduces Semantic Step Prediction, which applies geometric regularization at reasoning step boundaries rather than random token positions, achieving 168× better multi-step latent forecasting on ProcessBench compared to frozen baselines.

Robust and Efficient Guardrails with Latent Reasoning

arXiv cs.AI · 2026-05-29

CoLaGuard is a new guardrail model that transfers multi-step safety reasoning into a continuous latent space, achieving a 12.9× speedup and a 22.4× token reduction compared to explicit reasoning baselines while matching their macro-F1 performance on ten safety benchmarks.

Miller-Index-Based Latent Crystallographic Fracture Plane Reasoning with Vision-Language Models

arXiv cs.LG · 2026-05-21

This paper investigates whether multimodal large language models (MLLMs) can leverage Miller indices as a latent representation to reason about crystallographic fracture geometry from visual inputs, evaluating their ability to infer physically valid plane hypotheses and determine when such representation is applicable across materials like ceramics, glass, metals, and concrete.

TTE-Flash: Accelerating Reasoning-based Multimodal Representations via Think-Then-Embed Tokens

arXiv cs.AI · 2026-05-19

The paper introduces TTE-Flash, a method that replaces explicit chain-of-thought reasoning with latent think tokens to generate reasoning-aware multimodal representations at constant inference cost, outperforming explicit CoT baselines on the MMEB-v2 benchmark.

Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning

arXiv cs.AI · 2026-05-18

LaMR introduces a structured pruning framework for coding agents that decomposes code relevance into semantic evidence and dependency support dimensions, using dedicated CRFs and a mixture-of-experts gate to reduce token usage by up to 31% while maintaining or improving task performance.

When Does a Language Model Commit? A Finite-Answer Theory of Pre-Verbalization Commitment

arXiv cs.AI · 2026-05-11

This research paper proposes a finite-answer theory to analyze when language models commit to an answer before verbalizing it. Using Qwen3-4B-Instruct, the authors demonstrate that answer preference stabilizes significantly before the final output is generated, offering insights into latent reasoning and model internal states.

LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG

arXiv cs.CL · 2026-05-08

LatentRAG is a novel framework that shifts reasoning and retrieval for agentic RAG into continuous latent space, reducing inference latency by approximately 90% while maintaining performance comparable to explicit methods.

Large Reasoning Models Are (Not Yet) Multilingual Latent Reasoners

arXiv cs.CL · 2026-04-20

This paper investigates multilingual latent reasoning in large reasoning models across 11 languages, revealing that while latent reasoning capabilities exist, they are unevenly distributed—stronger in resource-rich languages and weaker in low-resource ones. The study finds that despite surface-level differences, the internal reasoning mechanisms are largely aligned with an English-centered pathway.

OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Hugging Face Daily Papers · 2026-04-20

OneVL is a unified vision-language-action framework that compresses chain-of-thought reasoning into latent tokens supervised by both language and visual world model decoders, achieving state-of-the-art trajectory prediction accuracy for autonomous driving at answer-only inference latency. It is the first latent CoT method to surpass explicit CoT across four benchmarks.
