multi-step-reasoning

#multi-step-reasoning

Hierarchical Denoising For Multi-Step Visual Reasoning

Hugging Face Daily Papers · 4d ago

HDR is a unified framework integrating hierarchical latents into causal video generation for multi-step visual reasoning, achieving better reasoning consistency, lower latency, and strong data efficiency compared to baselines.

@ando_w: https://x.com/ando_w/status/2075468963098546520

X AI KOLs Timeline · 2026-07-10

This article explains how to upgrade single-turn RAG to Agentic RAG by letting the LLM decide autonomously when to issue multiple retrievals and tool calls, so that it can carry out multi-step reasoning on complex problems. It provides code examples and implementation ideas based on Qwen3.7-Max.

When Search Agents Should Ask: DiscoBench for Clarification-Aware Deep Search

arXiv cs.CL · 2026-06-29

DiscoBench is a new benchmark that evaluates whether LLM-powered search agents can proactively identify ambiguity in user queries, ask clarifying questions, and recover correct reasoning paths through multi-turn interaction.

@rohanpaul_ai: The model ("Owl Alpha") is designed for agentic workloads: - tool calling - multi-step reasoning - long-context executi…

X AI KOLs Following · 2026-06-28

Owl Alpha is a new model designed for agentic workloads including tool calling, multi-step reasoning, long-context execution, code generation, automated workflows, and DevOps tasks.

Investigating LLM's Problem Solving Capability -- a Study on Statics Questions

arXiv cs.CL · 2026-06-26

This paper evaluates LLM performance on statics problems, finding that while text-only questions are handled well, accuracy drops with diagrams and multi-step reasoning, suggesting difficulties in applying visual information consistently.

@Ankur_Samanta_: New work on credit assignment in multi-step reasoning RL post-training Introducing Self-Reset Policy Optimization (SRPO…

X AI KOLs Timeline · 2026-06-22

Self-Reset Policy Optimization (SRPO) addresses credit assignment in multi-step reasoning RL post-training by localizing the first wrong reasoning step and learning from counterfactual continuations without external supervision.

Learning to Refine Hidden States for Reliable LLM Reasoning

arXiv cs.LG · 2026-06-17

This paper proposes ReLAR, a reinforcement-guided latent refinement framework that iteratively updates hidden representations in LLMs before decoding, improving reasoning reliability and efficiency compared to chain-of-thought methods.

Which Models Perform Better in Inheritance Reasoning?

arXiv cs.CL · 2026-06-15

This paper describes team PSL's entry in the QIAS 2026 Shared Task on Arabic Islamic inheritance reasoning, comparing commercial and open-source large language models. Results show that commercial models (e.g., Gemini 2.5 Flash) significantly outperform open-source models in structured legal reasoning with multi-step dependencies.

Stepwise Reasoning Enhancement for LLMs via External Subgraph Generation

arXiv cs.CL · 2026-06-04

This paper proposes SGR, a framework that enhances LLM stepwise reasoning by integrating external knowledge graphs through query-relevant subgraph generation, combining Cypher-based reasoning with collaborative reasoning integration. Experiments on CWQ, WebQSP, GrailQA, and KQA Pro show improved reasoning accuracy over standard prompting and knowledge-enhanced baselines.

Cascading Hallucination in Agentic RAG: The CHARM Framework for Detection and Mitigation

arXiv cs.AI · 2026-06-04

This paper introduces CHARM, a framework for detecting and mitigating cascading hallucinations in multi-step agentic RAG pipelines, where early-stage errors propagate and amplify across reasoning steps. CHARM achieves an 89.4% cascade detection rate and 82.1% error propagation reduction across multiple benchmarks with low latency overhead.

Online Skill Learning for Web Agents via State-Grounded Dynamic Retrieval

arXiv cs.AI · 2026-06-04

This paper proposes SGDR (State-Grounded Dynamic Retrieval), an online skill learning method for web agents that enables stepwise, state-aware skill reuse rather than static task-level retrieval. Experiments on WebArena show that SGDR achieves a 37.5% success rate with GPT-4.1, a ~10.6% relative gain over strong baselines.

LLMs are not the black box you were promised

Hacker News Top · 2026-06-02

This article summarizes Anthropic's 2025 paper on mechanistic interpretability, showing that LLMs are not black boxes and that circuit tracing can reveal multi-step reasoning and human-identifiable concepts.

HyperGuide: Hyperbolic Guidance for Efficient Multi-Step Reasoning in Large Language Models

arXiv cs.AI · 2026-05-26

This paper proposes HyperGuide, a method that distills reasoning progress into a hyperbolic geometric signal to guide step-by-step generation in LLMs, improving multi-step reasoning efficiency without explicit tree search.

Reinforcement Learning for Tool-Calling Agents in Fast Healthcare Interoperability Resources (FHIR)

arXiv cs.LG · 2026-05-15

This paper presents a reinforcement learning post-training pipeline for tool-calling LLM agents operating on FHIR healthcare data, achieving 77% answer correctness on FHIR-AgentBench with the smaller Qwen3-8B model, compared with 50% for o4-mini.

Introducing GPT-5.5 with Box

YouTube AI Channels · 2026-05-08

GPT-5.5 brings a 19-percentage-point improvement in multi-step reasoning and financial modeling, significantly reducing the burden of knowledge work, a result the Box team is excited about.
