interactive-reasoning

#interactive-reasoning

Evaluating Interactive Reasoning in Large Language Models: A Hierarchical Benchmark with Executable Games

arXiv cs.AI · yesterday

This paper introduces a multi-turn interactive framework for evaluating reasoning, in which an LLM must query a hidden environment and integrate partial observations. The framework is instantiated as a benchmark of 474 executable games across five difficulty levels; the authors report that it has discriminative power and exposes differences in how models reason.

#interactive-reasoning

HypoAgent: An Agentic Framework for Interactive Abductive Hypothesis Generation over Knowledge Graphs

arXiv cs.AI · 2d ago

HypoAgent is an agentic framework for interactive abductive hypothesis generation over knowledge graphs. It integrates three agents to handle evolving user intents and fine-grained diagnosis, and the authors report state-of-the-art performance.
