llm-planning

UP-NRPA: User Portrait based Nested Rollout Policy Adaptation for Planning with Large Language Models in Goal-oriented Dialogue Systems

arXiv cs.CL · 5d ago

This paper proposes UP-NRPA, an online framework that combines user portraits with nested rollout policy adaptation (NRPA) and large language models to customize dialogue strategies dynamically, without offline training. It reports a 100% success rate on multiple dialogue tasks.

SIMMER: Benchmarking Latent Failures in LLM Executable Planning with a World Model

arXiv cs.CL · 5d ago

Introduces SIMMER, a benchmark for evaluating latent failures in LLM-generated executable plans against a human-curated symbolic world model of the kitchen domain. In the reported experiments, frontier LLMs produce error-free plans at most 17% of the time, up to 56% of plans contain latent failures, and counterfactual foresight simulation significantly reduces failures.
