This paper introduces BeliefMem, a memory paradigm for LLM agents that stores multiple candidate conclusions, each with an associated probability, to handle partial observability and reduce self-reinforcing errors. Empirical evaluations show that it outperforms deterministic memory baselines on the LoCoMo and ALFWorld benchmarks.
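The core idea of keeping several weighted candidate conclusions rather than a single committed fact can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's actual implementation; the class name, methods, and Bayesian-style update rule are assumptions chosen to make the mechanism concrete.

```python
class BeliefMemory:
    """Hypothetical sketch: a memory slot holding multiple candidate
    conclusions with probabilities, updated as evidence arrives."""

    def __init__(self, candidates):
        # candidates: mapping conclusion -> prior weight (normalized here)
        total = sum(candidates.values())
        self.beliefs = {c: w / total for c, w in candidates.items()}

    def update(self, likelihoods):
        """Bayesian-style update: posterior ∝ prior × likelihood.
        Candidates missing from `likelihoods` keep a neutral factor of 1."""
        posterior = {c: p * likelihoods.get(c, 1.0)
                     for c, p in self.beliefs.items()}
        total = sum(posterior.values())
        self.beliefs = {c: p / total for c, p in posterior.items()}

    def top(self):
        """Most probable conclusion under the current belief state."""
        return max(self.beliefs, key=self.beliefs.get)


# Example: the agent is unsure where the keys are; new evidence
# favors the kitchen without erasing the alternative outright.
mem = BeliefMemory({"keys_in_kitchen": 0.5, "keys_in_office": 0.5})
mem.update({"keys_in_kitchen": 0.9, "keys_in_office": 0.1})
print(mem.top())  # → keys_in_kitchen
```

Because low-probability candidates are down-weighted rather than deleted, later contradictory evidence can revive them, which is the mechanism by which such a design could avoid self-reinforcing errors.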
This paper introduces Neural Co-state Policies, establishing a formal link between the hidden states of recurrent reinforcement-learning policies and Pontryagin's minimum principle to improve interpretability and robustness.
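For context, Pontryagin's minimum principle characterizes optimal controls through a co-state (adjoint) variable. The following is the standard textbook statement of those conditions, independent of the paper's specific construction, for a system $\dot{x} = f(x, u, t)$ with running cost $L(x, u, t)$:

```latex
% Hamiltonian built from running cost and dynamics
H(x, u, \lambda, t) = L(x, u, t) + \lambda^{\top} f(x, u, t)

% State and co-state dynamics along an optimal trajectory
\dot{x}^{*} = \frac{\partial H}{\partial \lambda}, \qquad
\dot{\lambda}^{*} = -\frac{\partial H}{\partial x}

% Optimal control minimizes the Hamiltonian pointwise in time
u^{*}(t) = \arg\min_{u} \, H\!\left(x^{*}(t), u, \lambda^{*}(t), t\right)
```

A link of the kind the summary describes would presumably interpret a recurrent policy's hidden state as playing the role of $\lambda$, but the precise mapping is the paper's contribution and is not reproduced here.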