process-reward-model

#process-reward-model

From Long News to Accurate Forecast: Importance-Aware Fusion and PRM-Guided Reflection for Time Series Forecasting

arXiv cs.AI · yesterday

This paper introduces a time series forecasting framework that combines importance-aware news compression with process reward model (PRM)-guided retrieval, so that long news articles fit within a fixed context limit. It reports improved prediction accuracy on finance, energy, traffic, and Bitcoin benchmarks.


Learning to Retrieve: Dual-Level Long-Term Memory for Text-to-SQL Agents

arXiv cs.CL · 2d ago

This paper proposes MERIT, a dynamic multi-horizon memory retrieval framework for interactive text-to-SQL agents that uses episode-level and turn-level memory with learned retrieval policies optimized via reinforcement learning and a process reward model for dense rewards. Experiments on BIRD-Interact and Spider2-Snow show that MERIT outperforms static and single-horizon dynamic baselines in success rate while requiring fewer interaction turns.


Process Rewards with Learned Reliability

arXiv cs.CL · 2026-05-18

BetaPRM is a process reward model that predicts both a step-level success probability and the reliability of that prediction using a Beta belief from Monte Carlo continuations, enabling adaptive computation allocation that reduces token usage by up to 33.57% while improving accuracy.
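To make the idea of a Beta belief over step success concrete, here is a minimal sketch of how Monte Carlo continuation outcomes could be summarized as a Beta distribution, with its variance used as a rough reliability signal. This is not BetaPRM's actual method or code: the uniform prior, the variance threshold, and the names `BetaBelief`, `belief_from_rollouts`, and `needs_more_compute` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(alpha, beta) belief over a reasoning step's success probability."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Point estimate of the step-level success probability.
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        # Spread of the belief; smaller variance means a more reliable estimate.
        total = self.alpha + self.beta
        return (self.alpha * self.beta) / (total ** 2 * (total + 1))


def belief_from_rollouts(successes: int, rollouts: int,
                         prior_alpha: float = 1.0,
                         prior_beta: float = 1.0) -> BetaBelief:
    """Update an assumed uniform Beta prior with Monte Carlo continuation outcomes."""
    if not 0 <= successes <= rollouts:
        raise ValueError("successes must lie between 0 and rollouts")
    return BetaBelief(prior_alpha + successes,
                      prior_beta + (rollouts - successes))


def needs_more_compute(belief: BetaBelief, max_variance: float = 0.02) -> bool:
    """Hypothetical allocation rule: spend more compute while the belief is uncertain."""
    return belief.variance > max_variance


if __name__ == "__main__":
    few = belief_from_rollouts(successes=3, rollouts=4)
    many = belief_from_rollouts(successes=30, rollouts=40)
    print(round(few.mean, 3), needs_more_compute(few))
    print(round(many.mean, 3), needs_more_compute(many))
```

In this sketch both beliefs have a similar mean, but the one built from 40 continuations is far narrower, so a rule like `needs_more_compute` would stop spending tokens on it earlier. The paper's model predicts these quantities with a learned network rather than counting rollouts at inference time.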

