Tag
This paper proposes and compares two mathematical formulations for robust microgrid sizing and power scheduling under uncertainty, using a local reduction algorithm that achieves high feasibility rates in Monte Carlo simulations.
Trust Region Q-Adjoint Matching (TRQAM) addresses instability in off-policy reinforcement learning by adaptively controlling path-space KL divergence through projected dual descent, enabling stable fine-tuning of pretrained flow policies. The method consistently outperforms prior methods on 50 OGBench tasks, achieving a 68% success rate in offline RL versus 46% for the strongest baseline.
This paper reformulates language generation as a stochastic optimal control problem to address limitations of autoregressive and diffusion models. It proposes a closed-loop diffusion method in latent control space using Flow Matching, achieving high-fidelity generation and efficient parallel sampling.
This paper introduces Neural Co-state Policies, establishing a formal link between recurrent reinforcement learning hidden states and the Pontryagin minimum principle to improve interpretability and robustness.