A new feature called OpenResearch lets users reproduce papers and experiment with them. It includes a one-click template for training Vector Policy Optimization (VPO) on ToolRL, with the aim of generating more diverse answers and improving test-time search.