Tag
The author shares his experience using RLMs for daily tasks like coding, processing multi-million-token logs, and browser automation, and releases his implementation as an open-source Python package installable via pip.
HALO is an open-source desktop app that uses recursive language model (RLM) techniques to debug and optimize AI agent traces locally, providing analysis and actionable recommendations.
Ax is an open-source TypeScript library that implements DSPy-style typed signatures and agent frameworks for building reliable AI applications with minimal prompting. It supports multiple LLM providers and includes features like agents, flows, RAG, and self-improving pipelines.
fast-rlm enables recursive language models to access MCP servers via stdio or HTTP, allowing tool use and resource fetching, with results saved as Python variables in the REPL to save input tokens.
A tweet praising the combination of RLMs and GEPA, expressing anticipation for a follow-up.
New `fast-rlm` update introduces REPL Tool Calling, allowing agents to invoke Python functions via REPL with outputs stored in variables. Demo shows web search and Goodreads review integration.
The article explores reinforcement learning fine-tuning of small (4B) recursive language models (RLMs) to perform evidence selection from scientific documents, showing that RL-trained 4B models match Claude Sonnet 4.6 performance at a fraction of the size and cost.
This update to the RLM arXiv paper adds depth>1 experiments with recursive RLM calls, showing significant performance gains on OOLONG-Pairs and other benchmarks, along with new comparisons to OpenCode and Claude Code, additional training results on MRCRv2, and an expanded error analysis.
Isaac Flath predicts RLM will revive notebooks by enabling agents to drive REPLs with interleaved prose.
A social media post highlighting a writeup on applying RLM and DSPy to multi-modal data.
LongCoT introduces two new agent leaderboards (Restricted & Open Harness), with GPT 5.2 RLM topping the Open Harness at 25.12%.
A researcher comments on the simplicity and elegance of the RLM paper, comparing it to the influential ReAct paper and expressing appreciation for its straightforward approach to solving general problems.
A developer shares their experience with Recursive Language Models (RLMs), claiming they effectively handle extremely long contexts of tens of millions of tokens, a significant advance in context handling.