Tag
The article explores reinforcement learning fine-tuning of small (4B) recursive language models (RLMs) to perform evidence selection from scientific documents, showing that RL-trained 4B models match Claude Sonnet 4.6 performance at a fraction of the size and cost.
The article discusses anti-AI propaganda efforts and highlights recent AI industry news, including Nvidia's open-source move, OpenAI's deal with Amazon, Grok's video price cuts, and recursive language models.