Tag
A new method called ECHO bridges RL and pre-training by using next token prediction on tool call outputs to learn from the environment beyond reward signals, combining world modeling and agentic actions.