Tag
Introduces LatentGym, a controllable testbed for studying cross-task experiential learning in LLM agents, enabling measurement of exploration vs. exploitation and revealing how frontier models fail to adapt across related tasks.