Tag
A new 30B model matches systems 20-30x its size on popular benchmarks while using up to 95% fewer reasoning tokens than comparable agentic LLMs. The savings come from a learned configurator that decides when and how to reason. The model and code are openly available.
AstraFlow is a dataflow-oriented RL system that enables efficient multi-policy collaborative training and elastic scaling for agentic LLMs, achieving a 2.7x training speedup over existing systems.
HAGE introduces a weighted multi-relational memory framework that enables query-conditioned traversal over unified relational memory graphs, improving long-horizon reasoning accuracy through adaptive memory retrieval and reinforcement learning-based optimization.
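To make "query-conditioned traversal" over a weighted multi-relational graph more concrete, here is a minimal sketch. It is not HAGE's implementation. The `MemoryGraph` class, the `retrieve` function, the `relation_relevance` dictionary, and the multiplicative path score are all assumptions made for illustration, and the reinforcement-learning optimization mentioned in the summary is not modeled. The sketch assumes edge weights and relevance values lie in [0, 1], so path scores never increase along a path and a best-first search finds the highest-scoring path to each node.

```python
import heapq
from collections import defaultdict


class MemoryGraph:
    """Hypothetical memory graph: node -> list of (relation, neighbor, weight)."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add_edge(self, src, relation, dst, weight):
        self.edges[src].append((relation, dst, weight))


def retrieve(graph, start, relation_relevance, min_score=0.05, top_k=3):
    """Best-first traversal from `start`.

    A path's score is the product of (edge weight * relevance of the edge's
    relation to the current query). `relation_relevance` stands in for
    whatever query-conditioned weighting a real system would compute.
    """
    best = {start: 1.0}
    heap = [(-1.0, start)]  # max-heap via negated scores
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(node, 0.0):
            continue  # stale entry: a better path to this node was found
        for relation, nxt, weight in graph.edges[node]:
            new_score = score * weight * relation_relevance.get(relation, 0.0)
            if new_score >= min_score and new_score > best.get(nxt, 0.0):
                best[nxt] = new_score
                heapq.heappush(heap, (-new_score, nxt))
    ranked = sorted(
        ((n, s) for n, s in best.items() if n != start),
        key=lambda item: -item[1],
    )
    return ranked[:top_k]


def build_example():
    """Toy graph and a relevance table for a query about Alice's workplace."""
    g = MemoryGraph()
    g.add_edge("alice", "works_at", "acme", 0.9)
    g.add_edge("alice", "friend_of", "bob", 0.8)
    g.add_edge("acme", "located_in", "berlin", 0.7)
    g.add_edge("bob", "lives_in", "paris", 0.9)
    relevance = {"works_at": 1.0, "located_in": 0.8,
                 "friend_of": 0.1, "lives_in": 0.5}
    return g, relevance


if __name__ == "__main__":
    graph, relevance = build_example()
    for node, score in retrieve(graph, "alice", relevance):
        print(f"{node}: {score:.3f}")
```

Changing only the relevance table (for example, raising `friend_of` and `lives_in` for a query about Alice's friends) reorders the retrieved nodes without touching the stored graph, which is the general idea behind conditioning traversal on the query.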