multi-policy

AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs

Hugging Face Daily Papers · 2026-05-15

AstraFlow is a dataflow-oriented RL system that enables efficient multi-policy collaborative training and elastic scaling for agentic LLMs, achieving a 2.7x training speedup over existing systems.
