vlm-agents

GROW: Aligning GRPO with State-Action Modeling for Open-World VLM Agents

arXiv cs.LG · 2026-05-21

GROW is a reinforcement learning framework that adapts GRPO to multi-turn VLM agent tasks. It decomposes trajectories into state-action pairs and computes advantages between those pairs, achieving state-of-the-art performance on more than 800 Minecraft tasks.

AtlasVA: Self-Evolving Visual Skill Memory for Teacher-Free VLM Agents

Hugging Face Daily Papers · 2026-05-18

AtlasVA is a teacher-free visual skill memory framework for vision-language model (VLM) agents. It uses spatial heatmaps, visual exemplars, and symbolic text skills to improve spatial decision-making in long-horizon tasks, and it outperforms baselines on several benchmarks.
