video-backbone

Light-WAM: Efficient World Action Models with State-Fusion Action Decoding

Hugging Face Daily Papers · 2026-06-06

Light-WAM is a lightweight world action model for efficient robot manipulation. It uses a compact video backbone and a downsampled latent space for future-video supervision, achieving high performance with low inference latency.
