panoramic-video


Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

Hugging Face Daily Papers · 2026-05-29

SwanSphere is a unified streaming framework that generates high-fidelity spatial audio from panoramic videos and text prompts. It combines causal autoregressive diffusion transformers with multimodal learning strategies, and the authors report superior performance on both video-to-spatial and text-to-spatial audio tasks.
