Tag
MaineCoon is a 22B-parameter real-time audio-visual autoregressive model for social world modeling that streams generation at up to 47.5 FPS on a single GPU; it introduces novel training techniques and an agentic inference framework.
StreamChar is a streaming framework for real-time audio-video generation of character animation. It pairs an LLM orchestrator with a joint audio-video DiT, using two-stage distillation and memory mechanisms to maintain long-horizon consistency and visual quality.