Tag
Introduces Future-L1, an interleaved latent visual reasoning framework that improves video event prediction by maintaining visual semantics in latent space. Achieves state-of-the-art results on FutureBench and TwiFF-Bench benchmarks.