Tag
HYDRA-X presents a unified multimodal model that integrates image and video tokenization within a single Vision Transformer, achieving strong performance across understanding and generation tasks.
This paper proposes a history-bootstrapped autoregressive flow matching method for reconstructing full spatiotemporal fields (velocity and temperature) from partial observations of boiling dynamics, addressing an ill-posed inverse problem with a non-Markovian posterior.