ByteDance open-sources Bernini-R, a video diffusion renderer that pairs an MLLM-based semantic planner with a DiT-based renderer for unified video generation and editing, with reported top-tier performance on video editing.
Baidu releases ERNIE-Image-Turbo, a distilled text-to-image model that generates images in 8 inference steps while maintaining strong text rendering, instruction following, and structured image generation capabilities.