Speculative news about possible new ERNIE models from Baidu, hinted at via tweets and an upcoming Baidu Create 2026 event video.
Baidu open-sourced ERNIE-Image, an 8B-parameter text-to-image model whose weights are licensed for commercial use, making it one of the few fully open, fine-tunable alternatives to closed models like Midjourney.
Baidu releases ERNIE-Image, an open-weight text-to-image generation model with 8B parameters, built on a Diffusion Transformer architecture. It achieves state-of-the-art performance among open-weight models, with strong text rendering, instruction following, and structured image generation.
PaddleOCR-VL is a compact 0.9B-parameter vision-language model that achieves state-of-the-art performance in multilingual document parsing and element recognition by integrating NaViT-style dynamic resolution with the ERNIE language model.