Tag
Krea 2 is a 12-billion parameter text-to-image diffusion model released open-weight on Hugging Face, with Raw (base) and Turbo (post-trained) checkpoints available.
A user asks for recommendations on affordable AI models for content writing, image generation, and vibe coding.
Boogu has released a series of open-source unified image generation and editing models, including Base, Turbo, and Edit variants.
A user presents a comparison of local text-to-image models across 192 prompts, evaluating text rendering, faces, anatomy, and spatial composition; the results and prompts are publicly available at imagebench.ai.
The author details the process of pretraining and post-training a 500M parameter language model and a 330M parameter image generator entirely from scratch.
Thumbmagic is an AI thumbnail generator trained on top-performing thumbnails.
A user shares a creative technique: using an embedded browser to get infinite-canvas image generation with Codex Image 2.
A user demonstrates giving a local LLM agent MCP tools for local image and video generation, enabling fully offline and free generation on demand.
A side-by-side comparison shows three years of AI progress: ModelScope on the left, Grok Imagine 1.5 on the right.
Midjourney, known for AI image generation, has developed a new technology described as the sequel to the MRI, which suggests a medical imaging application.
FreeStyle proposes a scalable dual-reference generation framework that mines community LoRAs to construct large-scale style-content triplets, uses disentanglement mechanisms to prevent content leakage, and introduces a comprehensive benchmark for evaluation.
This paper analyzes the variance of FID scores across different training and sampling seeds, revealing significant reproducibility issues in image generation evaluation. It proposes a new evaluation protocol with error bars and per-cell optimal guidance tuning.
A LoRA that adapts Ideogram 4 to generate high-quality images in as few as 2 steps without CFG, using a novel continuous turbo training method.
Comfy-Org has repackaged Boogu-Image model files for ComfyUI, including base, edit, and turbo variants with different quantization formats, plus a LoRA and text encoder.
Google released a new image generation model.
This ICML 2026 spotlight position paper identifies a failure mode in image-generation alignment where aesthetic preference optimization overrides explicit user intent, terming it 'reversed alignment' and testing on anti-aesthetic prompts.
An explanation of why diffusion models work well for images: low-frequency spectral components dominate, so denoising recovers coarse structure first and fine detail later, analogous to spectral autoregression.
Showcases an application, shared by Instagram user leonrdewa, that uses AI to transform World Cup videos into an ultra-cool anime style.
Boogu-Image-0.1 is an Apache-2.0 open-source unified image generation and editing model family, including variants for text-to-image, fast generation, editing, and Chinese-English text rendering, released as a research project on Hugging Face.
Mistral AI launches a free plan offering its SOTA large model, 500 memory entries, image generation, project grouping, and 40+ enterprise connectors. Sign-up requires only phone number verification.