AUTOMATIC1111/stable-diffusion-webui

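Summary

This open-source project provides a feature-rich web interface for Stable Diffusion, letting users generate, edit, and upscale images with a range of AI models and extensions. Built on Gradio, it supports txt2img, img2img, inpainting, and many community-driven tools for local AI image generation.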

Stable Diffusion Web UI

Cached: 2026/05/11 12:37

Source: https://github.com/AUTOMATIC1111/stable-diffusion-webui

Stable Diffusion web UI

A web interface for Stable Diffusion, implemented using the Gradio library.

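Features

Detailed feature showcase with images (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features):

Original txt2img and img2img modes
One-click install and run script (Python and Git must still be installed beforehand)
Outpainting
Inpainting
Color Sketch
Prompt Matrix
Stable Diffusion Upscale
Attention, specify the parts of the text that the model should pay more attention to
- a man in a ((tuxedo)) - will pay more attention to tuxedo
- a man in a (tuxedo:1.21) - alternative syntax
- select text and press Ctrl+Up or Ctrl+Down (Command+Up or Command+Down on MacOS) to automatically adjust the attention weight of the selected text (code contributed by an anonymous user)
Loopback, run img2img processing multiple times
X/Y/Z plot, a way to draw a three-dimensional comparison plot of images generated with different parameters
Textual Inversion
- have as many embeddings as you want, with any names you like
- use multiple embeddings with different numbers of vectors per token
- works with half-precision floating point numbers
- train embeddings on 8GB of VRAM (there are also reports of 6GB working)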
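Extras tab with:
- GFPGAN, a neural network that fixes faces
- CodeFormer, a face restoration tool that serves as an alternative to GFPGAN
- RealESRGAN, a neural network upscaler
- ESRGAN, a neural network upscaler with many third-party models
- SwinIR and Swin2SR (see here (https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/2092)), neural network upscalers
- LDSR, latent diffusion super resolution upscaling
Resizing and aspect ratio options
Sampling method selection
- adjust sampler eta values (noise multiplier)
- more advanced noise setting options
Interrupt processing at any time
Support for 4GB video cards (there are also reports of 2GB working)
Correct seeds for batches
Live prompt token length validation
Generation parameters
- the parameters used to generate an image are saved with that image
- in PNG chunks for PNG, in EXIF for JPEG
- an image can be dragged to the PNG info tab to restore its generation parameters and automatically copy them into the UI
- can be disabled in settings
- drag and drop an image or text parameters into the prompt box
Read Generation Parameters button, loads the parameters in the prompt box into the UI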
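Settings page
Running arbitrary Python code from the UI (must be launched with --allow-code to enable)
Mouseover hints for most UI elements
Default/min/max/step values for UI elements can be changed via a text config file
Tiling support, a checkbox to create images that can be tiled seamlessly like textures
Progress bar and live image generation preview
- can use a separate neural network to produce previews with almost no VRAM or compute cost
Negative prompt, an extra text field for listing what you do not want to see in the generated image
Styles, a way to save part of a prompt and easily apply it later via a dropdown
Variations, a way to generate images with the same composition but tiny differences
Seed resizing, a way to generate an image with the same composition at a slightly different resolution
CLIP interrogator, a button that tries to guess the prompt from an image
Prompt Editing, a way to change the prompt mid-generation, for example to start making a watermelon and switch to an anime girl midway
Batch Processing, process a group of files using img2img
Img2img Alternative, a reverse Euler method of cross-attention control
Highres Fix, a convenience option to produce high-resolution pictures in one click without the usual distortions
Reloading checkpoints on the fly
Checkpoint Merger, a tab that lets you merge up to 3 checkpoints into one
Custom scripts (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Custom-Scripts) with many extensions from the community
Composable-Diffusion (https://energy-based-model.github.io/Compositional-Visual-Generation-with-Composable-Diffusion-Models/), a way to use multiple prompts at once
- separate prompts using uppercase AND
- also supports weights for prompts: a cat :1.2 AND a dog AND a penguin :2.2
No token limit for prompts (the original Stable Diffusion allows only 75 tokens)
DeepDanbooru integration, creates Danbooru-style tags for anime prompts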
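xformers (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Xformers), a major speed increase for select cards (add --xformers to the command-line arguments)
via extension: History tab (https://github.com/yfszzx/stable-diffusion-webui-images-browser): view, manage, and delete images conveniently within the UI
Generate forever option
Training tab
- hypernetwork and embedding options
- image preprocessing: cropping, mirroring, autotagging using BLIP or deepdanbooru (for anime)
Clip skip
Hypernetworks
LoRAs (similar to Hypernetworks, but with better results)
A separate UI where you can choose, with preview, which embeddings, hypernetworks, or LoRAs to add to your prompt
A different VAE can be selected for loading from the settings screen
Estimated completion time in the progress bar
API
Support for the dedicated inpainting model by RunwayML (https://github.com/runwayml/stable-diffusion#inpainting-with-stable-diffusion)
via extension: Aesthetic Gradients (https://github.com/AUTOMATIC1111/stable-diffusion-webui-aesthetic-gradients), a way to generate images with a specific aesthetic by using CLIP image embeddings (an implementation of https://github.com/vicgalle/stable-diffusion-aesthetic-gradients)
Stable Diffusion 2.0 (https://github.com/Stability-AI/stablediffusion) support - see the wiki (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#stable-diffusion-20) for instructions
Alt-Diffusion (https://arxiv.org/abs/2211.06679) support - see the wiki (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#alt-diffusion) for instructions
All sensitive-word filtering has now been removed!
Load checkpoints in safetensors format
Eased resolution restriction: the generated image's width and height only need to be multiples of 8 rather than 64
Now with an open-source license!
Reorder elements in the UI from the settings screen
Segmind Stable Diffusion (https://huggingface.co/segmind/SSD-1B) support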
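
The attention and Composable-Diffusion entries above describe two small pieces of prompt syntax. The following is a minimal Python sketch of how they can be read, not the web UI's own parser. It assumes that each pair of parentheses multiplies attention by 1.1 (consistent with ((tuxedo)) and (tuxedo:1.21) being listed as equivalent) and that a sub-prompt without an explicit weight defaults to 1.0. The function names are hypothetical.

```python
import re

# Assumption: one pair of parentheses scales attention by 1.1,
# so ((tuxedo)) corresponds to 1.1 ** 2 = 1.21.
EMPHASIS_STEP = 1.1


def nested_weight(depth: int) -> float:
    """Attention weight implied by `depth` nested pairs of parentheses."""
    return round(EMPHASIS_STEP ** depth, 4)


def split_and_prompt(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt on uppercase AND and read an optional trailing ':weight'."""
    parts = []
    for chunk in re.split(r"\bAND\b", prompt):
        text = chunk.strip()
        match = re.search(r"\s*:\s*([0-9]*\.?[0-9]+)\s*$", text)
        if match:
            weight = float(match.group(1))
            text = text[:match.start()].strip()
        else:
            weight = 1.0  # assumed default when no weight is given
        parts.append((text, weight))
    return parts


print(nested_weight(2))
# prints 1.21
print(split_and_prompt("a cat :1.2 AND a dog AND a penguin :2.2"))
# prints [('a cat', 1.2), ('a dog', 1.0), ('a penguin', 2.2)]
```

The split is deliberately case-sensitive, since the feature list specifies uppercase AND as the separator.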

Installation and Running

Make sure the required dependencies (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Dependencies) are met and follow the instructions for your platform:

NVidia (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-NVidia-GPUs) (recommended)
AMD (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs) GPUs.
Intel CPUs, Intel GPUs (both integrated and discrete) (https://github.com/openvinotoolkit/stable-diffusion-webui/wiki/Installation-on-Intel-Silicon) (external wiki page)
Ascend NPUs (https://github.com/wangshuai09/stable-diffusion-webui/wiki/Install-and-run-on-Ascend-NPUs) (external wiki page)

Alternatively, use an online service (such as Google Colab):

List of Online Services (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Online-Services)

Installation on Windows 10/11 with NVidia GPUs using the release package

Download sd.webui.zip from v1.0.0-pre (https://github.com/AUTOMATIC1111/stable-diffusion-webui/releases/tag/v1.0.0-pre) and extract its contents.
Run update.bat.
Run run.bat.

For more details, see Install-and-Run-on-NVidia-GPUs (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-NVidia-GPUs)

Automatic Installation on Windows

Install Python 3.10.6 (https://www.python.org/downloads/release/python-3106/) (newer versions of Python do not support torch), checking "Add Python to PATH".
Install Git (https://git-scm.com/download/win).
Download the stable-diffusion-webui repository, for example by running git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git.
Run webui-user.bat from Windows Explorer as a normal, non-administrator user.
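
Both prerequisites named above (a compatible Python and Git on PATH) can be checked before the launcher is run. The following is a minimal sketch, not part of the project: the function name check_prerequisites and the SUPPORTED_MINORS set are hypothetical, and the set simply collects the versions this README mentions (3.10 here, and python3.10 or python3.11 in the Linux section).

```python
import shutil
import sys

# Assumption: only the minor versions this README names are treated as supported.
SUPPORTED_MINORS = {(3, 10), (3, 11)}


def check_prerequisites(version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means none were found."""
    major_minor = tuple(version or sys.version_info)[:2]
    problems = []
    if major_minor not in SUPPORTED_MINORS:
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} is not a version this README names"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("git") is None:
        problems.append("git was not found on PATH")
    return problems
```

Calling check_prerequisites() with no arguments inspects the running interpreter and the current PATH. The version and which parameters exist only so the check can be exercised without changing the environment.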

Automatic Installation on Linux

Install the dependencies:

# Debian-based:
sudo apt install wget git python3 python3-venv libgl1 libglib2.0-0
# Red Hat-based:
sudo dnf install wget git python3 gperftools-libs libglvnd-glx
# openSUSE-based:
sudo zypper install wget git python3 libtcmalloc4 libglvnd
# Arch-based:
sudo pacman -S wget git python3

If your system is very new, you may need to install python3.11 or python3.10:

# Ubuntu 24.04
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.11

# Manjaro/Arch
sudo pacman -S yay
yay -S python311 # do not confuse with the python3.11 package

# Only for 3.11
# Then set the environment variable in the launch script
export python_cmd="python3.11"
# or set it in webui-user.sh
python_cmd="python3.11"

Navigate to the directory where you would like the webui to be installed and execute the following command:

wget -q https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh

Or just clone the repository wherever you want:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

Run webui.sh.
Check webui-user.sh for launch options.
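
The feature list mentions an API, and launch options are where it would be switched on. The following is a minimal Python sketch of a text-to-image request. None of its specifics appear in this README, so treat them as assumptions to verify against the project wiki: that the server was started with the --api flag, that it listens on http://127.0.0.1:7860, that it exposes a /sdapi/v1/txt2img endpoint accepting the JSON fields shown, and that the response carries base64-encoded images under an "images" key. The function names are hypothetical.

```python
import base64
import json
import urllib.request

DEFAULT_URL = "http://127.0.0.1:7860"  # assumed local address of a running web UI


def build_txt2img_request(prompt, negative_prompt="", steps=20,
                          width=512, height=512, base_url=DEFAULT_URL):
    """Build (but do not send) a POST request for the assumed txt2img endpoint."""
    # The feature list states that image dimensions must be multiples of 8.
    if width % 8 or height % 8:
        raise ValueError("width and height must be multiples of 8")
    payload = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "width": width,
        "height": height,
    }
    return urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_body: bytes) -> list[bytes]:
    """Decode the base64 strings assumed to be listed under the 'images' key."""
    return [base64.b64decode(item) for item in json.loads(response_body)["images"]]


def generate(prompt, **options):
    """Send the request to a running server and return the raw image bytes."""
    request = build_txt2img_request(prompt, **options)
    with urllib.request.urlopen(request) as response:
        return decode_images(response.read())
```

With a server running, generate("a man in a (tuxedo:1.21)", steps=25) would return a list of image byte strings that can be written to .png files. Building the request separately from sending it keeps the payload inspectable without a live server.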

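Installation on Apple Silicon

Find the instructions here (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Installation-on-Apple-Silicon).

Contributing

Here is how to add code to this repository: Contributing (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Contributing)

Documentation

The documentation has been moved from this README to the project's wiki (https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki).

To help Google and other search engines crawl the wiki, here is a link to the crawlable wiki (not meant for human readers) (https://github-wiki-see.page/m/AUTOMATIC1111/stable-diffusion-webui/wiki).

Credits

Licenses for borrowed code can be found on the Settings -> Licenses screen, and also in the html/licenses.html file.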

Stable Diffusion - https://github.com/Stability-AI/stablediffusion, https://github.com/CompVis/taming-transformers, https://github.com/mcmonkey4eva/sd3-ref
k-diffusion - https://github.com/crowsonkb/k-diffusion.git
Spandrel - https://github.com/chaiNNer-org/spandrel, which implements:
- GFPGAN - https://github.com/TencentARC/GFPGAN.git
- CodeFormer - https://github.com/sczhou/CodeFormer
- ESRGAN - https://github.com/xinntao/ESRGAN
- SwinIR - https://github.com/JingyunLiang/SwinIR
- Swin2SR - https://github.com/mv-lab/swin2sr
LDSR - https://github.com/Hafiidz/latent-diffusion
MiDaS - https://github.com/isl-org/MiDaS
Ideas for optimizations - https://github.com/basujindal/stable-diffusion
Cross Attention layer optimization - Doggettx - https://github.com/Doggettx/stable-diffusion, original idea for prompt editing.
Cross Attention layer optimization - InvokeAI, lstein - https://github.com/invoke-ai/InvokeAI (originally http://github.com/lstein/stable-diffusion)
Sub-quadratic Cross Attention layer optimization - Alex Birch (https://github.com/Birch-san/diffusers/pull/1), Amin Rezaei (https://github.com/AminRezaei0x443/memory-efficient-attention)
Textual Inversion - Rinon Gal - https://github.com/rinongal/textual_inversion (his code is not used directly, but his ideas are).
Idea for SD upscale - https://github.com/jquesnelle/txt2imghd
Noise generation for outpainting mk2 - https://github.com/parlance-zz/g-diffuser-bot
CLIP interrogator idea and some borrowed code - https://github.com/pharmapsychotic/clip-interrogator
Idea for Composable Diffusion - https://github.com/energy-based-model/Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch
xformers - https://github.com/facebookresearch/xformers
DeepDanbooru - interrogator for anime diffusers https://github.com/KichangKim/DeepDanbooru
Sampling in float32 precision from a float16 UNet - marunine for the idea, Birch-san for the example Diffusers implementation (https://github.com/Birch-san/diffusers-play/tree/92feee6)
Instruct pix2pix - Tim Brooks (star), Aleksander Holynski (star), Alexei A. Efros (no star) - https://github.com/timothybrooks/instruct-pix2pix
Security advice - RyotaK
UniPC sampler - Wenliang Zhao - https://github.com/wl-zhao/UniPC
TAESD - Ollin Boer Bohan - https://github.com/madebyollin/taesd
LyCORIS - KohakuBlueleaf
Restart sampling - lambertae - https://github.com/Newbeeer/diffusion_restart_sampling
Hypertile - tfernd - https://github.com/tfernd/HyperTile
Initial Gradio script - posted on 4chan by an anonymous user. Thank you, anonymous user.
(You)
