Lightricks/LTX-2.3

Hugging Face Models Trending 2026/03/04 22:28 Model

open-source video-generation audio-generation foundation-model diffusion-model lightricks

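Summary

Lightricks has released LTX-2.3, an open-weights, diffusion-based audio-video foundation model with improved quality and prompt adherence. It ships as multiple checkpoints, including distilled and LoRA variants, and can run locally.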

Task: image-to-video Tags: diffusers, image-to-video, text-to-video, video-to-video, image-text-to-video, audio-to-video, text-to-audio, video-to-audio, audio-to-audio, text-to-audio-video, image-to-audio-video, image-text-to-audio-video, ltx-2, ltx-2-3, ltx-video, ltxv, lightricks, en, de, es, fr, ja, ko, zh, it, pt, arxiv:2601.03233, license:other, region:us

Cached at: 2026/05/08 18:28

Lightricks/LTX-2.3 · Hugging Face

Source: https://huggingface.co/Lightricks/LTX-2.3

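LTX-2.3 Model Card

This model card focuses on the LTX-2.3 model, a significant update to the LTX-2 model (https://huggingface.co/Lightricks/LTX-2) with improved audio and visual quality and stronger prompt adherence. LTX-2 was presented in the paper LTX-2: Efficient Joint Audio-Visual Foundation Model (https://huggingface.co/papers/2601.03233).

💻💻 Want to dive straight into the code? It is available here (https://github.com/Lightricks/LTX-2). 💾💾

LTX-2.3 is a DiT-based audio-video foundation model that generates synchronized video and audio within a single model. It brings together the core building blocks of modern video generation, ships with open weights, and focuses on practical local execution.

LTX-2 Open Source (https://youtu.be/o-7us-BR_gQ)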

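Model Checkpoints

| Name | Notes |
|---|---|
| ltx-2.3-22b-dev | The full model, flexible and trainable in bf16 |
| ltx-2.3-22b-distilled | The distilled version of the full model, 8 steps, CFG=1 |
| ltx-2.3-22b-distilled-1.1 | The v1.1 distilled version of the full model, 8 steps, CFG=1; different aesthetics and improved audio compared with v1.0 |
| ltx-2.3-22b-distilled-lora-384 | A LoRA version of the distilled model, applicable to the full model |
| ltx-2.3-22b-distilled-lora-384-1.1 | A LoRA version of the v1.1 distilled model, applicable to the full model |
| ltx-2.3-spatial-upscaler-x2-1.1 | An x2 spatial upscaler for ltx-2.3 latents, used in multi-stage (multiscale) pipelines for higher resolution |
| ltx-2.3-spatial-upscaler-x1.5-1.0 | An x1.5 spatial upscaler for ltx-2.3 latents, used in multi-stage (multiscale) pipelines for higher resolution |
| ltx-2.3-temporal-upscaler-x2-1.0 | An x2 temporal upscaler for ltx-2.3 latents, used in multi-stage (multiscale) pipelines for higher FPS |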
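Because several checkpoint names share a prefix (for example `ltx-2.3-22b-distilled` and `ltx-2.3-22b-distilled-1.1`), it can help to match names exactly when picking files from the repository. The following is a minimal sketch, not an official tool: the helper names `files_for_checkpoint` and `list_checkpoint_files` are hypothetical, and the sketch assumes each checkpoint is stored as a file whose name, minus its final extension, equals the checkpoint name in the table. It relies on `list_repo_files` from the `huggingface_hub` package to enumerate the repository.

```python
from pathlib import PurePosixPath
from typing import Iterable


def files_for_checkpoint(repo_files: Iterable[str], checkpoint_name: str) -> list[str]:
    """Keep files whose name, minus the final extension, equals checkpoint_name.

    Exact matching avoids confusing 'ltx-2.3-22b-distilled' with
    'ltx-2.3-22b-distilled-1.1' or the '-lora-384' variants.
    """
    return [f for f in repo_files if PurePosixPath(f).stem == checkpoint_name]


def list_checkpoint_files(checkpoint_name: str, repo_id: str = "Lightricks/LTX-2.3") -> list[str]:
    """Query the Hub for files that belong to one checkpoint (needs network access)."""
    # Imported lazily so the pure helper above works without huggingface_hub installed.
    from huggingface_hub import list_repo_files

    return files_for_checkpoint(list_repo_files(repo_id), checkpoint_name)


if __name__ == "__main__":
    # Prints whatever the repository currently contains for this name;
    # the list is empty if the file-naming assumption does not hold.
    print(list_checkpoint_files("ltx-2.3-22b-distilled"))
```

If the repository lays files out differently, inspecting the full output of `list_repo_files` is the safer starting point.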

Model Details

**Developed by:** Lightricks
**Model type:** Diffusion-based audio-video foundation model
**Language(s):** English

Online Demo

LTX-2.3 is available right away through the API Playground (https://console.ltx.video/playground/).

Run Locally

Direct Use License

You can use the models (full, distilled, upscalers, and any derivatives of them) for the purposes permitted under the license (https://huggingface.co/Lightricks/LTX-2.3/blob/main/LICENSE).

ComfyUI

We recommend using the built-in LTXVideo nodes available in ComfyUI Manager. For manual installation details, see our documentation site (https://docs.ltx.video/open-source-model/integration-tools/comfy-ui).

PyTorch Codebase

The LTX-2 codebase (https://github.com/Lightricks/LTX-2) is a monorepo containing several packages: the model definition in 'ltx-core', the pipelines in 'ltx-pipelines', and the training capabilities in 'ltx-trainer'. The codebase was tested with Python >=3.12 and CUDA version >12.7, and supports PyTorch ~= 2.7.
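The version constraints above can be checked before installing. The sketch below is illustrative only and is not part of the LTX-2 repository: `parse_version` and `check_environment` are hypothetical helpers, and `~= 2.7` is read in the usual Python packaging sense (at least 2.7, and still within the 2.x series).

```python
import sys
from typing import Optional


def parse_version(text: str) -> tuple[int, ...]:
    """Parse leading numeric components, e.g. '2.7.1+cu128' -> (2, 7, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break  # stop at pre-release or other non-numeric components
        parts.append(int(piece))
    return tuple(parts)


def check_environment(
    python_version: tuple[int, ...],
    torch_version: tuple[int, ...],
    cuda_version: Optional[tuple[int, ...]],
) -> list[str]:
    """Return human-readable problems; an empty list means the constraints are met."""
    problems = []
    if python_version < (3, 12):
        problems.append("Python >=3.12 is required")
    # '~= 2.7' is a compatible-release constraint: >= 2.7 and == 2.*
    if not (torch_version >= (2, 7) and torch_version[:1] == (2,)):
        problems.append("PyTorch ~= 2.7 is required")
    if cuda_version is None or not cuda_version > (12, 7):
        problems.append("CUDA >12.7 is required")
    return problems


if __name__ == "__main__":
    import torch  # only needed when checking the live environment

    cuda = parse_version(torch.version.cuda) if torch.version.cuda else None
    issues = check_environment(sys.version_info[:3], parse_version(torch.__version__), cuda)
    print("\n".join(issues) if issues else "Environment matches the tested versions")
```

The model card lists these as the tested versions, so a reported problem is a warning rather than proof that the code will fail.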

Installation

git clone https://github.com/Lightricks/LTX-2.git
cd LTX-2

# From the repository root
uv sync
source .venv/bin/activate

Inference

To use our model, follow the instructions in the ltx-pipelines package (https://github.com/Lightricks/LTX-2/blob/main/packages/ltx-pipelines/README.md).

Diffusers 🧨

Support for LTX-2.3 in the Diffusers Python library (https://huggingface.co/docs/diffusers/main/en/index) is coming soon!

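General Tips:

Width and height must be divisible by 32. The frame count must be a multiple of 8 plus 1 (that is, of the form 8k + 1, such as 121).
If the resolution or frame count does not meet these constraints, pad the input with -1 and then crop the result to the desired resolution and number of frames.
For tips on writing effective prompts, see our prompting guide (https://ltx.video/blog/how-to-prompt-for-ltx-2).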
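The size constraints in these tips reduce to a little arithmetic. The sketch below reads "divisible by 8 + 1" as "a multiple of 8, plus 1", which is an interpretation rather than something the model card spells out. The helper names are hypothetical and not part of the LTX-2 codebase. The function computes only the padded target shape; the padding with -1 and the final crop are left to the pipeline.

```python
def round_up_to_multiple(value: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is >= value (ceiling division)."""
    return -(-value // multiple) * multiple


def is_valid_shape(width: int, height: int, num_frames: int) -> bool:
    """True if width/height are multiples of 32 and num_frames has the form 8k + 1."""
    return width % 32 == 0 and height % 32 == 0 and (num_frames - 1) % 8 == 0


def padded_shape(width: int, height: int, num_frames: int) -> tuple[int, int, int]:
    """Smallest valid (width, height, num_frames) that contains the requested shape."""
    return (
        round_up_to_multiple(width, 32),
        round_up_to_multiple(height, 32),
        round_up_to_multiple(num_frames - 1, 8) + 1,
    )


if __name__ == "__main__":
    requested = (1280, 720, 120)
    print(is_valid_shape(*requested))  # False: 720 is not a multiple of 32
    print(padded_shape(*requested))    # (1280, 736, 121)
```

Generating at the padded shape and cropping back to the requested one keeps the output size exactly as asked while satisfying the model's constraints.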

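Limitations

This model is not intended to provide factual information, nor is it able to.
As a statistical model, this checkpoint might amplify existing societal biases.
The model may fail to generate videos that match the prompt exactly.
Prompt adherence is heavily influenced by prompting style.
The model may generate content that is inappropriate or offensive.
When generating audio without speech, the audio may be of lower quality.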

Train the Model

The base (dev) model is fully trainable.

It is easy to reproduce the LoRAs and IC-LoRAs we publish with the model by following the instructions in the LTX-2 Trainer README (https://github.com/Lightricks/LTX-2/blob/main/packages/ltx-trainer/README.md).

In many settings, training for motion, style, or likeness (sound + appearance) can take less than an hour.

Citation

@article{hacohen2025ltx2,
  title={LTX-2: Efficient Joint Audio-Visual Foundation Model},
  author={HaCohen, Yoav and Brazowski, Benny and Chiprut, Nisan and Bitterman, Yaki and Kvochko, Andrew and Berkowitz, Avishai and Shalem, Daniel and Lifschitz, Daphna and Moshe, Dudu and Porat, Eitan and Richardson, Eitan and Guy Shiran and Itay Chachy and Jonathan Chetboun and Michael Finkelson and Michael Kupchick and Nir Zabari and Nitzan Guetta and Noa Kotler and Ofir Bibi and Ori Gordon and Poriya Panet and Roi Benita and Shahar Armon and Victor Kulikov and Yaron Inger and Yonatan Shiftan and Zeev Melumian and Zeev Farbman},
  journal={arXiv preprint arXiv:2601.03233},
  year={2025}
}