TenStrip/LTX2.3-10Eros

Hugging Face Models Trending 04/29/26, 05:08 PM Models

ai-video image-to-video ltx-model fine-tuning comfyui hugging-face

Summary

This article introduces TenStrip/LTX2.3-10Eros, a fine-tuned AI video model on Hugging Face designed for improved image-to-video generation and prompt adherence. It provides technical details on file formats, compatibility with ComfyUI nodes, and specific prompting strategies for optimal results.

Task: image-to-video

Tags: diffusers, image-to-video, region:us

Original Article

Cached at: 05/08/26, 08:54 AM

TenStrip/LTX2.3-10Eros · Hugging Face

Source: https://huggingface.co/TenStrip/LTX2.3-10Eros

10 Eros

Workflows: https://huggingface.co/TenStrip/LTX2.3-10Eros_Workflows

Nodes: https://github.com/TenStrip/10S-Comfy-nodes

Reliant on https://huggingface.co/SulphurAI/Sulphur-2-base

This is a different merge attempt, aimed at ideal I2V use. It uses layer-scaled merges of different steps; it is not a straight weight merge. It behaves much better than loading a LoRA and respects the prompt.

The prompt should be enhanced. LTX does very little reasoning of its own and adds little input once it is conditioned: the first frame, all following motions and evolutions, and the audio must be explicitly commanded. You will get nothing you don’t ask for.
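The difference between a straight weight merge and a layer-scaled merge can be illustrated with a small sketch. This is not the author's merge script, which the model card does not publish; the tensor names, the prefix-matching rule, and the scale values are all hypothetical, and plain Python lists stand in for real tensors.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def straight_merge(base: Weights, tuned: Weights, alpha: float) -> Weights:
    """Blend every tensor with the same ratio alpha (a plain weight merge)."""
    return {
        name: [b + alpha * (t - b) for b, t in zip(base[name], tuned[name])]
        for name in base
    }


def layer_scaled_merge(
    base: Weights,
    tuned: Weights,
    layer_scales: Dict[str, float],
    default: float = 0.0,
) -> Weights:
    """Blend each tensor with a ratio chosen by its name prefix.

    layer_scales maps a (hypothetical) name prefix to a blend ratio.
    The longest matching prefix wins; unmatched tensors use `default`.
    """
    merged: Weights = {}
    for name, base_vals in base.items():
        scale, best_len = default, -1
        for prefix, value in layer_scales.items():
            if name.startswith(prefix) and len(prefix) > best_len:
                scale, best_len = value, len(prefix)
        merged[name] = [
            b + scale * (t - b) for b, t in zip(base_vals, tuned[name])
        ]
    return merged


# Toy weights; real checkpoints hold tensors, not two-element lists.
base = {"blocks.0.attn": [1.0, 2.0], "blocks.1.attn": [0.0, 4.0]}
tuned = {"blocks.0.attn": [3.0, 2.0], "blocks.1.attn": [2.0, 0.0]}

uniform = straight_merge(base, tuned, alpha=0.5)
scaled = layer_scaled_merge(base, tuned, {"blocks.0.": 0.5, "blocks.1.": 1.0})
```

In the toy example the first block moves halfway toward the tuned weights while the second block takes them fully, which a single global ratio cannot express.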

BF16 loads as a checkpoint with CLIP and VAEs.

Fp8_mixed_learned is the better FP8 version and is also a full checkpoint; it was quantized by S1LV3RC01N.

The Kijai split files are for the 10Eros FP8 Transformer version, which has a different structure and variance. That one goes inside diffusion_models: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main
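The placement notes above can be summarized as a small lookup. This is a minimal sketch, assuming the usual ComfyUI layout of `<root>/models/checkpoints` and `<root>/models/diffusion_models`; the variant keys are illustrative labels rather than the repository's real filenames, and `target_dir` is a hypothetical helper.

```python
from pathlib import Path

# Illustrative labels (not actual filenames) mapped to the ComfyUI
# model subfolder each variant is assumed to belong in.
VARIANT_SUBFOLDER = {
    "bf16": "checkpoints",               # full checkpoint
    "fp8_mixed_learned": "checkpoints",  # full checkpoint
    "fp8_transformer": "diffusion_models",  # transformer-only file
}


def target_dir(comfy_root: str, variant: str) -> Path:
    """Return the folder where a given variant should be placed."""
    try:
        subfolder = VARIANT_SUBFOLDER[variant]
    except KeyError:
        known = ", ".join(sorted(VARIANT_SUBFOLDER))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}") from None
    return Path(comfy_root) / "models" / subfolder


bf16_dir = target_dir("ComfyUI", "bf16")
transformer_dir = target_dir("ComfyUI", "fp8_transformer")
```

The two full checkpoints share a folder because they load through a checkpoint loader, while the transformer-only file sits with other standalone diffusion models.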

!!! Larger distilled LoRAs will harm the model’s fine-tune; try the cond_safe ones: https://huggingface.co/TenStrip/LTX2.3_Distilled_Lora_1.1_Experiments/tree/main

For prompt enhancement, try this foreword in Grok or an uncensored LLM:

Generate a video scene script with a description based on the attached image for an LLM that has a tokenizer that uses interleaved attention to support long-context understanding that is fed into a multimodal video model. Strict specification, follow it to the word: No timestamps. No unnecessary embellishment. Output only plain English text and put it in a copy box.

First, describe the initial scene of the image in concise natural language: subject(s), subject(s) appearance, subject(s) composition and pose, background, and context.

Next, formulate a naturally evolving scenario that would take place, describing every moving body part, composition change, and manipulation from the uploaded initial frame that would be reflected in the video model’s post-latent evolution output. If the image is explicit or sexual in nature, use full anatomical terminology and spice it up slightly with visually representable erotic themes.

Center the prompt around this basic idea: [ concept ]

Interweave this dialogue or sound concept into the scene with descriptions of voice tone followed by the lines delivered in quotations, in a temporal sequence between or during motions. Dialogue should be concise, since rambling will take away from video quality: [ dialogue ]

Inside that prompt, describe only notable audio and audio cues, both normal and explicit: background noise as well as foley and natural sounds, in a temporal sequence paired with coinciding motions. In the case of absent dialogue or soundscapes, and only if background music is fitting, describe a fitting genre and melodic tone with a matching mood.

Output only text following the above instructions. Follow-up suggestions should be on the topic of expanding or changing motion or dialogue from the output text.
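The `[ concept ]` and `[ dialogue ]` slots in the foreword above can be filled programmatically before the text is sent to an LLM. This is a minimal sketch: `fill_foreword` is a hypothetical helper, and the two-line template and sample values below are shortened stand-ins for the full foreword.

```python
CONCEPT_SLOT = "[ concept ]"
DIALOGUE_SLOT = "[ dialogue ]"


def fill_foreword(template: str, concept: str, dialogue: str) -> str:
    """Substitute the concept and dialogue slots in a foreword template.

    Raises ValueError if either slot is missing, so a mangled template
    is caught before it reaches the LLM.
    """
    for slot in (CONCEPT_SLOT, DIALOGUE_SLOT):
        if slot not in template:
            raise ValueError(f"template is missing the {slot!r} slot")
    filled = template.replace(CONCEPT_SLOT, concept.strip())
    return filled.replace(DIALOGUE_SLOT, dialogue.strip())


# Shortened stand-in for the full foreword text.
template = (
    "Center the prompt around this basic idea: [ concept ]\n\n"
    "Interweave this dialogue or sound concept into the scene: [ dialogue ]"
)

foreword = fill_foreword(
    template,
    concept="a paper boat drifting down a rain gutter",
    dialogue="soft rain, no speech",
)
```

Keeping the foreword as a single template string makes it easy to reuse the same instructions across many input images while changing only the two slots.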

Similar Articles

RuneXX/LTX-2.3-Workflows

Lightricks/LTX-2.3-22b-IC-LoRA-LipDub

Lightricks/LTX-2

LTX-2: Efficient Joint Audio-Visual Foundation Model

nvidia/Cosmos3-Super-Image2Video
