text-encoder

#text-encoder

UR-BERT: Scaling Text Encoders for Massively Multilingual TTS Through Universal Romanization and Speech Token Prediction

arXiv cs.CL · 2026-06-11

UR-BERT proposes a Romanized transcription-based text encoder for massively multilingual TTS, scaling to 495 languages by using universal Romanization and a speech token prediction objective to enhance phonetic alignment and generalization to unseen languages.



Comfy-Org/Ideogram-4

Hugging Face Models Trending · 2026-06-03

Ideogram-4 model repackaged for ComfyUI, including fp8 scaled diffusion models, Qwen3VL text encoder, and FLUX VAE.



Text-to-Image Models Need Less from Text Encoders Than You Think

Hugging Face Daily Papers · 2026-06-02

This paper demonstrates that text-to-image diffusion transformer models primarily rely on token merging and word order from text encoders rather than full contextual embeddings, suggesting that the image model itself decodes complex linguistic structures.



Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models

arXiv cs.CL · 2026-04-20

This paper investigates how semantic information is distributed across textual tokens in text-to-image models, finding that information concentration and cross-item interactions significantly affect image generation alignment. The authors use patching techniques to demonstrate that simple encoding-stage interventions can improve alignment quality.

