Tag
UR-BERT proposes a Romanized transcription-based text encoder for massively multilingual TTS, scaling to 495 languages by using universal Romanization and a speech token prediction objective to enhance phonetic alignment and generalization to unseen languages.