Tag
VITA-QinYu is an expressive end-to-end spoken language model capable of role-playing and singing, trained on 15.8K hours of data to outperform peers in expressiveness and conversational accuracy.