This paper investigates speech-driven features for fine-grained discrimination among Chinese dialects. Its end-to-end model uses a CNN to combine MFCC-based features with word-level embeddings and outperforms text-driven methods.