A novel framework named CDDTLDA combines transfer learning and data augmentation to address Chinese dialect discrimination when annotated data is limited. It first trains a source ASR model on a large dialect corpus, then applies speed, pitch, and noise augmentation to the low-resource target dialects, and finally fine-tunes a target ASR model that uses self-attention to capture shared semantic features. Experimental results on two benchmark Chinese dialect corpora show that CDDTLDA outperforms state-of-the-art methods.
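
To make the augmentation step concrete, the following is a minimal sketch of speed and noise augmentation on a raw waveform, using only the Python standard library. It is not the authors' implementation: the function names (`speed_perturb`, `add_noise`, `augment`), the speed factors 0.9/1.0/1.1, and the 20 dB signal-to-noise ratio are all illustrative assumptions. Pitch augmentation is left out because shifting pitch without changing duration needs an extra time-stretching step that is beyond this sketch.

```python
import math
import random


def speed_perturb(samples, factor):
    """Resample by linear interpolation; factor > 1 speeds up (shortens) the signal."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) / factor)))
    out = []
    for i in range(n_out):
        pos = i * factor          # fractional read position in the input
        lo = int(pos)
        if lo >= len(samples) - 1:
            out.append(samples[-1])
        else:
            frac = pos - lo
            out.append((1.0 - frac) * samples[lo] + frac * samples[lo + 1])
    return out


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise scaled to roughly the requested SNR (in dB)."""
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def augment(samples, speed_factors=(0.9, 1.0, 1.1), snr_db=20.0, seed=0):
    """Return a clean and a noisy variant for each speed factor (assumed values)."""
    rng = random.Random(seed)
    variants = []
    for factor in speed_factors:
        perturbed = speed_perturb(samples, factor)
        variants.append(perturbed)
        variants.append(add_noise(perturbed, snr_db, rng))
    return variants


# Toy input: one second of a 440 Hz tone at 16 kHz stands in for an utterance.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
augmented = augment(tone)
```

Each target-dialect utterance yields six variants here, which would then be added to the fine-tuning set. Note that this simple resampling changes pitch and tempo together, so it only approximates separate speed and pitch augmentation.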