Self-supervised speech models lack tonal context compensation
The wav2vec 2.0 model shows no evidence of perceptual compensation for Mandarin tones in its embedding similarities. Probing classifiers trained on those embeddings reveal only limited compensation and do not match human performance on isolated syllables, which suggests that supervised training is needed to abstract phonological regularities.
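
To make the embedding-similarity analysis concrete, here is a minimal sketch of one way such a comparison could be set up. It is an illustration under stated assumptions, not the procedure used in the study: the pooling choice (mean over frames), the helper names (`mean_pool`, `cosine_similarity`, `nearest_tone`), the labels `tone_A` and `tone_B`, and the two-dimensional toy vectors are all hypothetical stand-ins for real wav2vec 2.0 frame embeddings.

```python
import numpy as np


def mean_pool(frames: np.ndarray) -> np.ndarray:
    """Average frame-level embeddings of shape (T, D) into one vector (D,)."""
    return frames.mean(axis=0)


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two non-zero embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest_tone(syllable_vec: np.ndarray, prototypes: dict) -> str:
    """Return the label of the tone prototype most similar to the syllable."""
    return max(
        prototypes,
        key=lambda tone: cosine_similarity(syllable_vec, prototypes[tone]),
    )


# Hypothetical toy data: two tone prototypes and one syllable of two frames.
# Real embeddings would come from a speech model and have far more dimensions.
prototypes = {
    "tone_A": np.array([1.0, 0.0]),
    "tone_B": np.array([0.0, 1.0]),
}
frames = np.array([[0.9, 0.1], [0.7, 0.3]])

syllable_vec = mean_pool(frames)
print(nearest_tone(syllable_vec, prototypes))
```

Under this setup, compensation would appear as a shift in the nearest prototype (or in the similarity scores) when the same syllable is embedded with versus without its tonal context. A probing classifier replaces the fixed cosine comparison with a trained model on the same vectors.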