A computational analysis of the Complete Tang Poems shows that poets' geographic origins leave detectable linguistic traces. Models that combine character n-gram TF-IDF with domain features reach 0.69 accuracy when predicting broad regional origin (South vs. North), which is above chance, and can also classify finer circuit-level origins. The study finds that linguistic distance between circuits correlates with geographic distance and that regional divergence increases in the Late Tang, and it highlights historical biases in early Tang poetic style.
Linguistic Fingerprints Reveal Tang Poets' Regional Origins
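
To make the character n-gram TF-IDF idea from the summary concrete, here is a minimal sketch in plain Python. It is not the study's pipeline. The four training strings are invented toy lines, not text from the Complete Tang Poems. The 1-to-2 character n-gram range, the nearest-centroid classifier, and all function names are assumptions chosen for illustration. The summary does not name the classifier used, and the domain features it mentions are left out.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=2):
    """Return all character n-grams of length n_min..n_max (range is an assumption)."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def fit_idf(docs):
    """Smoothed inverse document frequency for every n-gram seen in docs."""
    df = Counter()
    for doc in docs:
        df.update(set(char_ngrams(doc)))
    n_docs = len(docs)
    return {g: math.log((1 + n_docs) / (1 + c)) + 1 for g, c in df.items()}


def tfidf(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen n-grams are dropped."""
    tf = Counter(g for g in char_ngrams(doc) if g in idf)
    vec = {g: c * idf[g] for g, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {g: v / norm for g, v in vec.items()}


def fit_centroids(docs, labels, idf):
    """Average the TF-IDF vectors of each region (nearest-centroid stand-in model)."""
    counts = Counter(labels)
    sums = {}
    for doc, label in zip(docs, labels):
        acc = sums.setdefault(label, Counter())
        for g, v in tfidf(doc, idf).items():
            acc[g] += v
    return {label: {g: v / counts[label] for g, v in acc.items()}
            for label, acc in sums.items()}


def predict(doc, idf, centroids):
    """Pick the region whose centroid has the largest dot product with the poem."""
    vec = tfidf(doc, idf)

    def score(label):
        centroid = centroids[label]
        return sum(v * centroid.get(g, 0.0) for g, v in vec.items())

    return max(centroids, key=score)


# Invented toy lines, NOT drawn from the corpus; labels are illustrative only.
train_docs = ["江南春水", "江水南流", "塞北风沙", "北风吹沙"]
train_labels = ["South", "South", "North", "North"]

idf = fit_idf(train_docs)
centroids = fit_centroids(train_docs, train_labels, idf)

print(predict("江南水", idf, centroids))  # South
print(predict("北风沙", idf, centroids))  # North
```

Character n-grams suit Classical Chinese because they need no word segmentation. On real data, the accuracy would have to be measured on held-out poets so that the model cannot simply memorise an individual author's vocabulary.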