The article argues that natural language processing infrastructure for the billion-plus speakers of Indic languages is fragmented because it lacks shared structural foundations. It proposes using the morphosyntactic architecture formalized in Pāṇini's Aṣṭādhyāyī as a unifying computational framework to improve accuracy and data efficiency.

  • The current field organizes tools around individual languages, overlooking the deep regularity shared across Indic languages through Sanskrit convergence.
  • A Pāṇinian framework can pool disparate per-language resources into a single high-resource metalanguage foundation.
  • The authors propose a four-part benchmark suite to render this shared architecture explicit and measurable.
  • The research raises questions about whether neural models trained on these languages independently represent Pāṇini's categories.

This approach aims to make Indic language systems more transferable and data-efficient by providing a unified computational architecture that the field has so far lacked.