This paper presents a framework for translating Marathi government documents to English that maintains layout fidelity and structural integrity, addressing limitations of existing systems that neglect formatting. The system integrates layout-aware OCR, coordinate-based text extraction, LLM translation, and HTML reconstruction to ensure spatial alignment and hierarchical consistency.

  • Integrates layout-aware optical character recognition and coordinate-based text extraction for precise text handling.
  • Uses large language models for translation while enforcing spatial alignment constraints.
  • Reconstructs documents through HTML representations to preserve hierarchical elements and layout.
  • Demonstrates better structural preservation, translation coherence, and terminological consistency than conventional pipelines on real-world Marathi government PDFs.
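
The coordinate-based extraction and HTML reconstruction steps listed above can be illustrated with a minimal sketch. This is not the paper's implementation: the `TextBlock` type, the `render_page` function, the use of absolutely positioned `div` elements, and the point-based units are all assumptions chosen for illustration, and the `translate` argument is a placeholder for whatever LLM call the system actually makes.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable, List


@dataclass
class TextBlock:
    """Hypothetical OCR output: one text region and its bounding box (in points)."""
    text: str
    x: float
    y: float
    width: float
    height: float


def render_page(blocks: List[TextBlock], translate: Callable[[str], str],
                page_width: float, page_height: float) -> str:
    """Translate each block and place it at its original coordinates in HTML."""
    parts = [
        f'<div class="page" style="position:relative;'
        f'width:{page_width}pt;height:{page_height}pt">'
    ]
    # Sort top-to-bottom, then left-to-right, so DOM order follows reading order.
    for b in sorted(blocks, key=lambda b: (b.y, b.x)):
        parts.append(
            f'<div style="position:absolute;left:{b.x}pt;top:{b.y}pt;'
            f'width:{b.width}pt;height:{b.height}pt">'
            f'{escape(translate(b.text))}</div>'  # escape so text cannot break markup
        )
    parts.append('</div>')
    return "\n".join(parts)


# Stand-in for the LLM translation step: a two-entry glossary (assumption).
GLOSSARY = {
    "महाराष्ट्र शासन": "Government of Maharashtra",
    "शासन निर्णय": "Government Resolution",
}

if __name__ == "__main__":
    page = [
        TextBlock("शासन निर्णय", 72.0, 120.0, 200.0, 20.0),
        TextBlock("महाराष्ट्र शासन", 72.0, 60.0, 300.0, 24.0),
    ]
    print(render_page(page, lambda t: GLOSSARY.get(t, t), 595.0, 842.0))
```

Keeping each translated string inside the bounding box of its source block is one simple way to express a spatial alignment constraint; a real system would also have to handle translated text that overflows its box, as well as tables and multi-column regions.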

The framework contributes toward scalable multilingual accessibility solutions for e-governance and administrative document processing by enabling end-to-end document transformation.