Koshur Pixel introduces a synthetic OCR dataset of 613,078 image-text pairs generated from the KS-PRET-5M corpus with SynthOCR-Gen. The dataset applies more than 25 augmentation strategies and covers diverse fonts and text scales, from single words to full-page documents, to support scalable training of Kashmiri OCR systems.
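To make the idea of multi-scale image-text pairs concrete, the following is a minimal sketch of how ground-truth text segments at several scales might be sampled from a corpus and tagged with an augmentation before rendering. It is not SynthOCR-Gen's actual interface: the function `sample_pairs`, the scale names and lengths in `SCALES`, and the labels in `AUGMENTATIONS` are all hypothetical, and the rendering of each segment to an image is left out.

```python
import random

# Hypothetical scales (segment length in whitespace-separated tokens).
# These values are illustrative assumptions, not taken from SynthOCR-Gen.
SCALES = {"word": 1, "line": 8, "paragraph": 40}

# Hypothetical augmentation labels; the real pipeline reports more than 25.
AUGMENTATIONS = ["none", "blur", "noise", "rotation"]


def sample_pairs(corpus_text, n_per_scale, seed=0):
    """Sample text segments at each scale and assign one augmentation label.

    Returns a list of dicts holding the ground-truth text, its scale, and
    the augmentation that a renderer would apply to the resulting image.
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    tokens = corpus_text.split()
    pairs = []
    for scale, length in SCALES.items():
        if len(tokens) < length:
            continue  # corpus too short for this scale
        for _ in range(n_per_scale):
            start = rng.randrange(len(tokens) - length + 1)
            text = " ".join(tokens[start:start + length])
            pairs.append({
                "scale": scale,
                "text": text,
                "augmentation": rng.choice(AUGMENTATIONS),
            })
    return pairs
```

Each returned record would then be passed to a renderer that draws `text` in a chosen font and applies the named augmentation, yielding one image-text pair.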