STAR is a spatio-temporal reward allocation method for text-to-image generation that uses attention maps to dynamically assign advantages across denoising steps. Applied to Stable Diffusion 3.5 Medium, it improves semantic alignment, text rendering, and preference optimization, reaching 0.9759 on GenEval, 0.9757 on OCR, and 23.60 on PickScore.
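
The summary above does not spell out the allocation rule, so the following is only a minimal sketch of what attention-weighted advantage assignment could look like, not STAR's actual formulation. It assumes a single scalar advantage per generated image and one non-negative attention map per denoising step. The function name `allocate_advantage`, the proportional weighting, and the uniform fallback are all hypothetical choices made for illustration.

```python
from typing import List

Grid = List[List[float]]  # one H x W attention map for a single denoising step


def allocate_advantage(advantage: float, attn_maps: List[Grid]) -> List[Grid]:
    """Spread one scalar advantage over denoising steps and spatial positions.

    Hypothetical weighting (an assumption, not taken from the source): each
    weight is proportional to the attention mass at that step and position,
    and the weights average to 1, so the mean allocated advantage equals the
    input advantage.
    """
    # Number of (step, row, column) cells across all maps.
    cells = sum(len(row) for grid in attn_maps for row in grid)
    if cells == 0:
        return []

    total = sum(v for grid in attn_maps for row in grid for v in row)
    if total <= 0.0:
        # No attention signal: fall back to a uniform allocation.
        return [[[advantage for _ in row] for row in grid] for grid in attn_maps]

    # Scale so that the weights (cells * v / total) average to 1.
    scale = advantage * cells / total
    return [[[scale * v for v in row] for row in grid] for grid in attn_maps]


if __name__ == "__main__":
    # Two denoising steps, each with a 2 x 2 attention map (toy values).
    maps = [
        [[1.0, 1.0], [1.0, 1.0]],
        [[3.0, 1.0], [0.0, 0.0]],
    ]
    for step, grid in enumerate(allocate_advantage(2.0, maps)):
        print(step, grid)
```

Under these assumptions, steps and regions that receive more attention get a larger share of the advantage, while the average over all steps and positions stays equal to the original scalar. A real implementation would operate on tensors inside the training loop and might normalize per step or per token instead.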