HPRO: Hierarchical Progressive Reward Optimization for Emotional TTS
The authors propose HPRO, a hierarchical progressive reward optimization framework that enhances emotional expressiveness in LLM-based text-to-speech (TTS) models while preserving linguistic intelligibility. HPRO addresses structural mismatches in existing preference-driven methods in two ways: it isolates content from emotion, and it bridges the gap between sparse rewards and dense generation.