LuxEmo: 21-hour expressive Luxembourgish TTS corpus

The authors introduce LuxEmo, a 21-hour corpus of conversational, expressive speech in Luxembourgish, a low-resource language, covering four emotion categories.

The dataset is derived from Radio Télévision Luxembourg (RTL) youth broadcasts.
Curation uses a semi-automatic workflow with voice activity detection, denoising, language identification, LuxASR-based segmentation, and automatic emotion prediction.
Five expressive TTS systems are benchmarked, covering German-based cross-lingual transfer, multilingual support, adaptation, and non-parametric prosody transfer.
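
The later stages of the curation workflow described above can be pictured as a filter-and-annotate pass over candidate speech segments. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It assumes segments have already come out of voice activity detection and denoising, and it takes the language identifier, the transcriber (standing in for LuxASR), and the emotion predictor as plain callables. The names Segment, curate, and min_duration are hypothetical, as is the 1.0-second threshold. "lb" is the ISO 639-1 code for Luxembourgish.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Segment:
    """One candidate speech region (hypothetical structure for this sketch)."""
    source: str                      # identifier of the originating recording
    start: float                     # start time in seconds
    end: float                       # end time in seconds
    language: Optional[str] = None   # filled in by language identification
    text: Optional[str] = None       # filled in by the ASR stand-in
    emotion: Optional[str] = None    # filled in by the emotion predictor


def curate(
    segments: Iterable[Segment],
    detect_language: Callable[[Segment], str],
    transcribe: Callable[[Segment], str],
    predict_emotion: Callable[[Segment], str],
    target_language: str = "lb",
    min_duration: float = 1.0,       # assumed threshold, not from the source
) -> List[Segment]:
    """Keep target-language segments and attach transcript and emotion label."""
    kept: List[Segment] = []
    for seg in segments:
        # Drop regions too short to be useful (illustrative rule only).
        if seg.end - seg.start < min_duration:
            continue
        # Language identification: discard anything not in the target language.
        lang = detect_language(seg)
        if lang != target_language:
            continue
        # Annotate the surviving segment with a transcript and a predicted emotion.
        kept.append(
            replace(
                seg,
                language=lang,
                text=transcribe(seg),
                emotion=predict_emotion(seg),
            )
        )
    return kept
```

Passing the models in as callables keeps the sketch independent of any particular toolkit. In a semi-automatic workflow, the returned list would then go to human review rather than being used directly.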

The work addresses the underrepresentation of Luxembourgish in speech technology research by providing a validated dataset for expressive text-to-speech development.