Researchers introduce DyadEE, a dataset for detecting emotional entrainment in dyadic speech, and propose TRACE, a window-level framework that models these interactions as ordered sequences of acoustic embeddings. The study demonstrates that incorporating conversational context and relationship information significantly improves detection accuracy.

  • The DyadEE dataset contains both emotionally entrained conversations and synthetic interactions in which entrainment is disrupted by partner swapping and emotion resynthesis.
  • TRACE treats each sample as an interaction trace built from emotion fine-tuned Whisper representations, rather than as pooled utterance-level representations.
  • The model achieves a best accuracy of 97.01% on the DyadEE dataset by leveraging temporal, relationship-aware modeling.
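
The two ideas in the points above, keeping window order instead of pooling and building disrupted samples by swapping partners, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the embeddings are plain lists of floats standing in for emotion fine-tuned Whisper features, and the per-window concatenation of the two speakers is one plausible way to form a trace, not necessarily how TRACE does it. Emotion resynthesis is not covered.

```python
from typing import List, Sequence, Tuple

Window = List[float]           # one acoustic embedding per fixed-length window (placeholder)
Speaker = List[Window]         # one speaker's windows, in temporal order
Dyad = Tuple[Speaker, Speaker]


def interaction_trace(dyad: Dyad) -> List[Window]:
    """Ordered trace: at each window index, concatenate both speakers' embeddings.

    Assumption: the two speakers' windows are already time-aligned.
    """
    a, b = dyad
    n = min(len(a), len(b))
    return [a[t] + b[t] for t in range(n)]


def pooled(dyad: Dyad) -> Window:
    """Contrast case: mean over time, which discards the order of windows."""
    trace = interaction_trace(dyad)
    dim = len(trace[0])
    return [sum(step[i] for step in trace) / len(trace) for i in range(dim)]


def swap_partners(dyads: Sequence[Dyad], shift: int = 1) -> List[Dyad]:
    """Hypothetical negative sampling: pair each first speaker with the second
    speaker of a different conversation, using a cyclic shift."""
    n = len(dyads)
    if n < 2 or shift % n == 0:
        raise ValueError("need at least two dyads and a shift that changes the pairing")
    return [(dyads[i][0], dyads[(i + shift) % n][1]) for i in range(n)]


if __name__ == "__main__":
    d0 = ([[1.0], [2.0]], [[10.0], [20.0]])
    d1 = ([[3.0], [4.0]], [[30.0], [40.0]])

    print(interaction_trace(d0))
    print(pooled(d0))
    print(swap_partners([d0, d1])[0])
```

Reversing the windows of a dyad leaves the pooled vector unchanged but changes the trace, which is the information a sequence model can use and a pooled representation cannot.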