DialogPII: A multilingual dataset of synthetic dialog transcripts to detect personal information
Researchers present DialogPII, a multilingual dataset of synthetic dialog transcripts designed to support the development and evaluation of automatic systems for detecting personally identifiable information. This resource addresses privacy concerns in sensitive domains by providing annotated data across 11 languages and eight interaction scenarios.