This paper introduces the PROTECT-90 dataset, an open reference benchmark of electromagnetic transient (EMT) simulations designed to address the lack of standardized, publicly available high-voltage waveform datasets for power system protection. The release aims to enable transparent and reproducible evaluation of data-driven methods by providing consistent measurements that resemble digital fault recorder output.
- The dataset comprises 9,022 physically consistent short-circuit simulation episodes generated on a standardized 90 kV double-line topology.
- It includes systematically documented domain randomization of grid operating points, line parameters, and fault conditions.
- Synchronized three-phase voltage and current waveforms are recorded at eight measurement locations for each episode.
- Structured, machine-readable metadata describing fault type, location, inception time, and operating conditions is provided.
- All modeling assumptions, parameter ranges, and data-generation procedures are explicitly documented to ensure transparency.
PROTECT-90 establishes a standardized foundation for the reproducible benchmarking of protection-oriented signal processing and learning-based methods by combining physically grounded simulation with open accessibility.
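As an illustration of the episode structure summarized above, the following minimal sketch shows one way a single episode could be represented in memory. It is not the dataset's actual file format or API: the class names, field names, channel ordering, and the example values are all hypothetical. Only the counts (eight measurement locations, three-phase voltage and current) and the metadata categories are taken from the description.

```python
from dataclasses import dataclass
from typing import Dict, List

N_LOCATIONS = 8           # measurement locations per episode (from the description)
PHASES = ("a", "b", "c")  # three-phase system
QUANTITIES = ("v", "i")   # voltage and current


@dataclass
class EpisodeMetadata:
    """Hypothetical metadata record; field names are assumptions."""
    fault_type: str
    fault_location: float            # assumed: relative position along the line
    inception_time_s: float
    operating_conditions: Dict[str, float]


@dataclass
class Episode:
    """Hypothetical container: waveforms[location][channel] -> samples."""
    metadata: EpisodeMetadata
    waveforms: List[List[List[float]]]


def channel_names() -> List[str]:
    """Assumed channel order: three voltages, then three currents."""
    return [f"{q}_{p}" for q in QUANTITIES for p in PHASES]


def validate(episode: Episode) -> int:
    """Check the assumed layout and return the samples per channel."""
    if len(episode.waveforms) != N_LOCATIONS:
        raise ValueError("expected waveforms for 8 measurement locations")
    n_channels = len(channel_names())
    lengths = set()
    for location in episode.waveforms:
        if len(location) != n_channels:
            raise ValueError("expected 6 channels per location")
        lengths.update(len(channel) for channel in location)
    if len(lengths) != 1:
        raise ValueError("channels are not synchronized (unequal lengths)")
    return lengths.pop()


# Toy episode with placeholder values only (not real PROTECT-90 data).
toy = Episode(
    metadata=EpisodeMetadata(
        fault_type="AG",             # hypothetical label
        fault_location=0.5,
        inception_time_s=0.1,
        operating_conditions={"load_mw": 0.0},
    ),
    waveforms=[[[0.0] * 4 for _ in channel_names()] for _ in range(N_LOCATIONS)],
)
n_samples = validate(toy)
```

Under these assumptions, a check such as `validate` catches missing locations or unsynchronized channels before an episode is passed to a signal-processing or learning pipeline.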