This paper introduces the PROTECT-90 dataset, an open electromagnetic transient (EMT)-simulated reference benchmark designed to address the lack of standardized, publicly available high-voltage waveform datasets for power system protection. The release aims to enable transparent and reproducible evaluation of data-driven methods through consistent measurements that resemble digital fault recorder (DFR) records.

  • The dataset comprises 9,022 physically consistent short-circuit simulation episodes generated on a standardized 90 kV double-line topology.
  • It includes systematically documented domain randomization of grid operating points, line parameters, and fault conditions.
  • Synchronized three-phase voltage and current waveforms are recorded at eight measurement locations for each episode.
  • Structured, machine-readable metadata describing fault type, location, inception time, and operating conditions is provided.
  • All modeling assumptions, parameter ranges, and data-generation procedures are explicitly documented to ensure transparency.
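
To make the per-episode structure described above concrete, the following is a minimal sketch of how one episode could be represented and checked in Python. It is an illustration only, not the dataset's published schema: the `Episode` class, the `[location][phase][sample]` layout, the metadata key names, and the sample count in the dummy episode are all assumptions. Only the eight measurement locations, the three phases, and the four metadata categories come from the text.

```python
from dataclasses import dataclass

N_LOCATIONS = 8  # measurement locations per episode (from the text)
N_PHASES = 3     # three-phase voltages and currents (from the text)

# Hypothetical key names mirroring the metadata categories listed above.
REQUIRED_KEYS = {"fault_type", "fault_location", "inception_time",
                 "operating_conditions"}


@dataclass
class Episode:
    """One simulated short-circuit episode (assumed layout)."""
    voltages: list   # indexed as [location][phase][sample]
    currents: list   # same layout as voltages
    metadata: dict

    def validate(self) -> None:
        """Check channel layout, equal channel lengths, and metadata keys."""
        lengths = set()
        for name, wave in (("voltages", self.voltages),
                           ("currents", self.currents)):
            if len(wave) != N_LOCATIONS:
                raise ValueError(f"{name}: expected {N_LOCATIONS} locations")
            for location in wave:
                if len(location) != N_PHASES:
                    raise ValueError(f"{name}: expected {N_PHASES} phases")
                lengths.update(len(channel) for channel in location)
        if len(lengths) != 1:
            raise ValueError("channels do not share one sample count")
        missing = REQUIRED_KEYS - set(self.metadata)
        if missing:
            raise ValueError(f"missing metadata keys: {sorted(missing)}")


def channels_per_episode() -> int:
    """Voltage and current channels across all locations and phases."""
    return N_LOCATIONS * N_PHASES * 2


def make_dummy_episode(n_samples: int) -> Episode:
    """Build an all-zero placeholder episode (n_samples is arbitrary)."""
    def zeros():
        return [[[0.0] * n_samples for _ in range(N_PHASES)]
                for _ in range(N_LOCATIONS)]
    meta = {key: None for key in REQUIRED_KEYS}
    return Episode(voltages=zeros(), currents=zeros(), metadata=meta)


episode = make_dummy_episode(n_samples=16)
episode.validate()
print(channels_per_episode())
```

Under this layout each episode carries 8 × 3 voltage channels and 8 × 3 current channels. A check like `validate` is one way a loader could reject incomplete records before they reach a benchmark pipeline.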

PROTECT-90 establishes a standardized foundation for the reproducible benchmarking of protection-oriented signal processing and learning-based methods by combining physically grounded simulation with open accessibility.