ToxiREX: A Dataset on Toxic REasoning in ConteXt

Researchers introduce ToxiREX, a multilingual dataset designed to capture and explain implicit, context-dependent toxicity in Reddit comment threads. The dataset uses a systematic toxic reasoning schema to provide structured annotations for comments about major global events in six languages.

Includes 125,000 training comments annotated by an LLM and nearly 3,000 test comments annotated by native speakers.
Covers English, Arabic, Turkish, Spanish, German, and Dutch comments linked to specific events such as the 2023 Turkey earthquakes and the Russian invasion of Ukraine.
Provides baseline results from prompted and fine-tuned models, showing that performance exceeds random chance but leaves substantial room for improvement.

ToxiREX is the first dataset to combine multiple languages, conversational context, and implicit toxicity, using a toxic reasoning schema to provide rich, structured annotations.