SOHET introduces a hierarchical transformer architecture that combines event-type-specific tabular encoders with self-supervised pre-training objectives. It outperforms existing methods by 5.8% on Booking.com's fraud detection task, and pre-training yields faster convergence and a further 2.4% gain. On the EBES benchmark, bidirectional SOHET matches or exceeds the best published results on six of the eight tasks.
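
To make the idea of event-type-specific encoders concrete, the following is a minimal sketch and not the SOHET implementation. It assumes each event is a dictionary with a `type` key and a type-dependent set of numeric fields. The event types (`search`, `booking`), the field names, the width `D_MODEL`, and the random linear projections are all hypothetical, and mean pooling is used purely as a stand-in for the sequence-level transformer.

```python
import random

D_MODEL = 4  # hypothetical shared embedding width

# Hypothetical schemas: each event type has its own set of numeric fields.
SCHEMAS = {
    "search": ["num_results", "price_filter"],
    "booking": ["amount", "nights", "lead_time_days"],
}


class TypeEncoder:
    """Linear projection from one event type's fields to D_MODEL dimensions."""

    def __init__(self, fields, rng):
        self.fields = fields
        # One weight row per output dimension; random values stand in for
        # learned parameters.
        self.weights = [[rng.uniform(-1, 1) for _ in fields] for _ in range(D_MODEL)]

    def encode(self, event):
        x = [float(event[f]) for f in self.fields]
        return [sum(w * v for w, v in zip(row, x)) for row in self.weights]


def encode_sequence(events, encoders):
    """Route each event to the encoder for its type: one vector per event."""
    return [encoders[e["type"]].encode(e) for e in events]


def mean_pool(vectors):
    """Placeholder for the sequence-level model: average the event vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(D_MODEL)]


rng = random.Random(0)
encoders = {t: TypeEncoder(fields, rng) for t, fields in SCHEMAS.items()}

events = [
    {"type": "search", "num_results": 25, "price_filter": 1},
    {"type": "booking", "amount": 180.0, "nights": 2, "lead_time_days": 14},
]
event_vectors = encode_sequence(events, encoders)  # lower level: per-event vectors
user_vector = mean_pool(event_vectors)             # upper level: sequence summary
```

The point of the two levels is that events with different schemas end up in a common `D_MODEL`-dimensional space, so a single sequence model can process the mixed stream regardless of which fields each event type carries.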