Dialogue to Detection: A Multimodal Hybrid NLP Pipeline for Insurance Fraud Detection

This article introduces a synthetic multimodal framework that replicates First Notice of Loss (FNOL) conditions for insurance fraud detection, addressing the limitations of existing text-only approaches. The system generates agent-customer dialogue transcripts and two-speaker audio recordings so that linguistic, behavioral, and speaker-based indicators can be combined.

Generates synthetic agent-customer dialogue transcripts and two-speaker audio recordings to replicate FNOL scenarios.
Performs Automatic Speech Recognition (ASR) and diarisation on the generated audio data.
Combines NER, regex-based feature extraction, LLM-RAG retrieval, and speaker embeddings into a rule-based risk score.
Flags narrative reuse, structural inconsistencies, and cross-case voice repetition while balancing sensitivity and false positives.
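
To make the highlights above concrete, the following is a minimal sketch of how regex features, a narrative-similarity value, and speaker embeddings might be combined into a rule-based risk score. It is not the article's implementation: the regex patterns, the point weights, both thresholds, and the function names `cosine` and `risk_score` are hypothetical. The narrative similarity is assumed to come from an upstream retrieval step, and the voice embeddings from a speaker-embedding model applied after diarisation.

```python
import math
import re

# Hypothetical patterns, for illustration only.
POLICY_RE = re.compile(r"\b[A-Z]{2}\d{6}\b")    # assumed policy-number format
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")  # ISO-style dates

# Hypothetical rule weights in points; a triggered rule adds its points.
WEIGHTS = {
    "conflicting_dates": 20,
    "missing_policy_number": 10,
    "narrative_reuse": 30,
    "voice_repetition": 40,
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def risk_score(transcript, voice_emb, prior_voice_embs, narrative_sim,
               voice_threshold=0.9, narrative_threshold=0.8):
    """Return (score, flags) for one claim.

    narrative_sim: similarity of this narrative to the closest prior case,
    assumed to be supplied by a retrieval step.
    prior_voice_embs: caller embeddings from earlier cases.
    """
    flags = {}
    # Crude structural check: more than one distinct date in the transcript.
    flags["conflicting_dates"] = len(set(DATE_RE.findall(transcript))) > 1
    flags["missing_policy_number"] = POLICY_RE.search(transcript) is None
    flags["narrative_reuse"] = narrative_sim >= narrative_threshold
    # Cross-case voice repetition: closest match among prior callers.
    best_voice = max((cosine(voice_emb, e) for e in prior_voice_embs),
                     default=0.0)
    flags["voice_repetition"] = best_voice >= voice_threshold
    score = sum(WEIGHTS[name] for name, hit in flags.items() if hit)
    return score, flags


if __name__ == "__main__":
    text = ("My policy is AB123456. The accident happened on 2024-03-01, "
            "no wait, 2024-03-04.")
    score, flags = risk_score(text, [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                              narrative_sim=0.85)
    print(score, flags)
```

Raising or lowering the two thresholds is one simple way to trade sensitivity against false positives in a scheme like this; the values shown are placeholders, not tuned settings.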

The framework offers a reproducible baseline for fraud detection that extends beyond text-only methods, with dataset validation demonstrating stability and transfer potential.