This article introduces a synthetic multimodal framework designed to replicate First Notice of Loss (FNOL) conditions for insurance fraud detection, addressing the limitations of existing text-only approaches. The system generates agent-customer dialogue transcripts and two-speaker audio so that linguistic, behavioral, and speaker-based indicators can be integrated.
- Generates synthetic agent-customer dialogue transcripts and two-speaker audio to replicate FNOL scenarios.
- Performs Automatic Speech Recognition (ASR) and diarisation on the generated audio data.
- Combines named entity recognition (NER), regex-based feature extraction, LLM-based retrieval-augmented generation (RAG), and speaker embeddings into a rule-based risk score.
- Flags narrative reuse, structural inconsistencies, and cross-case voice repetition while balancing sensitivity and false positives.
The framework offers a reproducible baseline for fraud detection that extends beyond text-only methods, with dataset validation demonstrating stability and transfer potential.
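
As a rough illustration of how indicators such as narrative reuse, structural inconsistencies, and cross-case voice repetition might be combined into a rule-based risk score, consider the minimal sketch below. It is not the framework's implementation: the `ClaimSignals` fields, the `risk_score` function, the weights, and both thresholds are hypothetical values chosen for illustration, and speaker embeddings are represented as plain lists of floats compared by cosine similarity.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class ClaimSignals:
    # Hypothetical per-claim indicators, assumed to be computed upstream.
    narrative_similarity: float   # 0..1 similarity to the closest prior narrative
    inconsistency_count: int      # structural inconsistencies found in the transcript
    voice_embedding: List[float]  # speaker embedding of the customer voice


# Assumed thresholds and weights; in practice these would be tuned to
# balance sensitivity against false positives.
NARRATIVE_THRESHOLD = 0.85
VOICE_THRESHOLD = 0.80
WEIGHTS = {"narrative_reuse": 0.4, "inconsistency": 0.3, "voice_repeat": 0.3}


def risk_score(
    signals: ClaimSignals,
    prior_voice_embeddings: Sequence[Sequence[float]],
) -> Tuple[float, List[str]]:
    """Return the summed weight of all rules that fire, plus the rule names."""
    fired: List[str] = []
    if signals.narrative_similarity >= NARRATIVE_THRESHOLD:
        fired.append("narrative_reuse")
    if signals.inconsistency_count > 0:
        fired.append("inconsistency")
    # Cross-case voice repetition: the same voice appears in an earlier claim.
    if any(
        cosine_similarity(signals.voice_embedding, prior) >= VOICE_THRESHOLD
        for prior in prior_voice_embeddings
    ):
        fired.append("voice_repeat")
    return sum(WEIGHTS[name] for name in fired), fired


if __name__ == "__main__":
    claim = ClaimSignals(
        narrative_similarity=0.9,
        inconsistency_count=0,
        voice_embedding=[1.0, 0.0],
    )
    score, reasons = risk_score(claim, [[0.9, 0.1], [0.0, 1.0]])
    print(round(score, 2), reasons)
```

Returning the names of the fired rules alongside the score keeps each flag explainable, which is one reason a rule-based combination can be easier to audit than a learned classifier.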