Study Finds AI Still Fails to Detect Legal Citation Hallucinations

A new study reveals over 1,000 legal filings contain fabricated citations, with the number rising annually. Benchmarking five AI models shows improved performance, with GPT-5 achieving 82.8% recall and 60.5% F1 in agentic settings, though all models struggle with subtle errors and face resource constraints due to limited information access.