An Empirical Analysis of Factual Errors in Human-Written Text and its Application
This study addresses the neglect of factual error detection in human-written text by distilling a taxonomy of errors from newspaper article corrections, revealing categories like kanji misconversions that are absent in current hallucination benchmarks. The authors evaluate vanilla large language models on synthesized test cases and real corrections to assess their performance on this specific task.