Managing Map Cardinality in Automatic Disease Classification Mapping

The article introduces a novel method for automatically mapping between disease classification systems, such as ICD-9-CM and ICD-10-CM. It targets a limitation of existing embedding-based approaches, which often overlook one-to-many scenarios, where a single source code corresponds to several target codes. Using a blocking-and-matching pipeline inspired by entity resolution, the authors apply large language models to identify the valid mappings within each block of candidates.

A blocking step generates a block of candidate matches, and an LLM then performs the matching within each block.
The method balances the trade-offs between precision, recall, and mapping coverage that are inherent in threshold-based and top-K methods.
Empirical results show higher precision with comparable recall and broader coverage across ICD-9-CM↔ICD-10-CM and ICD-10-AM↔ICD-11 pairs.
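
The blocking-and-matching idea summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: blocking is assumed here to be a cosine-similarity ranking over code embeddings, the LLM is replaced by a stub callable, and every name (`build_block`, `map_code`, `matcher`, the `SRC-1` and `TGT-*` codes, the toy vectors) is hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_block(source_vec: Vector,
                target_vecs: Dict[str, Vector],
                block_size: int) -> List[str]:
    """Blocking (assumed form): keep the block_size most similar target codes."""
    ranked = sorted(target_vecs,
                    key=lambda code: cosine(source_vec, target_vecs[code]),
                    reverse=True)
    return ranked[:block_size]


def map_code(source_code: str,
             source_vec: Vector,
             target_vecs: Dict[str, Vector],
             matcher: Callable[[str, str], bool],
             block_size: int = 3) -> List[str]:
    """Matching: keep every candidate in the block that the matcher accepts.

    The result may hold zero, one, or several codes, so the map cardinality
    is decided per source code rather than fixed in advance.
    """
    block = build_block(source_vec, target_vecs, block_size)
    return [target for target in block if matcher(source_code, target)]


# Hypothetical toy data: placeholder codes and 2-D "embeddings".
target_vecs = {
    "TGT-A": [1.0, 0.0],
    "TGT-B": [0.9, 0.1],
    "TGT-C": [0.0, 1.0],
    "TGT-D": [0.5, 0.5],
}
source_vec = [1.0, 0.05]

# Stub standing in for the LLM judgment; a real matcher would prompt a model
# with the descriptions of both codes and parse its yes/no answer.
accepted_pairs = {("SRC-1", "TGT-A"), ("SRC-1", "TGT-B")}


def stub_matcher(source_code: str, target_code: str) -> bool:
    return (source_code, target_code) in accepted_pairs


mapped = map_code("SRC-1", source_vec, target_vecs, stub_matcher)
print(mapped)  # ['TGT-A', 'TGT-B']
```

On this toy data the block is `TGT-A`, `TGT-B`, `TGT-D`, and the matcher keeps two of the three. A top-1 rule would return only `TGT-A`, while a similarity threshold of 0.7 would also admit `TGT-D` (similarity about 0.74), which illustrates the trade-off that a fixed K or a fixed threshold imposes.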

By providing more accurate and comprehensive mappings between disease classification codes, this approach helps users integrate health data and conduct longitudinal analyses.