arXiv cs.CL · 5h ago

Managing Map Cardinality in Automatic Disease Classification Mapping

The article introduces a method for automatically mapping between disease classification systems, such as ICD-9-CM and ICD-10-CM. Existing embedding-based approaches often overlook complex one-to-many scenarios, in which a single source code corresponds to several target codes. To address this, the authors use a blocking-and-matching pipeline inspired by entity resolution, in which large language models identify the valid mappings within each candidate block.
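The blocking-and-matching idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the token-overlap blocking rule, the function names, the toy codes and descriptions, and the `is_valid` callback, which merely stands in for the large language model judgment the summary mentions. It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase a description and split it into a set of words."""
    return set(text.lower().replace(",", " ").split())


def build_blocks(
    source: Dict[str, str], targets: Dict[str, str], min_shared: int = 1
) -> Dict[str, List[str]]:
    """Blocking step: for each source code, keep only the target codes
    whose descriptions share at least `min_shared` words with it."""
    blocks: Dict[str, List[str]] = {}
    for s_code, s_desc in source.items():
        s_tok = tokens(s_desc)
        blocks[s_code] = [
            t_code
            for t_code, t_desc in targets.items()
            if len(s_tok & tokens(t_desc)) >= min_shared
        ]
    return blocks


def match_blocks(
    source: Dict[str, str],
    targets: Dict[str, str],
    blocks: Dict[str, List[str]],
    is_valid: Callable[[str, str], bool],
) -> Dict[str, List[str]]:
    """Matching step: ask `is_valid` about every candidate in a block.
    A source code may keep zero, one, or many targets."""
    return {
        s_code: [t for t in candidates if is_valid(source[s_code], targets[t])]
        for s_code, candidates in blocks.items()
    }


# Toy data with made-up code identifiers (not real ICD codes).
source = {"S1": "diabetes mellitus"}
targets = {
    "T1": "type 1 diabetes mellitus",
    "T2": "type 2 diabetes mellitus",
    "T3": "fracture of femur",
}

# Hypothetical stand-in for the LLM: accept a target whose description
# contains every word of the source description.
def subset_matcher(s_desc: str, t_desc: str) -> bool:
    return tokens(s_desc) <= tokens(t_desc)

blocks = build_blocks(source, targets)
mappings = match_blocks(source, targets, blocks, subset_matcher)
print(mappings)
```

Because the matcher is called once per candidate rather than forced to pick a single best target, the one-to-many case falls out naturally: the toy source code keeps both of its plausible targets.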

arXiv cs.CL · 6h ago

Systematic Benchmark of Lightweight Hallucination Detection Across QA, Dialogue, and Summarisation

This paper benchmarks five lightweight, CPU-feasible hallucination detection methods to provide practical alternatives for resource-constrained researchers who cannot use GPU-intensive or proprietary solutions. The study evaluates ROUGE-L, semantic similarity, BERTScore, a FEVER-trained DeBERTa NLI detector, and an ensemble of similarity and NLI across the HaluEval benchmark's question answering, dialogue, and summarisation tasks.
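As a rough illustration of the simplest method in that list, the sketch below computes a ROUGE-L F1 score from the longest common subsequence of words and flags an answer as suspect when its overlap with the supporting text is low. The whitespace tokenisation, the function names, and the 0.3 threshold are assumptions chosen for the example; the paper's actual configuration is not given in this summary.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence, using one rolling row."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 over lowercase, whitespace-separated words."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def is_suspect(answer: str, knowledge: str, threshold: float = 0.3) -> bool:
    """Flag an answer whose overlap with the knowledge text is below the
    threshold. The threshold value is an illustrative assumption."""
    return rouge_l_f1(answer, knowledge) < threshold


knowledge = "the cat sat on the mat"
print(rouge_l_f1("the cat sat", knowledge))
print(is_suspect("dogs bark loudly at night", knowledge))
```

A score like this runs on a CPU with no model download, which is the appeal for the resource-constrained setting the paper targets, but it only measures surface overlap and cannot tell a faithful paraphrase from a fabrication.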

arXiv cs.CL · 6h ago

Revealing the Technology Development of Natural Language Processing: A Scientific Entity-Centric Perspective

This study analyzes the development of technologies in Natural Language Processing (NLP) from an entity-centric perspective: it extracts methods, datasets, metrics, and tools, and measures their impact via co-occurrence networks. The research finds that pre-trained language models such as BERT and Transformer have become mainstream, and that the average number of entities per paper is increasing, which indicates a growing knowledge burden for researchers.