This study analyzes technology development in Natural Language Processing (NLP) from an entity-centric perspective: it extracts methods, datasets, metrics, and tools from papers and measures their impact via co-occurrence networks. The results show that entities tied to pre-trained language models, such as BERT and Transformer, have become mainstream, and that the average number of entities per paper keeps increasing, which indicates a growing knowledge burden for researchers.
- The study extracts technology-related entities from NLP articles, normalizes them with a semi-automatic approach, and then calculates a z-score for each entity from co-occurrence networks to measure its impact.
- Methods dominate among the 179 high-impact entities identified, with pre-trained language models such as BERT and Transformer becoming mainstream in recent years.
- Unlike method entities, the Wikipedia dataset and the BLEU metric have continued to rise in impact over the long term.
- New high-impact technologies are surging in popularity, and researchers are adopting them faster than they adopted earlier technologies.
This approach provides a more accurate analysis of technology development trends than coarse-grained thematic perspectives, highlighting how pre-trained models have injected new vitality into NLP innovation.
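
To make the impact measure concrete, here is a minimal sketch of a z-score computed over an entity co-occurrence network. It is an illustration under stated assumptions, not the study's actual pipeline: it assumes that two entities co-occur when they appear in the same paper, that an entity's raw impact is its weighted degree in the network, and that the z-score standardizes this degree across all entities. The function names and the four toy papers are hypothetical.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_weights(papers):
    """Count, for each unordered entity pair, the papers that mention both."""
    edges = Counter()
    for entities in papers:
        # Deduplicate within a paper and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the co-occurrence weights on every edge touching each entity."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


def z_scores(degree):
    """Standardize degrees across entities (assumed impact measure)."""
    values = list(degree.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return {entity: 0.0 for entity in degree}
    return {entity: (d - mu) / sigma for entity, d in degree.items()}


# Hypothetical toy corpus: each inner list holds one paper's normalized entities.
papers = [
    ["BERT", "Transformer", "BLEU"],
    ["BERT", "Wikipedia"],
    ["Transformer", "BLEU", "Wikipedia"],
    ["BERT", "Transformer"],
]

edges = cooccurrence_weights(papers)
degree = weighted_degree(edges)
scores = z_scores(degree)

for entity, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{entity}: degree={degree[entity]}, z={score:.2f}")
```

Under this sketch, an entity with a z-score well above zero co-occurs with other entities more often than is typical for the corpus. Repeating the calculation on per-year slices of the corpus would be one way to trace how an entity's impact rises or falls over time.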