The article compares two approaches to selecting representative samples of news articles about disasters: a top-down approach that queries news databases using existing disaster inventories, and a bottom-up approach that uses NLP methods to cluster texts by their temporal and spatial features.

  • The study evaluates both methods on a dataset of German news articles about landslides worldwide.
  • It discusses how event coverage varies depending on whether articles are selected by querying with an inventory or by clustering on text features.
  • This research design decision shapes the resulting news sample and, in turn, how useful that sample is for studies of media coverage inequality, disaster monitoring, and inventory enrichment.
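
The difference between the two selection strategies can be illustrated with a minimal Python sketch. This is not the article's implementation: the toy articles, the inventory entry, the function names `select_top_down` and `cluster_bottom_up`, and the `window_days` and `gap_days` parameters are all hypothetical. It also assumes that a place name and publication date have already been extracted for each article, and exact place matching stands in for whatever NLP-based spatial features a real pipeline would use.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Article:
    title: str
    published: date
    place: str  # assumed to be extracted from the text beforehand


@dataclass(frozen=True)
class InventoryEvent:
    place: str
    occurred: date


def select_top_down(articles, inventory, window_days=7):
    """Keep articles matching a known inventory event by place and date window."""
    window = timedelta(days=window_days)
    selected = []
    for article in articles:
        for event in inventory:
            same_place = article.place == event.place
            in_window = event.occurred <= article.published <= event.occurred + window
            if same_place and in_window:
                selected.append(article)
                break  # one matching event is enough
    return selected


def cluster_bottom_up(articles, gap_days=3):
    """Group articles by place, then split where publication dates are far apart."""
    by_place = {}
    for article in sorted(articles, key=lambda a: a.published):
        by_place.setdefault(article.place, []).append(article)

    clusters = []
    for place_articles in by_place.values():
        current = [place_articles[0]]
        for article in place_articles[1:]:
            if (article.published - current[-1].published).days <= gap_days:
                current.append(article)
            else:
                clusters.append(current)
                current = [article]
        clusters.append(current)
    return clusters


# Invented toy data: two landslide events, only one of them in the inventory.
articles = [
    Article("Landslide hits village", date(2020, 1, 2), "Nepal"),
    Article("Rescue work continues", date(2020, 1, 3), "Nepal"),
    Article("Mudslide blocks road", date(2020, 3, 10), "Peru"),
    Article("Slope collapses after rain", date(2020, 3, 11), "Peru"),
]
inventory = [InventoryEvent("Nepal", date(2020, 1, 1))]

top_down = select_top_down(articles, inventory)
clusters = cluster_bottom_up(articles)
```

In this toy setup the top-down query returns only the articles tied to the inventoried event, while the bottom-up clustering also surfaces the event that the inventory lacks. That is one simple way the two designs can produce different samples from the same corpus.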