Each point in the graph represents one article. The closer two points are, the more the articles have in common.
News comes in various flavours and forms. When attempting to monitor what is being written about in the world, it can be difficult to see beyond your own interests and filter bubble. To reduce the impact of our own bias when creating and evaluating classifiers, we are avid users of the NLP technique known as topic modelling.
Topic modelling takes an unstructured group of articles, analyses them, and outputs the underlying topics. We can apply this to the news stream to pick up content we would otherwise miss. This is particularly useful in the ever-changing news domain, where stories evolve over time and new trends emerge and fade.
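To make the idea concrete, here is a minimal sketch of grouping articles without any predefined labels. It is a toy stand-in, not the models we use in production: it represents each article as word counts and greedily assigns it to the most similar existing group by cosine similarity. The function names, the stopword list, and the `threshold` value are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "for", "is", "as"}


def vectorize(text):
    """Turn a text into a bag-of-words count vector without stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(articles, threshold=0.2):
    """Greedy single-pass clustering.

    Each article joins the most similar existing cluster if the similarity
    reaches the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for index, text in enumerate(articles):
        vector = vectorize(text)
        best, best_similarity = None, threshold
        for candidate in clusters:
            similarity = cosine(vector, candidate["centroid"])
            if similarity >= best_similarity:
                best, best_similarity = candidate, similarity
        if best is None:
            clusters.append({"centroid": Counter(vector), "members": [index]})
        else:
            best["centroid"].update(vector)  # accumulate word counts
            best["members"].append(index)
    return clusters


def top_terms(cluster_entry, n=3):
    """The most frequent words in a cluster, as a rough topic description."""
    return [word for word, _ in cluster_entry["centroid"].most_common(n)]
```

Given a handful of headlines about interest rates and a handful about a cup final, `cluster` would be expected to separate them into two groups, and `top_terms` gives a rough description of each. Proper topic models replace the word counts with richer representations, but the input and output have the same shape: unlabeled articles in, groups with descriptive terms out.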
By comparing our classification results with the topic modelling output, we can find trending news that our existing categories do not yet cover and create new contextual categories for it.
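One way such a comparison could look is sketched below. It assumes every article already has a topic id from the topic model and, optionally, a category from a classifier, and it flags sizeable topics where most articles received no category. The function name, the input layout, and the `min_size` and `max_coverage` thresholds are hypothetical choices for illustration, not a description of our pipeline.

```python
from collections import Counter


def uncovered_topics(topic_of, category_of, min_size=3, max_coverage=0.5):
    """Find topics that the classifier mostly fails to categorize.

    topic_of:    dict mapping article id -> topic id
    category_of: dict mapping article id -> category label, or None
                 (a missing article id is treated as unclassified)

    Returns a list of (topic, article_count, coverage) tuples, largest
    topics first, where coverage is the share of classified articles.
    """
    size = Counter(topic_of.values())
    covered = Counter(
        topic
        for article, topic in topic_of.items()
        if category_of.get(article) is not None
    )
    flagged = []
    for topic, count in size.items():
        coverage = covered[topic] / count
        if count >= min_size and coverage <= max_coverage:
            flagged.append((topic, count, coverage))
    # Largest topics first; among equal sizes, least covered first.
    return sorted(flagged, key=lambda row: (-row[1], row[2]))
```

Topics returned by this function are candidates for a new category: they are large enough to matter, yet the current classifiers say little about them.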
Our research in topic modelling and news clustering has been published at EMNLP 2022 and other venues, and we are committed to continuing research in this area, specifically addressing industry needs.
Check out the 3D graph of English news topics here!
Anton Eklund
Ph.D. Candidate at Aeterna