
Automatic classification systems used in contextual advertising are growing more powerful and complex, yet they can still struggle with context, nuance, and matching human perception. This article mentions “Ibrahimovic, Messi, and Ronaldo”: should it now be classified as Football? A machine might categorize an article that mentions an athlete’s name as a sports article without understanding the broader content, for example when that athlete has started their own clothing line, or when their name is only being used to make a point in an AI blog post. This is just one of many examples that illustrate why AI-driven categorization, and AI evaluation itself, requires human oversight.
Simply trusting modern AI systems to perform well on new content because they perform well on older news is not adequate for us at Aeterna Labs. The world changes fast, with new politicians, athletes, and events quickly influencing society. This is where the work of Anton Eklund comes in. He is an industrial PhD student working with Aeterna Labs, and his research on human-centered evaluation is transforming how we approach measuring and characterizing contexts. With efficient human-centered evaluation methods, we can keep up with the news cycle and validate classifiers on new content, so that they reliably pick out the content they are meant to.
On April 3rd, Anton Eklund will defend his thesis, Evaluation of Document Clusters Through Human Interpretation, at Umeå University. This milestone underscores the importance of continuous research in AI evaluation and the growing recognition of human-AI collaboration in machine learning.
According to Anton, the key takeaways from the thesis for contextual advertising are:
- There is empirical evidence that modern language models actually group semantically similar articles based on their content. For example, they will not group the article you are reading right now with Football news articles.
- It is crucial to characterize contexts beyond just the main theme. For example, the usually positive Music/Concerts classifier took on an unexpectedly darker undertone when the news was dominated by a terror threat that forced Taylor Swift to cancel a concert.
You can read more about the research in the University news: https://www.umu.se/en/news/ciphepeoples-interpretations-central-in-new-framework-for-evaluating-ai_12068551/