Benchmarking collections of scientific journals
- Voice Interfaces / Natural Language Processing
The authors propose a new technique for pairwise comparison of collections of scientific articles based on topic modelling.
The developed methodology is called Comparative Topic Analysis (CTA).
CTA yields not only a quantitative assessment of the similarity between two collections, but also the structural differences between them, presented both numerically and through visualization tools developed by the authors.
The study also compares existing approaches to topic modelling as applied to the task of comparing collections of scientific papers.
Probabilistic and generative topic models are considered.
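To make the idea of a quantitative similarity score plus structural differences concrete, here is a minimal sketch. It is not the authors' CTA method: it assumes a topic model has already produced a share of each topic for both collections, and it swaps in Jensen-Shannon divergence as one common way to compare two such distributions. The topic labels, the numbers, and all function names are hypothetical and chosen only for illustration.

```python
import math


def normalize(weights):
    """Scale non-negative topic weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def topic_differences(p, q, labels):
    """Per-topic share differences (p - q), largest absolute gap first."""
    diffs = [(label, pi - qi) for label, pi, qi in zip(labels, p, q)]
    return sorted(diffs, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical topic weights for two journal collections (assumed model output).
labels = ["speech recognition", "machine translation", "information retrieval"]
collection_a = normalize([50, 30, 20])
collection_b = normalize([20, 30, 50])

# One number for overall similarity (an assumed convention: 1 - JS divergence).
similarity = 1.0 - js_divergence(collection_a, collection_b)
print(f"similarity: {similarity:.3f}")

# Structural view: which topics account for the gap between the collections.
for label, diff in topic_differences(collection_a, collection_b, labels):
    print(f"{label:>22}: {diff:+.2f}")
```

The single score answers "how close are the two collections", while the ranked per-topic gaps show where they diverge; the latter is the kind of structural information a visualization could be built on.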
Target audience: Applied Data Scientists.
Engineer with 20 years of experience in IT companies. Graduated from the Moscow Engineering Physics Institute, Faculty of Theoretical and Experimental Physics. Holds a Candidate of Technical Sciences degree.