Automatic Evaluation Metrics for Topic Modeling
Automatic evaluation metrics assess the topic coherence and topic diversity of trained topic models.
Topic Coherence Measures
- NPMI (Lau et al., 2014)
  - Normalized Pointwise Mutual Information
  - Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality (EACL 2014)
- WE (Fang et al., 2016)
  - Word Embedding (WE) similarity
  - Using Word Embedding to Evaluate the Coherence of Topics from Twitter Data (SIGIR 2016)
Both metrics are computed as the average pairwise NPMI score and word-embedding similarity, respectively, between the top-10 words of each topic.
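The NPMI side of this computation can be sketched as follows. This is a minimal illustration using document-level co-occurrence for brevity; Lau et al. (2014) estimate the probabilities with a sliding window over a reference corpus, and the function name is illustrative:

```python
from itertools import combinations
from math import log

def npmi_coherence(top_words, docs):
    """Average pairwise NPMI over a topic's top words.

    `docs` is a list of tokenized documents; co-occurrence is counted
    at the document level here (a sliding window is used in practice).
    """
    n_docs = len(docs)
    doc_sets = [set(d) for d in docs]

    def p(*words):
        # fraction of documents containing all given words
        return sum(all(w in ds for w in words) for ds in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = p(w1), p(w2), p(w1, w2)
        if p12 == 0:
            scores.append(-1.0)   # convention: never co-occur -> NPMI = -1
        elif p12 == 1.0:
            scores.append(1.0)    # convention: always co-occur -> NPMI = 1
        else:
            scores.append(log(p12 / (p1 * p2)) / -log(p12))
    return sum(scores) / len(scores)
```

Scores lie in [-1, 1]; averaging over all top-word pairs of a topic, and then over topics, gives the model-level coherence.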
- C_v (Röder et al., 2015)
  - Based on a sliding window, a one-set segmentation of the top words, and an indirect confirmation measure that combines normalized pointwise mutual information (NPMI) with cosine similarity
Topic Diversity Measures
- Topic Uniqueness (TU) (Dieng et al., 2020)
  - Topic Modeling in Embedding Spaces (TACL 2020)
- Inverted Rank-Biased Overlap (I-RBO) (Terragni et al., 2021; Bianchi et al., 2021a)
  - Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence (ACL 2021)
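A minimal sketch of one common TU-style diversity score: for each top word of each topic, take the reciprocal of the number of topics whose top-word list contains it, then average. The function name and exact formulation are illustrative, not the canonical implementation:

```python
from collections import Counter

def topic_uniqueness(topics):
    """Topic-uniqueness sketch over lists of top words, one list per topic.

    Returns a value in (0, 1]; 1.0 means all topic word lists are disjoint,
    and the score drops as topics repeat each other's top words.
    """
    # how many topics list each word (sets guard against in-topic duplicates)
    counts = Counter(w for topic in topics for w in set(topic))
    per_topic = [
        sum(1.0 / counts[w] for w in topic) / len(topic)
        for topic in topics
    ]
    return sum(per_topic) / len(per_topic)
```

For example, two topics with identical top words score 0.5, while fully disjoint topics score 1.0.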
Top-Purity and normalized mutual information (Top-NMI) (Nguyen et al., 2018) are used to evaluate the alignment between topics and document labels. Both range from 0 to 1, and a higher score reflects better clustering performance.
- Nguyen et al., 2018 - Improving Topic Models with Latent Feature Word Representations
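Purity and NMI can be computed from gold labels and cluster assignments as sketched below (standard definitions; arithmetic-mean normalization is assumed for NMI, and library implementations such as scikit-learn's may be used instead):

```python
from collections import Counter
from math import log

def purity(labels, clusters):
    """Assign each cluster its majority gold label; return the fraction
    of documents that match their cluster's majority label."""
    by_cluster = {}
    for lab, clu in zip(labels, clusters):
        by_cluster.setdefault(clu, []).append(lab)
    correct = sum(Counter(ls).most_common(1)[0][1] for ls in by_cluster.values())
    return correct / len(labels)

def nmi(labels, clusters):
    """Normalized mutual information, normalized by the arithmetic mean
    of the two entropies."""
    n = len(labels)
    pl, pc = Counter(labels), Counter(clusters)
    joint = Counter(zip(labels, clusters))
    # MI = sum p(l,c) * log( p(l,c) / (p(l) p(c)) )
    mi = sum(c / n * log(c * n / (pl[l] * pc[k])) for (l, k), c in joint.items())

    def entropy(counts):
        return -sum(c / n * log(c / n) for c in counts.values())

    hl, hc = entropy(pl), entropy(pc)
    if hl == 0 or hc == 0:
        return 1.0 if hl == hc else 0.0
    return mi / ((hl + hc) / 2)
```

Both scores are invariant to cluster relabeling: a perfect clustering with swapped cluster ids still scores 1.0.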
We further apply the KMeans algorithm to the topic proportions z and use the resulting document clusters to report purity (Km-Purity) and NMI (Km-NMI) (Zhao et al., 2020a).
- Zhao et al., 2020a - Neural Topic Model via Optimal Transport (ICLR 2021)
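The clustering step can be sketched with scikit-learn's KMeans; the function name, toy proportions, and cluster count below are illustrative, and the resulting assignments would then be scored with purity/NMI against document labels:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_topic_proportions(theta, n_clusters, seed=0):
    """Cluster documents by their topic-proportion vectors.

    `theta` is a (num_docs, num_topics) array of per-document topic
    proportions; returns one cluster id per document.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(theta)

# toy example: 4 docs over 2 topics; the first two docs are dominated by
# topic 0 and the last two by topic 1, so they should split into 2 clusters
theta = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
assignments = cluster_topic_proportions(theta, n_clusters=2)
```

Km-Purity/Km-NMI then compare `assignments` with the gold document labels, measuring how well the learned topic proportions separate the label classes.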
- Reference: Is Automated Topic Model Evaluation Broken? The Incoherence of Coherence (NeurIPS 2021)