Automatic Evaluation Metrics for Topic Modeling

2024. 1. 10. 16:14·Study/Topic Modeling

 

 

Automatic evaluation metrics for topic models mainly measure two things: topic coherence and topic diversity.

 

 

 Topic Coherence Measures 

  • NPMI (Lau et al., 2014) 

- Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality (EACL, 2014)

- Normalized Pointwise Mutual Information

 

  • WE (Fang et al., 2016)

- Word Embedding (WE) 

- Using Word Embedding to Evaluate the Coherence of Topics from Twitter Data (SIGIR, 2016)

 

NPMI and WE compute the pairwise NPMI score and the word embedding similarity, respectively, between the top-10 words of each topic.
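A minimal sketch of both pairwise scores, assuming document-level co-occurrence counts for NPMI and a plain dict of word vectors for WE. The function names, the `embeddings` dict, and the handling of the edge cases are illustrative assumptions, not the exact setup of the cited papers.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Mean pairwise NPMI of a topic's top words (document co-occurrence)."""
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # assumed convention: never co-occur
        elif p12 == 1.0:
            scores.append(1.0)   # assumed convention: avoids dividing by log(1)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def we_coherence(top_words, embeddings):
    """Mean pairwise cosine similarity of the top words' embeddings."""
    pairs = list(combinations(top_words, 2))
    return sum(cosine(embeddings[a], embeddings[b]) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): two small "themes" of documents.
docs = [["cat", "dog"], ["cat", "dog"], ["car", "road"], ["car", "road"]]
vectors = {"cat": [1.0, 0.0], "dog": [1.0, 0.0], "car": [0.0, 1.0]}

print(npmi_coherence(["cat", "dog"], docs))
print(we_coherence(["cat", "car"], vectors))
```

In practice each function would be called with the top-10 words of every topic, and the per-topic scores averaged over all topics.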

 

 

 

 

  • C_v (Röder et al., 2015), another coherence measure, is based on a sliding window, a one-set segmentation of the top words, and an indirect confirmation measure that uses normalized pointwise mutual information (NPMI) and cosine similarity.

 

Topic Diversity Measures

 

 

  • Topic Uniqueness (TU) (Dieng et al., 2020)

- Topic Modeling in Embedding Spaces (TACL, 2020)

 

  • Inversed Rank-Biased Overlap (I-RBO) (Terragni et al., 2021; Bianchi et al., 2021a)

- Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence (ACL 2021)
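The two diversity scores above can be sketched roughly as follows. The first function counts the fraction of unique words among the top-k words of all topics, in the spirit of Dieng et al. (2020); `k=25` is an assumed default. The second uses a simplified, truncated and rescaled rank-biased overlap, which is an assumption made for brevity and not the exact RBO formulation of the cited papers. All function names are hypothetical.

```python
from itertools import combinations


def topic_diversity(topics, k=25):
    """Fraction of unique words among the top-k words of all topics."""
    top_words = [w for topic in topics for w in topic[:k]]
    return len(set(top_words)) / len(top_words)


def rbo_truncated(list_a, list_b, p=0.9):
    """Simplified RBO: weighted mean of prefix overlap at depths 1..k.

    The weights p**(d-1) are rescaled to sum to 1, so identical lists
    score 1 and disjoint lists score 0 (assumed simplification).
    """
    k = min(len(list_a), len(list_b))
    seen_a, seen_b = set(), set()
    weighted, total_weight = 0.0, 0.0
    for d in range(1, k + 1):
        seen_a.add(list_a[d - 1])
        seen_b.add(list_b[d - 1])
        agreement = len(seen_a & seen_b) / d  # overlap of the two prefixes
        weight = p ** (d - 1)                 # top ranks count more
        weighted += weight * agreement
        total_weight += weight
    return weighted / total_weight


def inverted_rbo(topics, k=10, p=0.9):
    """1 minus the mean pairwise RBO between topics' top-k word lists."""
    pairs = list(combinations(topics, 2))
    mean_rbo = sum(rbo_truncated(a[:k], b[:k], p) for a, b in pairs) / len(pairs)
    return 1.0 - mean_rbo


# Toy topics (hypothetical): identical topics vs. fully disjoint topics.
same = [["a", "b", "c"], ["a", "b", "c"]]
disjoint = [["a", "b"], ["c", "d"]]

print(topic_diversity(same), inverted_rbo(same))
print(topic_diversity(disjoint), inverted_rbo(disjoint))
```

Unlike the unique-word fraction, the rank-weighted score penalizes overlap at the top of the word lists more heavily than overlap further down.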

 

 

 

Top-Purity and top normalized mutual information (Top-NMI) (Nguyen et al., 2018) are used as metrics to evaluate alignment. Both range from 0 to 1, and a higher score reflects better clustering performance.

 

- Nguyen et al., 2018 - Improving Topic Models with Latent Feature Word Representations
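A minimal sketch of the two scores, assuming the common reading in which each document is assigned to its highest-probability topic and the resulting clusters are compared with ground-truth labels. The function names are hypothetical, and the arithmetic-mean normalization of NMI is an assumption, since other normalizations exist.

```python
import math
from collections import Counter


def top_topic(theta):
    """Index of the highest-probability topic for each document."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]


def purity(clusters, labels):
    """Share of documents that carry the majority label of their cluster."""
    by_cluster = {}
    for c, l in zip(clusters, labels):
        by_cluster.setdefault(c, Counter())[l] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(labels)


def entropy(assignments):
    """Shannon entropy (natural log) of a list of assignments."""
    n = len(assignments)
    return -sum((c / n) * math.log(c / n) for c in Counter(assignments).values())


def nmi(clusters, labels):
    """Mutual information divided by the mean of the two entropies."""
    n = len(labels)
    joint = Counter(zip(clusters, labels))
    n_c, n_l = Counter(clusters), Counter(labels)
    mi = sum(
        (n_ij / n) * math.log(n_ij * n / (n_c[c] * n_l[l]))
        for (c, l), n_ij in joint.items()
    )
    denom = (entropy(clusters) + entropy(labels)) / 2
    return mi / denom if denom > 0 else 1.0


# Toy topic proportions (hypothetical): 4 documents, 2 topics.
theta = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
labels = ["x", "x", "y", "y"]

clusters = top_topic(theta)
print(purity(clusters, labels), nmi(clusters, labels))
```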

 

 

 

KMeans is further applied to the topic proportions z, and the resulting document clusters are used to report purity (Km-Purity) and NMI (Km-NMI) (Zhao et al., 2020a).

 

 

- Zhao et al., 2020a - Neural Topic Model via Optimal Transport (ICLR, 2021)
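The clustering step can be sketched as below. To stay dependency-free, a tiny Lloyd's-algorithm k-means stands in for a library KMeans implementation; its deterministic initialization from the first k documents is an assumption made for illustration, and the function names are hypothetical. Only purity is shown; NMI would be computed on the same cluster assignments.

```python
from collections import Counter


def kmeans(points, k, n_iter=20):
    """Minimal Lloyd's algorithm; centroids start at the first k points.

    This is a stand-in for a library KMeans and is sensitive to the
    order of the input because of the naive initialization.
    """
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def km_purity(theta, labels, k):
    """Purity of the k-means clusters of the topic proportions."""
    clusters = kmeans(theta, k)
    by_cluster = {}
    for c, l in zip(clusters, labels):
        by_cluster.setdefault(c, Counter())[l] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(labels)


# Toy topic proportions z (hypothetical): 4 documents, 2 topics.
z = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
doc_labels = ["x", "y", "x", "y"]

print(kmeans(z, 2))
print(km_purity(z, doc_labels, 2))
```

Compared with the top-topic assignment, clustering the full proportion vectors uses all components of z rather than only the largest one.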

 

 

- Reference: Is Automated Topic Model Evaluation Broken? The Incoherence of Coherence (NeurIPS 2021)

Seung-won Seo