# Topic Modeling

Topic modeling provides methods for automatically organizing, understanding, searching, and summarizing large electronic archives.

It can help with the following:

  • discovering the hidden themes in the collection.
  • classifying the documents into the discovered themes.
  • using the classification to organize/summarize/search the documents.

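# Latent Dirichlet Allocation (LDA)

LDA models each document as a mixture of topics and each topic as a distribution over words. Fitting the model assigns documents to the topics they most likely draw from.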

Figure: different LDA distributions.

The triangle in the middle best represents the distribution of words in a document for a topic.

Three parameters determine the LDA distribution.

LDA uses two distributions: one over topics and one over words.

To generate a word, first select a topic from the topic distribution, then select a word from that topic's word distribution.
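
The two-step selection described above can be sketched in a few lines of Python. This is a minimal illustration, not a fitted model: the vocabulary, the two topic-word tables, and the `alpha` concentration values are hypothetical and chosen by hand, and the helper names `sample_dirichlet` and `generate_document` are made up for this example. It uses only the standard library, drawing the per-document topic proportions from a Dirichlet distribution via normalized Gamma samples.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, topic_word_probs, vocab, alpha, rng):
    """Generate (topic, word) pairs: select a topic, then select a word."""
    # Per-document topic proportions (the distribution over topics).
    doc_topic_probs = sample_dirichlet(alpha, rng)
    topics = range(len(topic_word_probs))
    pairs = []
    for _ in range(n_words):
        # Step 1: select a topic from the document's topic distribution.
        topic = rng.choices(topics, weights=doc_topic_probs, k=1)[0]
        # Step 2: select a word from that topic's word distribution.
        word = rng.choices(vocab, weights=topic_word_probs[topic], k=1)[0]
        pairs.append((topic, word))
    return doc_topic_probs, pairs


# Hypothetical toy vocabulary and two hand-made topics (illustrative values only).
VOCAB = ["goal", "match", "team", "vote", "party", "election"]
TOPIC_WORD_PROBS = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # topic 0: sports-like words
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],  # topic 1: politics-like words
]

rng = random.Random(0)  # fixed seed so repeated runs give the same document
doc_topic_probs, pairs = generate_document(
    8, TOPIC_WORD_PROBS, VOCAB, alpha=[0.5, 0.5], rng=rng
)
print("topic proportions:", doc_topic_probs)
print("generated (topic, word) pairs:", pairs)
```

In practice the topic-word tables are not written by hand. Fitting LDA runs this process in reverse, inferring both distributions from the observed words in the collection.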