Like Latent Semantic Analysis (LSA) and probabilistic LSA (pLSA) – see my previous post “LSA and pLSA in Python“, Latent Dirichlet Allocation (LDA) is an algorithm which, given a collection of documents and nothing more (no supervision needed), can uncover the “topics” expressed by documents in that collection. LDA can be seen as a Bayesian extension of pLSA. As Blei, the author of LDA, points ou

