The document discusses topic modeling and classification of short texts. It describes using Latent Dirichlet Allocation (LDA) to extract hidden topics from a large universal text corpus consisting of Wikipedia and MEDLINE articles. These topics are then used as features for a maximum entropy classifier to categorize short texts like tweets and web snippets. Parallelized LDA is implemented using th
![Tokyotextmining#1 kaneyama genta](https://cdn-ak-scissors.b.st-hatena.com/image/square/3b89a579e93bb84693e57e7756e28f0e18338e0c/height=288;version=1;width=512/https%3A%2F%2Fcdn.slidesharecdn.com%2Fss_thumbnails%2Ftokyotextmining1kaneyama-100704100501-phpapp01-thumbnail.jpg%3Fwidth%3D640%26height%3D640%26fit%3Dbounds)
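
The pipeline summarized above (learn hidden topics on a large corpus, then add them as features for short texts) can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not the implementation from the slides: the four-document toy corpus stands in for Wikipedia and MEDLINE, and the function names (`train_lda`, `infer_topics`, `augment_features`), the hyperparameters, and the 0.2 topic threshold are all hypothetical. The maximum entropy classifier itself is omitted; the resulting feature dictionaries could be passed to any multinomial logistic regression trainer.

```python
import random
from collections import Counter


def train_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents (toy scale)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # token counts per topic
    doc_topic = [[0] * num_topics for _ in docs]         # topic counts per document
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return topic_word, topic_total, vocab_size, beta


def infer_topics(tokens, model, alpha=0.1, iters=50):
    """Estimate topic proportions for a short text, keeping the trained topics fixed."""
    topic_word, topic_total, vocab_size, beta = model
    num_topics = len(topic_total)
    # Words never seen in the training corpus carry no topic information here.
    known = [w for w in tokens if any(tw[w] for tw in topic_word)]
    theta = [1.0 / num_topics] * num_topics
    for _ in range(iters):
        counts = [alpha] * num_topics
        for w in known:
            post = [
                theta[k] * (topic_word[k][w] + beta)
                / (topic_total[k] + vocab_size * beta)
                for k in range(num_topics)
            ]
            norm = sum(post)
            for k in range(num_topics):
                counts[k] += post[k] / norm
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta


def augment_features(tokens, theta, threshold=0.2):
    """Combine bag-of-words features with the dominant hidden topics."""
    feats = {f"word:{w}": float(c) for w, c in Counter(tokens).items()}
    for k, p in enumerate(theta):
        if p >= threshold:  # assumed cutoff: keep only topics with enough weight
            feats[f"topic:{k}"] = p
    return feats


# Toy stand-in for the universal corpus.
corpus = [
    "gene protein cell dna protein".split(),
    "cell dna gene disease protein".split(),
    "election vote party government vote".split(),
    "government party policy election".split(),
]
model = train_lda(corpus, num_topics=2)

short_text = "protein dna study".split()
theta = infer_topics(short_text, model)
feats = augment_features(short_text, theta)
```

The point of the augmentation step is that a short text such as a tweet shares few words with the labeled training examples, while its inferred topic features can still overlap with theirs.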