Online Latent Dirichlet Allocation with Infinite Vocabulary

Ke Zhai, zhaike@cs.umd.edu
Department of Computer Science, University of Maryland, College Park, MD USA

Jordan Boyd-Graber, jbg@umiacs.umd.edu
iSchool and UMIACS, University of Maryland, College Park, MD USA

Abstract

Topic models based on latent Dirichlet allocation (LDA) assume a predefined vocabulary. This is reasonable in batch setting