In part 2 of the word2vec tutorial (here’s part 1), I’ll cover a few additional modifications to the basic skip-gram model which are important for actually making it feasible to train. When you read the tutorial on the skip-gram model for Word2Vec, you may have noticed something: it’s a huge neural network! In the example I gave, we had word vectors with 300 components, and a vocabulary of 10,000 words.
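
To make that scale concrete, here’s a quick back-of-the-envelope sketch in Python, using the 300-component, 10,000-word figures above. The skip-gram network has two weight matrices (one feeding the hidden layer and one feeding the output layer), so each works out to 300 × 10,000 = 3 million weights:

```python
# Back-of-the-envelope parameter count for the skip-gram example above
# (assumes the 300-dimensional vectors and 10,000-word vocabulary from the text).
vocab_size = 10_000  # words in the vocabulary
embed_dim = 300      # components per word vector

# Two weight matrices: hidden layer (vocab_size x embed_dim)
# and output layer (embed_dim x vocab_size).
hidden_weights = vocab_size * embed_dim
output_weights = embed_dim * vocab_size

print(f"Hidden layer weights: {hidden_weights:,}")              # 3,000,000
print(f"Output layer weights: {output_weights:,}")              # 3,000,000
print(f"Total weights: {hidden_weights + output_weights:,}")    # 6,000,000
```

Six million weights is a lot to update on every training sample, which is exactly why the modifications covered in this part matter.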