Want to create an interactive transcript for this episode?
Podcast: Data Skeptic
Episode: word2vec
Description: Word2vec is an unsupervised machine learning model which is able to capture semantic information from the text it is trained on. The model is based on neural networks. Several large organizations like Google and Facebook have trained word embeddings (the result of word2vec) on large corpora and shared them for others to use. The key algorithmic ideas involved in word2vec is the continuous bag of words model (CBOW). In this episode, Kyle uses excerpts from the 1983 cinematic masterpiece War Games, and challenges Linhda to guess a word Kyle leaves out of the transcript. This...