1- Intro and text classification [6]
- Main approaches in NLP
- Text preprocessing
- Feature extraction from text
- Linear models for sentiment analysis
- Hashing trick in spam filtering (see the sketch after this list)
- Neural networks for words
- Neural networks for characters
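The hashing trick can be sketched in a few lines: tokens are hashed straight into a fixed-size feature vector, so no vocabulary has to be stored. A minimal sketch, assuming a toy spam/ham corpus and md5-based bucketing; the bucket count is an illustrative value, not a recommendation:

```python
# Minimal sketch of the hashing trick for spam features: tokens are hashed
# straight into a fixed-size vector, so no vocabulary has to be stored.
# The corpus and bucket count are toy values chosen for illustration.
import hashlib
import numpy as np

N_BUCKETS = 2 ** 10  # feature vector size; real systems use far more buckets

def hashed_features(text):
    x = np.zeros(N_BUCKETS)
    for token in text.lower().split():
        # A stable hash (unlike Python's built-in hash) gives reproducible buckets.
        h = int(hashlib.md5(token.encode()).hexdigest(), 16) % N_BUCKETS
        x[h] += 1.0
    return x

spam = hashed_features("win money now click here")
ham = hashed_features("meeting moved to tuesday morning")
print(spam.sum(), ham.sum())  # 5.0 5.0 -> counts survive, vocabulary doesn't
```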
2- NLP with Classification and Vector Spaces
2-1- Sentiment Analysis with Logistic Regression [1]
- Supervised ML & Sentiment Analysis
- Vocabulary & Feature Extraction
- Negative and Positive Frequencies
- Feature Extraction with Frequencies
- Logistic Regression
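Putting the frequency-based feature extraction and logistic regression items together, here is a minimal sketch on a toy corpus; the whitespace tokenizer and the training loop are illustrative stand-ins, not the exact course implementation:

```python
# Frequency features [bias, positive counts, negative counts] plus a
# logistic regression trained by plain gradient descent, on toy data.
import numpy as np
from collections import defaultdict

corpus = [("great movie i loved it", 1),
          ("loved the plot great acting", 1),
          ("boring movie i hated it", 0),
          ("hated the plot terrible acting", 0)]

# Count how often each word appears with each sentiment label.
freqs = defaultdict(int)
for text, label in corpus:
    for word in text.split():
        freqs[(word, label)] += 1

def extract_features(text):
    """Map a text to [bias, sum of positive counts, sum of negative counts]."""
    x = np.zeros(3)
    x[0] = 1.0  # bias term
    for word in text.split():
        x[1] += freqs[(word, 1)]
        x[2] += freqs[(word, 0)]
    return x

X = np.array([extract_features(t) for t, _ in corpus])
y = np.array([label for _, label in corpus], dtype=float)

theta = np.zeros(3)
lr = 0.1
for _ in range(1000):
    h = 1.0 / (1.0 + np.exp(-X @ theta))   # sigmoid predictions
    theta -= lr * X.T @ (h - y) / len(y)   # batch gradient step

print(extract_features("great acting") @ theta > 0)  # True -> positive
```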
2-2- Sentiment Analysis with Naïve Bayes [1]
- Bayes’ Rule
- Naïve Bayes (see the sketch after this list)
- Laplacian Smoothing
- Log Likelihood
- Error Analysis
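A minimal Naïve Bayes sketch on the same kind of toy corpus, showing Laplacian smoothing and the log-likelihood score; all data are invented for illustration:

```python
# Naïve Bayes with Laplacian (add-one) smoothing: classify by the sign of
# the log prior plus the sum of per-word log-likelihood ratios.
import math
from collections import Counter

pos_docs = ["great movie i loved it", "loved the plot great acting"]
neg_docs = ["boring movie i hated it", "hated the plot terrible acting"]

pos_counts = Counter(w for d in pos_docs for w in d.split())
neg_counts = Counter(w for d in neg_docs for w in d.split())
vocab = set(pos_counts) | set(neg_counts)
V = len(vocab)
n_pos, n_neg = sum(pos_counts.values()), sum(neg_counts.values())

def loglikelihood(word):
    # Smoothing keeps unseen words from zeroing out the whole product.
    p_w_pos = (pos_counts[word] + 1) / (n_pos + V)
    p_w_neg = (neg_counts[word] + 1) / (n_neg + V)
    return math.log(p_w_pos / p_w_neg)

def score(text):
    logprior = math.log(len(pos_docs) / len(neg_docs))  # here log(1) = 0
    return logprior + sum(loglikelihood(w) for w in text.split() if w in vocab)

print(score("great acting") > 0)  # True -> predicted positive
```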
2-3- Vector Space Models [1]
- Vector Space Models
- Word by Word and Word by Doc. (co-occurrence count matrices)
- Euclidean Distance
- Cosine Similarity (see the sketch after this list)
- Manipulating Words in Vector Spaces
- PCA Algorithm
- Word analogies [6]
- Topic modeling [6]
- PLSA [6]
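Cosine similarity and the classic analogy-by-vector-arithmetic trick can be sketched with invented 3-d vectors; real embeddings (e.g. GloVe) have hundreds of dimensions:

```python
# Cosine similarity and a word-analogy lookup over toy vectors.
import numpy as np

emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.5, 0.9, 0.0]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "apple": np.array([0.1, 0.2, 0.1]),  # distractor word
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Analogy: king - man + woman should land closest to queen.
target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(emb[w], target))
print(best)  # queen
```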
2-4- Machine Translation and Document Search [1]
- Transforming word vectors
- K-nearest neighbors
- Hash tables and hash functions
- Locality sensitive hashing (see the sketch after this list)
- Multiple Planes
- Approximate nearest neighbors
- Searching documents
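A minimal sketch of locality sensitive hashing with random planes, the idea behind the multiple-planes and approximate-nearest-neighbor items above; dimensions, plane count, and vectors are toy values:

```python
# LSH with random planes: each plane contributes one bit of the hash,
# so nearby vectors tend to land in the same bucket.
import numpy as np

rng = np.random.default_rng(0)
DIM, N_PLANES = 5, 3
planes = rng.normal(size=(N_PLANES, DIM))  # one random hyperplane per row

def hash_vector(v):
    # Sign of the dot product with each plane -> one bit per plane.
    bits = (planes @ v >= 0).astype(int)
    return int("".join(map(str, bits)), 2)  # bucket id in [0, 2**N_PLANES)

doc = rng.normal(size=DIM)
near = doc + 0.01 * rng.normal(size=DIM)   # small perturbation of doc
far = rng.normal(size=DIM)

print(hash_vector(doc) == hash_vector(near))  # very likely True
print(hash_vector(doc), hash_vector(far))     # often different buckets
```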
3- NLP with Probabilistic Models
3-1- Autocorrect [2]
- Minimum edit distance algorithm
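A minimal dynamic-programming sketch of minimum edit distance, using one common cost convention (insert 1, delete 1, replace 2):

```python
# Bottom-up edit distance: D[i][j] is the cheapest way to turn the first
# i characters of source into the first j characters of target.
def min_edit_distance(source, target, ins=1, dele=1, rep=2):
    m, n = len(source), len(target)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + dele           # delete everything from source
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + ins            # insert everything into target
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            r = 0 if source[i - 1] == target[j - 1] else rep
            D[i][j] = min(D[i - 1][j] + dele,  # delete
                          D[i][j - 1] + ins,   # insert
                          D[i - 1][j - 1] + r) # replace or match
    return D[m][n]

print(min_edit_distance("play", "stay"))  # 4: two replacements at cost 2
```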
3-2- Part of Speech Tagging and Hidden Markov Models [2]
- Markov Chains
- Markov Chains and POS Tags
- Hidden Markov Models
- Calculating Probabilities
- Populating the Transition Matrix
- Populating the Emission Matrix
- The Viterbi Algorithm
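A compact Viterbi sketch over a toy two-tag HMM; the transition and emission probabilities are invented for illustration, not trained values:

```python
# Viterbi over log probabilities: forward pass keeps the best score per
# tag at each step, backpointers recover the best tag sequence.
import numpy as np

tags = ["NN", "VB"]
log_init = np.log([0.6, 0.4])            # P(tag at t=0)
log_trans = np.log([[0.7, 0.3],          # P(next tag | current tag)
                    [0.6, 0.4]])
words = ["watch", "fish"]                # ambiguous words as observations
log_emit = np.log([[0.5, 0.5],           # P(word | NN)
                   [0.4, 0.6]])          # P(word | VB)

def viterbi(obs):
    T, N = len(obs), len(tags)
    best = np.zeros((T, N))              # best log prob ending in each tag
    back = np.zeros((T, N), dtype=int)   # backpointers for the path
    best[0] = log_init + log_emit[:, obs[0]]
    for t in range(1, T):
        scores = best[t - 1][:, None] + log_trans + log_emit[:, obs[t]]
        back[t] = scores.argmax(axis=0)
        best[t] = scores.max(axis=0)
    path = [int(best[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return [tags[i] for i in reversed(path)]

print(viterbi([0, 1]))  # ['NN', 'NN'] for these toy numbers
```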
3-3- Autocomplete and Language Models [2]
- N-Grams: Overview
- N-grams and Probabilities
- Sequence Probabilities
- Starting and Ending Sentences
- The N-gram Language Model (see the sketch after this list)
- Language Model Evaluation
- Out of Vocabulary Words
- Smoothing
- MEMMs, CRFs and other sequential models for Named Entity Recognition [6]
- Neural Language Models [6]
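A tiny bigram language model sketch with start/end tokens, add-k smoothing, and perplexity as the evaluation metric; the corpus and k are toy choices:

```python
# Bigram LM: smoothed conditional probabilities plus perplexity.
import math
from collections import Counter

sentences = [["i", "like", "tea"], ["i", "like", "coffee"], ["you", "like", "tea"]]
k = 1.0  # add-k smoothing constant

unigrams, bigrams = Counter(), Counter()
for s in sentences:
    toks = ["<s>"] + s + ["</s>"]        # explicit sentence start/end tokens
    unigrams.update(toks)
    bigrams.update(zip(toks, toks[1:]))
vocab_size = len(unigrams)

def p(word, prev):
    # Add-k smoothed bigram probability P(word | prev).
    return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * vocab_size)

def perplexity(sentence):
    toks = ["<s>"] + sentence + ["</s>"]
    logp = sum(math.log(p(w, prev)) for prev, w in zip(toks, toks[1:]))
    return math.exp(-logp / (len(toks) - 1))

print(p("like", "i"))   # seen bigram -> relatively high
print(p("tea", "i"))    # unseen bigram -> small but nonzero
print(perplexity(["i", "like", "tea"]))
```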
3-4- Word embeddings with neural networks [2]
- Basic Word Representations
- Word Embeddings
- Continuous Bag-of-Words Model (see the sketch after this list)
- Cleaning and Tokenization
- Sliding Window of Words in Python
- Transforming Words into Vectors
- CBOW Model
- Extracting Word Embedding Vectors
- Evaluating Word Embeddings: Intrinsic Evaluation
- Evaluating Word Embeddings: Extrinsic Evaluation
- Word2Vec [5]
- GloVe word vectors [5]
- Negative Sampling [5]
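A sketch of the sliding window and a single untrained CBOW forward pass in plain numpy; the corpus, window size, embedding dimension, and random weights are all illustrative:

```python
# Sliding-window (context, center) pairs and one CBOW forward pass.
import numpy as np

corpus = "i am happy because i am learning".split()
C = 2  # half-window: C words on each side of the center word

vocab = sorted(set(corpus))
word2idx = {w: i for i, w in enumerate(vocab)}
V, N = len(vocab), 4  # vocabulary size and embedding dimension

def windows(tokens, C):
    for i in range(C, len(tokens) - C):
        context = tokens[i - C:i] + tokens[i + 1:i + C + 1]
        yield context, tokens[i]

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(N, V)), rng.normal(size=(V, N))

for context, center in windows(corpus, C):
    # Average the one-hot context vectors, project, and score the vocabulary.
    x = np.zeros(V)
    for w in context:
        x[word2idx[w]] += 1.0 / (2 * C)
    h = W1 @ x                            # hidden layer: mean context embedding
    z = W2 @ h                            # vocabulary scores (logits)
    y_hat = np.exp(z) / np.exp(z).sum()   # softmax over the vocabulary
    print(center, "->", vocab[int(y_hat.argmax())])  # untrained: a random guess
```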
4- NLP with Sequence Models [3]
4-1- Neural Networks for Sentiment Analysis [3]
- Neural Networks for Sentiment Analysis
- Trax: Neural Networks
- Dense and ReLU Layers
- Serial Layer
- Training
4-2- Recurrent Neural Networks for Language Modeling [3]
- Traditional Language models
- Recurrent Neural Networks (see the sketch after this list)
- Gated Recurrent Units
- Deep and Bi-directional RNNs
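A minimal sketch of the recurrence itself: one forward pass of a vanilla RNN cell, with toy sizes and random weights (no Trax, no training):

```python
# Vanilla RNN forward pass: the same weights are reused at every step and
# the hidden state carries context forward through the sequence.
import numpy as np

rng = np.random.default_rng(0)
x_dim, h_dim, T = 3, 4, 5
Wx = rng.normal(size=(h_dim, x_dim))    # input-to-hidden weights
Wh = rng.normal(size=(h_dim, h_dim))    # hidden-to-hidden weights
b = np.zeros(h_dim)

h = np.zeros(h_dim)                     # initial hidden state
for t in range(T):
    x_t = rng.normal(size=x_dim)        # stand-in for an embedded token
    h = np.tanh(Wx @ x_t + Wh @ h + b)  # new state from input + old state
print(h.round(3))                       # final state summarizes the sequence
```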
4-3- LSTMs and Named Entity Recognition [3]
- RNNs and Vanishing Gradients
- LSTM
- Named Entity Recognition
4-4- Siamese Networks [3]
- Siamese Networks
- Triplets (see the sketch after this list)
- One Shot Learning
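A sketch of the triplet loss on toy encodings: pull the anchor toward the positive, push it from the negative by at least a margin. Vectors and the margin value are illustrative:

```python
# Triplet loss over cosine similarities of anchor/positive/negative encodings.
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.25):
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    # Zero once the positive beats the negative by more than the margin.
    return max(0.0, margin - cos(anchor, positive) + cos(anchor, negative))

a = np.array([1.0, 0.0, 0.5])   # anchor question encoding
p = np.array([0.9, 0.1, 0.6])   # duplicate question (positive)
n = np.array([0.0, 1.0, 0.2])   # unrelated question (negative)
print(triplet_loss(a, p, n))    # 0.0: already separated by more than the margin
```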
5- Natural Language Processing with Attention Models [4]
5-1- Attention mechanism [5]
- Beam Search (see the sketch after this list)
- BLEU Score
- Attention Model
- Speech recognition
- Trigger Word Detection
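A minimal beam search sketch over an invented next-token probability table: keep the B highest-scoring partial sequences at each step instead of only the single best. A real decoder scores continuations with a trained model instead of a lookup table:

```python
# Beam search with beam width B over a toy conditional distribution.
import math

probs = {                      # P(next token | previous token)
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "a":   {"dog": 0.7, "cat": 0.2, "</s>": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def beam_search(B=2, max_len=4):
    beams = [(["<s>"], 0.0)]                      # (sequence, log probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == "</s>":
                candidates.append((seq, score))   # finished: carry forward
                continue
            for tok, p in probs[seq[-1]].items():
                candidates.append((seq + [tok], score + math.log(p)))
        # Keep only the B best partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:B]
    return beams

for seq, score in beam_search():
    print(" ".join(seq), round(math.exp(score), 3))
```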
5-2- Neural Machine Translation [4]
- Seq2seq
- Alignment
- Encoder-decoder architecture [6]
- Attention
- Setup for Machine Translation
- Training an NMT with Attention
- Evaluation for Machine Translation
- Sampling and Decoding
5-3- Text Summarization [4]
- Transformers vs RNNs
- Dot-Product Attention (see the sketch after this list)
- Causal Attention
- Multi-head Attention
- Transformer Decoder
- Transformer Summarizer
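A sketch of scaled dot-product attention with a causal mask, the core operation of the Transformer decoder: position i may attend only to positions at or before i. Shapes and random inputs are toy values, and multi-head splitting is omitted:

```python
# Single-head causal scaled dot-product attention in plain numpy.
import numpy as np

def causal_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # similarity of queries and keys
    mask = np.triu(np.ones_like(scores), k=1)  # 1s above the diagonal = future
    scores = np.where(mask == 1, -1e9, scores) # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
L, d = 4, 8                                    # sequence length, model dim
Q = K = V = rng.normal(size=(L, d))            # self-attention on one sequence
out = causal_attention(Q, K, V)
print(out.shape)                               # (4, 8)
```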
5-4- Question Answering [4]
- Transfer Learning in NLP
- ELMo, GPT, BERT, T5
- Bidirectional Encoder Representations from Transformers (BERT)
- Transformer: T5
- Multi-Task Training Strategy
- GLUE Benchmark
- Question Answering
5-5- Chatbot [4]
- Tasks with Long Sequences
- Transformer Complexity
- LSH Attention
- Motivation for Reversible Layers: Memory!
- Reversible Residual Layers
- Reformer