AIE1007: Natural Language Processing

What is this course about?

Recent advances in natural language processing (NLP) have produced systems that can translate text, answer questions, and even hold spoken conversations with us. This course introduces students to the basics of NLP, covering standard frameworks for dealing with natural language as well as algorithms and techniques for solving various NLP problems, including recent deep learning approaches. Topics include language modeling, representation learning, text classification, sequence tagging, machine translation, Transformers, and more.

Grading

  • Assignments (40%): There will be four assignments, each with both written and programming parts. Each assignment is centered on an application and will also deepen your understanding of the theoretical concepts.
    • Assignment 1: language models, text classification (10%)
    • Assignment 2: word embeddings, sequence modeling (10%)
    • Assignment 3: recurrent neural networks, feedforward neural networks (10%)
    • Assignment 4: machine translation, Transformers (10%)
  • Midterm exam (25%): The midterm will test your knowledge and problem-solving skills.
  • Final project (35%): The final project offers you a chance to apply your newly acquired skills to an in-depth application. You are required to turn in a project proposal and complete a paper written in the style of a conference (e.g., ACL) submission. There will also be project presentations at the end of the semester.
  • Extra credit (5%): Awarded for participation in class and on Ed Discussion. Extra credit cannot raise your overall score above 100%.

Prerequisites

  • Required: AIE1001, AIE1006, and knowledge of probability, linear algebra, and multivariate calculus.
  • Proficiency in Python: programming assignments and the final project will require the use of Python, NumPy, and PyTorch (see the short sketch after this list for a sense of the expected level).
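
As a rough indication of the expected Python/NumPy/PyTorch proficiency, here is a minimal, self-contained sketch of the kind of code the programming assignments involve. The data, model, and hyperparameters below are illustrative assumptions, not taken from any actual assignment.

    # Minimal sketch: train a linear (softmax) classifier with PyTorch.
    # Everything here (sizes, learning rate, number of epochs) is illustrative.
    import numpy as np
    import torch
    import torch.nn as nn

    # Toy data: 100 "documents" as 50-dimensional bag-of-words vectors,
    # each labeled with one of 3 classes.
    rng = np.random.default_rng(0)
    X = torch.tensor(rng.random((100, 50)), dtype=torch.float32)
    y = torch.tensor(rng.integers(0, 3, size=100), dtype=torch.long)

    # Linear classifier trained with cross-entropy loss and SGD.
    model = nn.Linear(50, 3)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(10):
        optimizer.zero_grad()
        logits = model(X)          # shape: (100, 3)
        loss = loss_fn(logits, y)  # averaged over the batch
        loss.backward()
        optimizer.step()

    print(f"final training loss: {loss.item():.3f}")

If reading and writing code like this feels comfortable, you have the programming background the course assumes.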

Reading

There is no required textbook for this class, and you should be able to learn everything from the lectures and assignments. However, if you would like to pursue more advanced topics or get another perspective on the same material, the readings in the schedule below refer to Dan Jurafsky and James H. Martin, Speech and Language Processing (3rd edition draft), abbreviated "J & M", which can be read free online.

Schedule

Week 1
  • Introduction to NLP
    1. Advances in natural language processing
    2. Human Language Understanding & Reasoning
  • n-gram language models
    J & M 3.1-3.5
Week 2
  • Text classification
    Naive Bayes: J & M 4.1-4.6
    Logistic regression: J & M 5.1-5.8
  • Word embeddings 1
    J & M 6.2-6.4, 6.6
Week 3
  • Word embeddings 2
    1. J & M 6.8, 6.10-6.12
    2. Efficient Estimation of Word Representations in Vector Space (original word2vec paper)
    3. Distributed representations of words and phrases and their compositionality (negative sampling)
  • Sequence models 1
    1. J & M 8.1-8.4
    2. Michael Collins's notes on HMMs
Week 4
  • Sequence models 2
    1. Michael Collins's notes on MEMMs and CRFs
    2. Michael Collins's notes on CRFs
  • Neural networks for NLP
    J & M 7.3-7.5
Week 5
  • Recurrent neural networks 1
    1. J & M 9.1-9.3
    2. The Unreasonable Effectiveness of Recurrent Neural Networks
  • Recurrent neural networks 2
    1. J & M 9.5
    2. Understanding LSTM Networks
    3. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (GRUs)
    4. Simple Recurrent Units for Highly Parallelizable Recurrence (SRUs)
Week 6
  • Midterm
Week 7
Week 8
  • Machine translation
    1. J & M 13.2
    2. Michael Collins's notes on IBM models 1 and 2
    3. Sequence to Sequence Learning with Neural Networks
    4. Machine Translation. From the Cold War to Deep Learning.
  • Seq2seq models + attention
    1. Neural Machine Translation by Jointly Learning to Align and Translate
    2. Effective Approaches to Attention-based Neural Machine Translation
    3. Blog post: Visualizing A Neural Machine Translation Model
    4. Blog post: Sequence to Sequence (seq2seq) and Attention
Week 10
  • Transformers 1
    1. J & M 10.1
    2. Attention Is All You Need
    3. The Annotated Transformer
    4. The Illustrated Transformer
  • Transformers 2
    1. Efficient Transformers: A Survey
    2. Vision Transformer
Week 11
  • Contextualized representations and pre-training
    1. Deep contextualized word representations (ELMo)
    2. Improving Language Understanding by Generative Pre-Training (GPT)
    3. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
    4. The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)
  • Project proposals due
  • Large language models
    1. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA)
    2. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5)
    3. Language Models are Few-Shot Learners (GPT-3)
    4. Training language models to follow instructions with human feedback (InstructGPT)
    5. GPT-4 Technical Report (GPT-4)
Week 12
  • Part 1: Safe and accessible generative AI for all (Ameet Deshpande, Vishvak Murahari)
  • Part 2: Responsible LLM Development (Peter Henderson)
    1. Toxicity in ChatGPT
    2. Bias runs deep: Implicit reasoning biases in Persona-assigned LLMs
    3. Anthropomorphization of AI: Opportunities and Risks
Week 13
  • Part 1: Driving Progress in Language Modeling Through Better Benchmarking (SWE-bench + SWE-agent) (Ofir Press)
  • Part 2: Challenges in Human Preference Elicitation and Modeling (Tanya Goyal)
    InstructGPT
Week 14
  • No lecture (final project feedback sessions)
  • Project poster presentations
  • Final project report due
