Language model

A language model is a probability distribution over sequences of words: given any word sequence, it assigns that sequence a probability. Language models are trained on text corpora and underpin many applications in computational linguistics. Classic n-gram language models rely on the Markov assumption, conditioning each word only on a fixed window of preceding words, while large language models built from deep neural networks have become the dominant approach in recent years.
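As a concrete illustration of the n-gram case, here is a minimal bigram model sketch in Python. The toy corpus, the `<s>`/`</s>` boundary markers, and the maximum-likelihood estimate are illustrative assumptions, not material from either course below.

```python
from collections import defaultdict

# Hypothetical toy corpus; any tokenized text would work.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]

# Count bigrams and the contexts they condition on.
bigram_counts = defaultdict(lambda: defaultdict(int))
context_counts = defaultdict(int)
for sentence in corpus:
    tokens = ["<s>"] + sentence + ["</s>"]  # mark sentence boundaries
    for prev, curr in zip(tokens, tokens[1:]):
        bigram_counts[prev][curr] += 1
        context_counts[prev] += 1

def bigram_prob(prev, curr):
    """P(curr | prev), estimated by maximum likelihood (no smoothing)."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[prev][curr] / context_counts[prev]

def sentence_prob(sentence):
    """Probability of a sentence under the Markov (bigram) assumption:
    P(w_1 .. w_n) ~ product of P(w_i | w_{i-1})."""
    tokens = ["<s>"] + sentence + ["</s>"]
    p = 1.0
    for prev, curr in zip(tokens, tokens[1:]):
        p *= bigram_prob(prev, curr)
    return p

print(sentence_prob(["the", "cat", "sat", "on", "the", "rug"]))  # 0.0625
```

A real n-gram model would add smoothing to avoid zero probabilities for unseen bigrams; neural language models replace these count-based estimates with learned parameters.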

Two courses cover this concept:

CS 224N: Natural Language Processing with Deep Learning

Stanford University

Winter 2023

CS 224N provides an in-depth introduction to neural networks for NLP, focusing on end-to-end neural models. The course covers topics such as word vectors, recurrent neural networks, and transformer models.


CSCI 1470/2470 Deep Learning

Brown University

Spring 2022

Brown University's Deep Learning course acquaints students with the transformative capabilities of deep neural networks in computer vision, NLP, and reinforcement learning. Using the TensorFlow framework, the course addresses topics such as CNNs, RNNs, deepfakes, and reinforcement learning, with an emphasis on ethical applications and potential societal impacts.
