Gradient descent

Histories of AI

Societal impacts of AI

Vector (mathematics and physics)

Dot product

Geometric interpretations

Taking gradients

Discrete random variables

Probability distributions

Mean

Variance

Marginal distributions

Conditional distribution

Big-O notation

Computational complexity

Recurrence relation

Dynamic programming

Continuous optimization

Objective functions

Machine learning

Loss minimization

Hinge loss

Equitable performance

Design and organization of features

Computation graphs

Deep learning models composition

Cross-validation (statistics)

Search problem

Breadth-first search (BFS)

Uniform cost search (UCS)

UCS heuristics

Markov Decision Process (MDP)

Transportation problem

Discounting factor

Reinforcement learning (RL)

Model-free Monte Carlo

Q-learning

Function approximation

Game theory

Minimax algorithm

Temporal difference (TD) learning

Variable-based models

Backtracking search

Gibbs sampling

Bayesian network

Laplace smoothing

Logic

Entailment

Satisfiability

Soundness

Modus ponens

First-order logic

Unification (computer science)

Python

Linear regression

Linear classification

Stochastic gradient descent (SGD)

Non-linear functions

Neural network

Backpropagation

Generalization

K-means clustering

Exhaustive search

Depth-first search (DFS)

UCS correctness

Relaxed search problems

Dice game

Policy evaluation

Value iteration

Model-based Monte Carlo

State–action–reward–state–action (SARSA)

Epsilon-greedy exploration

Deep Reinforcement Learning

Halving game

Alpha-beta pruning

Nash equilibrium

Factor graphs

AC-3 algorithm

Markov random field

Hidden Markov Model (HMM)

Maximum Likelihood Estimation (MLE)

Propositional calculus (Propositional logic)

Contradiction

Model checking

Completeness

Horn clauses

Substitution (logic)

Skolem functions

CS 221 Artificial Intelligence: Principles and Techniques

Stanford University

Autumn 2022-2023

Stanford's CS 221 course teaches foundational principles and practical implementation of AI systems. It covers machine learning, game playing, constraint satisfaction, graphical models, and logic. A rigorous course requiring solid foundational skills in programming, math, and probability.

The goal of artificial intelligence (AI) is to tackle complex real-world problems with rigorous mathematical tools. In this course, you will learn the foundational principles and practice implementing various AI systems. Specific topics include machine learning, search, Markov decision processes, game playing, constraint satisfaction, graphical models, and logic.

This course is fast-paced and covers a lot of ground, so it is important that you have a solid foundation in a number of areas. Here are the basic skills that you need and the classes that teach those skills:

- Programming (ideally Python): [CS 106A](http://www.stanford.edu/class/cs106a/), [CS 106B](http://www.stanford.edu/class/cs106b/), [CS 107](http://www.stanford.edu/class/cs107/)
- Discrete math, mathematical rigor: [CS 103](http://www.stanford.edu/class/cs103/)
- Probability: [CS 109](http://www.stanford.edu/class/cs109/)
- Linear algebra: [Math 51](https://web.stanford.edu/class/math51/textbook.html)

It is less important that you know particular things (e.g., we don't use eigenvectors in this course even though that's a pillar of any linear algebra course), and more important that you've done enough related things that you feel at ease with it. While it is possible to fill in the gaps, this course does move quickly, and ideally you want to be focusing your energy on learning AI rather than catching up on prerequisites. We have made a few [prerequisite modules](https://stanford-cs221.github.io/autumn2022/modules) that you can review to refresh your memory, and the first homework (foundations) will allow you to also get some practice on these basics.

### Further Reading

There are no required textbooks for this class, and you should be able to learn everything from the lecture notes and homeworks. However, if you would like to pursue more advanced topics or get another perspective on the same material, here are some great resources:

- [Russell and Norvig. Artificial Intelligence: A Modern Approach](http://aima.cs.berkeley.edu/). A comprehensive reference for all the AI topics that we will cover.
- [Koller and Friedman. Probabilistic Graphical Models](http://mitpress.mit.edu/books/probabilistic-graphical-models). Covers factor graphs and Bayesian networks (this is the textbook for CS228).
- [Sutton and Barto. Reinforcement Learning: An Introduction](https://mitpress.mit.edu/books/reinforcement-learning). Covers Markov decision processes and reinforcement learning (free online).
- [Hastie, Tibshirani, and Friedman. The Elements of Statistical Learning](https://web.stanford.edu/~hastie/ElemStatLearn/). Covers machine learning from a rigorous statistical perspective (free online).
- [Tsang. Foundations of Constraint Satisfaction](http://www.bracil.net/edward/fcs.html). Covers constraint satisfaction problems (free online).

Note that some of these books use different notation and terminology from this course, so it may take some effort to make the appropriate connections.

CS 221 Artificial Intelligence: Principles and Techniques

Artificial Intelligence

Connectionist Machines

McCulloch and Pitts model

Hebb’s learning rule

Rosenblatt’s perceptron

Universal Approximator

Perceptron learning rule

Empirical risk minimization

Optimization

Back-propagation

Momentum

Nesterov

Convergence

Learning Rates

Optimization Algorithms

RMSProp

Acceleration

Overfitting

Regularization (mathematics)

Convolutional neural network (CNN)

Translation Invariance

Cascade Correlation Filters

Recurrent neural network (RNN)

Bidirectional RNNs

Sequence Prediction

Long Short-Term Memory (LSTM)

Connectionist Temporal Classification (CTC)

Representations

Autoencoders

Hopfield Networks

Boltzmann Machines

Normalizing Flows

Variational autoencoder (VAE)

Generative adversarial network (GAN)

Multi-layer Perceptron

Sequence-to-sequence (Seq2Seq)

AdaGrad

11-785 Introduction to Deep Learning

Carnegie Mellon University

Spring 2020

This course provides a comprehensive introduction to deep learning, starting from foundational concepts and moving towards complex topics such as sequence-to-sequence models. Students gain hands-on experience with PyTorch and can fine-tune models through practical assignments. A basic understanding of calculus, linear algebra, and Python programming is required.

“Deep Learning” systems, typified by deep neural networks, are increasingly taking over all AI tasks, ranging from language understanding, and speech and image recognition, to machine translation, planning, and even game playing and autonomous driving. As a result, expertise in deep learning is fast changing from an esoteric desirable to a mandatory prerequisite in many advanced academic settings, and a large advantage in the industrial job market.

In this course we will learn about the basics of deep neural networks, and their applications to various AI tasks. By the end of the course, it is expected that students will have significant familiarity with the subject, and be able to apply Deep Learning to a variety of tasks. They will also be positioned to understand much of the current literature on the topic and extend their knowledge through further study.

If you are only interested in the lectures, you can watch them on the YouTube channel listed below.

### Course description from student point of view

The course is well rounded in terms of concepts and helps students understand the fundamentals of Deep Learning. It starts off gradually with MLPs and progresses to more complicated concepts such as attention and sequence-to-sequence models. We get complete hands-on experience with PyTorch, which is very important for implementing Deep Learning models. As a student, you will learn the tools required for building Deep Learning models. The homeworks usually have two components, Autolab and Kaggle. The Kaggle components allow us to explore multiple architectures and understand how to fine-tune and continuously improve models. The tasks for all the homeworks were similar, and it was interesting to learn how the same task can be solved using multiple Deep Learning approaches. Overall, by the end of this course you will be confident enough to build and tune Deep Learning models.

1. We will be using one of several toolkits (the primary toolkit for recitations/instruction is PyTorch). The toolkits are largely programmed in Python. You will need to be able to program in at least one of these languages. Alternately, you will be responsible for finding and learning a toolkit that requires programming in a language you are comfortable with.
1. You will need familiarity with basic calculus (differentiation, chain rule), linear algebra and basic probability.

11-785 Introduction to Deep Learning

Deep Learning

Decision Tree Learning

Margins

Naive Bayes

Maximizing Conditional Likelihood

Kernelizing Algorithms

Primal and Dual Forms

Finite Hypothesis Classes

k-fold cross-validation

Objective-Based Clustering

Principal Component Analysis (PCA)

Semi-Supervised Learning

Learning Linear Separators

Bag of Words

Computer Vision

Generalization and Overfitting

Model Selection and Regularization

Support Vector Machine (SVM)

Minimizing Squared Error

Deep Networks

AdaBoost

Learning Representations

Active Learning

Co-training

Conditional Independence

Kernelizing Perceptron

Kernelizing SVM

VC Dimension Based Bounds

Boosting

Hierarchical Clustering

Interactive Learning

Transductive SVM

Bayes' theorem

Perceptron

Maximum A Posteriori (MAP)

Logistic Regression

Kernel (machine learning)

Geometric Margins

Sample Complexity

Structural Risk Minimization

Maximizing Data Likelihood

Unsupervised learning

Dimensionality Reduction

Sampling Bias

Convolution

10-401 Introduction to Machine Learning

Carnegie Mellon University

Spring 2018

A comprehensive exploration of machine learning theories and practical algorithms. Covers a broad spectrum of topics like decision tree learning, neural networks, statistical learning, and reinforcement learning. Encourages hands-on learning via programming assignments.

Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as decision tree learning, Support Vector Machines, neural networks, boosting, statistical learning methods, unsupervised learning, active learning, and reinforcement learning. Short programming assignments include hands-on experiments with various learning algorithms.

- Machine Learning, Tom Mitchell. (optional)
- Pattern Recognition and Machine Learning, Christopher Bishop. (optional)
- Machine Learning: A Probabilistic Perspective, Kevin P. Murphy, [available online](https://ebookcentral.proquest.com/lib/cm/detail.action?docID=3339490), (optional)

10-401 Introduction to Machine Learning

Machine Learning

Online Learning

Decision Trees

Clustering

PAC Learning

Error Decomposition

Decision Making

Similarity Learning

Neural Networks Learning

Mathematical Optimization

Multiclass Learning

COS 324 - Introduction to Machine Learning

Princeton University

Fall 2017

A thorough introduction to machine learning principles such as online learning, decision making, gradient-based learning, and empirical risk minimization. It also explores regression, classification, dimensionality reduction, ensemble methods, neural networks, and deep learning. The course material is self-contained and based on freely available resources.

The course provides an introduction to machine learning.

**Topics covered**:

- Online learning and decision making
- Learning from examples and generalization
- Empirical risk minimization and regularization
- Introduction to convex analysis
- Gradient-based learning
- Implementation and analysis of learning algorithms for regression, binary classification, multiclass categorization, and ranking problems
- Dimensionality reduction methods
- Ensemble methods and boosting
- Neural networks and deep learning
- Markov decision processes

**NOTICE:** All material of the course is self-contained and based on freely available books and surveys.

Main references:

- [Understanding Machine Learning: From Theory to Algorithms](http://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/), by Shai Shalev-Shwartz and Shai Ben-David
- [Online convex optimization](http://ocobook.cs.princeton.edu/), by Elad Hazan
- [Machine Learning](http://www.cs.cmu.edu/afs/cs.cmu.edu/user/mitchell/ftp/mlbook.html), by Tom Mitchell
- An Introduction to Computational Learning Theory, by Michael Kearns & Umesh Vazirani
- Machine Learning: A Probabilistic Perspective, by Kevin Murphy

Further advanced references:

- [Convex Optimization](http://stanford.edu/~boyd/cvxbook/), by Stephen Boyd and Lieven Vandenberghe
- [Convex optimization: algorithms and complexity](https://arxiv.org/abs/1405.4980), by Sebastien Bubeck
- Artificial Intelligence: A Modern Approach, by Stuart Russell and Peter Norvig

Python Tutorials:

- [An interactive Python tutorial](https://learnpython.org/) from LearnPython.com
- [Tutorial for Python 2.7](https://docs.python.org/2.7/tutorial/) from python.org
- [Tutorial for Python 3](https://docs.python.org/3/tutorial/) from python.org

COS 324 - Introduction to Machine Learning

Gradient descent

4 courses cover this concept

CS 221 Artificial Intelligence: Principles and Techniques

11-785 Introduction to Deep Learning

10-401 Introduction to Machine Learning

COS 324 - Introduction to Machine Learning