CS231n: Deep Learning for Computer Vision

Spring 2022

Stanford University

This course is a deep dive into the details of deep learning architectures for visual recognition tasks. Students learn to implement and train their own neural networks and to understand state-of-the-art computer vision research. It requires Python proficiency and familiarity with calculus, linear algebra, probability, and statistics.

55 covered concepts

Slides / notes available

No videos available

Assignments available

Other resources available

Course Page

Overview

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into the details of deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement and train their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. Additionally, the final assignment will give them the opportunity to train and apply multi-million parameter networks on real-world vision problems of their choice. Through multiple hands-on assignments and the final course project, students will acquire the toolset for setting up deep learning tasks and practical engineering tricks for training and fine-tuning deep neural networks.

Prerequisites

Proficiency in Python
All class assignments will be in Python and use NumPy; a tutorial is provided for those who aren't as familiar with Python. If you have a lot of programming experience but in a different language (e.g. C/C++/Matlab/Javascript), you will probably be fine.
College Calculus, Linear Algebra (e.g. MATH 19 or 41, MATH 51)
You should be comfortable taking derivatives and understanding matrix-vector operations and notation.
Basic Probability and Statistics (e.g. CS 109 or other stats course)
You should know the basics of probability, Gaussian distributions, mean, standard deviation, etc.

Learning objectives

No data.

Textbooks and other notes

No data

Courseware availability

Lecture slides and course materials available at Schedule

No videos available

Assignments available at Schedule

Projects available at Final projects

Covered concepts

3D CNNs
3D Shape Representations
Activation Functions
AdaGrad
Adam
Adversarial Examples
AlexNet, VGG, GoogLeNet, ResNet
Backpropagation
Batch Normalization
Contrastive Learning
Convolution
Data Augmentation
Data Processing
DeepDream and Style Transfer
Feature visualization and inversion
Gated recurrent unit (GRU)
Generative adversarial network (GAN)
Higher-level representations
Hyperparameter Tuning
Image Captioning
Image Features
K-Nearest Neighbors
Language Modeling
Learning Rate Schedules
Linear Classifiers
Long Short-Term Memory (LSTM)
Momentum
Multi-layer Perceptron
Multimodal Video Understanding
Multisensory Supervision
Neural Implicit Representations
Object Detection
Optical Flow
Pixel RNN, Pixel CNN
Pooling
Pretext Tasks
RNN
Self-Attention
Semantic/Instance/Panoptic segmentation
Sequence-to-sequence (Seq2Seq)
Shape Reconstruction
Single-stage detectors
Softmax Loss
Stereo Vision
Stochastic gradient descent (SGD)
Supervised learning
Support Vector Machine (SVM)
Transfer learning
Transformer (machine learning model)
Two-stage detectors
Two-stream networks
Unsupervised learning
Variational autoencoder (VAE)
Video classification
Weight initialization

Other courses in Deep Learning

CS 182/282A: Deep Neural Networks

CS 230 Deep Learning

CSE 490 G1 / 599 G1 Introduction to Deep Learning

CS 330 Deep Multi-Task and Meta Learning

CS 224V Conversational Virtual Assistants with Deep Learning
