Winter 2023
Stanford University
This course is centered on extracting information from unstructured data in language and social networks using machine learning tools. It covers techniques like sentiment analysis, chatbot development, and social network analysis.
The online world has a vast array of unstructured information in the form of language and social networks. Learn how to make sense of it using neural networks and other machine learning tools, and how to interact with humans via language, from answering questions to giving advice!
Prerequisite: CS106B. CS 107 can be helpful, but it's fine if you haven't had it; we'll cover the required UNIX material. Math 51 can also be helpful, but isn't required, since we will introduce the basic vector knowledge we need in the class.
Extracting meaning, information, and structure from human language text, speech, web pages, social networks. Introducing methods (string algorithms, edit distance, language modeling, machine learning, logistic regression, neural networks, neural embeddings, inverted indices, collaborative filtering, PageRank), applications (chatbots, sentiment analysis, information retrieval, text classification, social networks, recommender systems), and ethical issues.
There is no required textbook, but I'll expect you to know the textbook/reading material listed above, and will test you on it in the midterms.
Is 106B the only prereq? Do I need 109 or 221 before I take CS124?: 106B is the only prereq. Taking the course as a sophomore is recommended, but we also get lots of juniors and a reasonable number of frosh; the course is designed to be taken early in your Stanford career. It will help if you have done at least some programming beyond 106B, and it is also useful to have had 107 or Math 51, but neither is required; we'll try to give you pointers to places to make up missing background.
Can I take this course as a non-CS grad student?: Yes, although this course is not appropriate for CS grad students (there are graduate versions of all the material in this course), it's very commonly taken by PhD students in the social sciences or humanities who plan to use text processing methods in their research.