The multi-armed bandit problem is a classic reinforcement learning problem that models the exploration-exploitation tradeoff: a decision maker must allocate limited resources among competing choices to maximize expected gain, while each choice's payoff is only partly known at the time of allocation. It has been applied to problems such as managing research projects. Herbert Robbins and John C. Gittins have both published work on the problem.
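To make the tradeoff concrete, here is a minimal sketch of epsilon-greedy, one common bandit strategy. It is an illustration only and is not attributed to Robbins or Gittins. The function name `epsilon_greedy`, its parameters, and the choice of Bernoulli (0/1) rewards are all assumptions made for this example.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on hypothetical Bernoulli arms.

    true_means: assumed success probability of each arm (hidden from the learner).
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimate (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = epsilon_greedy([0.2, 0.5, 0.7])
    print("estimates:", estimates)
    print("pull counts:", counts)
    print("total reward:", total)
```

In this sketch, setting `epsilon=0.0` removes exploration entirely. With `true_means=[0.0, 1.0]` the learner then never leaves the first arm, because its estimate never falls below the untried second arm's initial estimate of zero. This is the failure mode that exploration is meant to prevent.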
University of Washington
Winter 2022
This course examines the role of probability in computer science, with applications including algorithms, systems, data analysis, and machine learning. Prerequisites include CSE 311 and MATH 126, along with a working knowledge of calculus, linear algebra, set theory, and basic proof techniques. Topics range from discrete probability to hypothesis testing and bootstrapping.