
Algorithms for Data Science CS 224, Fall 2023
Course Details
This is a graduate-level topics class on algorithmic challenges arising in modern machine learning and data science more broadly. The course will touch upon a number of well-studied problems (generative modeling, deep learning theory, adversarial robustness, inverse problems, inference) and frameworks for algorithm design (gradient descent, spectral methods, tensor/moment methods, message passing, convex programming hierarchies, Markov chains). As we will see, proving rigorous guarantees for these problems often draws upon a wide range of techniques from stochastic calculus, harmonic analysis, random matrix theory, algebra, and statistical physics. We will also explore the myriad modeling challenges that go into building theory for ML and discuss prominent paradigms (semi-random models, smoothed analysis, oracles) for going beyond traditional worst-case analysis.
Time/Location: MW 2:15-3:30, SEC LL2-224
Instructor: Sitan Chen (sitan@seas.harvard.edu) Office hours: SEC 3.325, Thursday 4-5
Teaching Fellows:
- David Brewster (dbrewster@g.harvard.edu), Office hours: Maxwell-Dworkin 123, Tuesday 5-6
- Depen Morwani (dmorwani@g.harvard.edu), Office hours: SEC 1.404, Wednesday 11-12
- Rosie Zhao (rosiezhao@g.harvard.edu), Office hours: SEC 2.330, Monday 11-12
Canvas (for lecture recordings and announcements)
Course Policies: See syllabus for detailed overview.
Announcements
Pset 1 is out, see below.
See Canvas for link to sign up for pset office hour preferences
Notifications of course placement will be sent out Thursday afternoon, Aug 31.
Pset 0 and course application form due August 29, 11:59pm
Miscellaneous Materials
- Helpful references:
- Roman Vershynin. High-Dimensional Probability (Chapters 1-2, 3-4 also helpful)
- Compilation of recurring notation used in the course
- Some relevant past courses:
- Ankur Moitra. Algorithmic Aspects of Machine Learning
- Tselil Schramm. The Sum-of-Squares Algorithmic Paradigm in Statistics
- Prasad Raghavendra. Efficient Algorithms and Computational Complexity in Statistics
- Sanjeev Arora. Theory of Deep Learning
- Song Mei. Mean Field Asymptotics in Statistical Learning
- Lenka Zdeborova & Florent Krzakala. Statistical Physics For Optimization and Learning
- Ahmed El-Alaoui. Topics in High-Dimensional Inference
- Subhabrata Sen. STAT 217: Topics in High-Dimensional Statistics - Methods from Statistical Physics.
Lectures
Date | Topic | Lecture Notes | Resources |
---|---|---|---|
Sep 6 | Logistics, vignette: diffraction limit and learning theory | instructor notes, slides |
|
Sep 11 | Tensors I: Jennrich's algorithm, applications (super-resolution, Gaussian mixtures, independent component analysis) | instructor notes, slides, scribe notes |
|
Sep 13 | Tensors II: Iterative methods (gradient descent, tensor power method, alternating least squares) | instructor notes, slides, scribe notes |
|
Sep 18 | Tensors III: overcomplete tensor decomposition via smoothed analysis | instructor notes, slides |
|
Sep 20 | Sum-of-squares I: pseudo-distributions, application to robust regression | instructor notes, slides |
|