Algorithms for Data Science
CS 2243, Fall 2024

Course Details

This is a graduate topics class on algorithmic challenges in modern machine learning and data science. We will touch upon a number of domains (generative modeling, deep learning theory, robust statistics, Bayesian inference) and frameworks for algorithm design (spectral/tensor methods, gradient descent, message passing, MCMC, diffusions), focusing on provable guarantees. The theory draws upon a range of techniques from stochastic calculus, harmonic analysis, statistical physics, algebra, and beyond. We will also explore the myriad modeling challenges in building this theory and prominent paradigms (average-case complexity, smoothed complexity, oracles) for going beyond traditional worst-case analysis.

Time/Location: MW 2:15-3:30, SEC 1.413

Instructor: Sitan Chen (sitan@seas.harvard.edu)
Office hours: SEC 3.325, Monday 5:30-6:30

Teaching Fellows:

  • Weiyuan Gong (wgong@g.harvard.edu), Office hours: Tuesday 4-5, SEC 3.317
  • Marvin Li (marvinli@college.harvard.edu), Office hours: Friday 4-5, SEC 4.307 + 4.308
Pset office hour: Tuesday 10am-12pm, SEC 6.301 + 6.302

Canvas (for lecture recordings only)

Gradescope

Ed (discussion)

Course Policies: See syllabus for detailed overview.


Announcements

  • Pset 3 due October 29, 11:59pm.

  • Pset 2 due October 9, 11:59pm.

  • Pset 1 due September 20, 11:59pm.

  • Notifications of course placement will be sent out by Aug 31 at the latest.

  • Pset 0 and course application form due September 3, 4:59pm


Assignments

Assignments will be posted below and in the course Overleaf when they become available.

  • Pset 0 (due 09/03 at 4:59pm): PDF
  • Pset 1 (due 09/20 at 11:59pm): PDF
  • Pset 2 (due 10/09 at 11:59pm): PDF
  • Pset 3 (due 10/29 at 11:59pm): PDF


Miscellaneous Materials


Lectures

Date Topic Lecture Notes Resources
Sep 4 Logistics, vignette: diffraction limit and learning theory instructor notes, slides
Sep 9 Tensors I: Jennrich's algorithm, applications (super-resolution, Gaussian mixtures, independent component analysis) instructor notes, slides, scribe notes
Sep 11 Tensors II: Iterative methods (gradient descent, tensor power method, alternating least squares) instructor notes, slides, scribe notes
Sep 16 Tensors III: overcomplete tensor decomposition via smoothed analysis instructor notes, slides, scribe notes
Sep 18 Sum-of-squares I: pseudo-distributions, sum-of-squares proofs instructor notes, slides
Sep 23 Sum-of-squares II: robust regression and certifiable hypercontractivity instructor notes, scribe notes
Sep 25 Sum-of-squares III: mixtures of spherical Gaussians, entropy maximization instructor notes, slides, scribe notes
Sep 30 Supervised learning I: PAC learning, Fourier analysis, semirandom noise slides, scribe notes
Oct 2 Supervised learning II: noise sensitivity, one-hidden layer MLPs, Hermite analysis, CSQ algorithms instructor notes, slides, scribe notes
Oct 7 Supervised learning III: filtered PCA instructor notes, slides, scribe notes
Oct 9 Supervised learning IV: linearized networks instructor notes, slides, scribe notes