ATCS: Machine Learning and Optimization - theory and practice

2018 Spring

Lecturer: Jian Li

TA: Chuheng Zhang  zhangchuheng123 at live dot com

Room: 6B108

Time: every Monday 9:50am-12:15pm

Prerequisite: undergraduate machine learning (e.g., Machine Learning on Coursera by Andrew Ng) and some basic knowledge of deep learning

There is some overlap with my previous courses ATCS16 and ATCS17.

We intend to cover a subset of the following topics (tentative):

(1) Classical machine learning and prediction algorithms: AdaBoost, gradient boosting, random forests, SVD, principal component analysis (PCA), nonnegative matrix factorization (NMF), topic modeling, matrix completion, dictionary learning, spectral clustering, Gaussian processes

(2) Deep learning: a very quick review of the basics (CNN, RNN, autoencoder, GAN; please see my undergrad course), word2vec, attention, other GANs (e.g., WGAN), seq2seq, meta-learning, optimization and generalization issues, deep reinforcement learning

(3) Continuous optimization: gradient descent, stochastic gradient descent, mirror descent, variance reduction, proximal methods, distributed optimization (ADMM), (maybe) non-convex optimization, Frank-Wolfe, online learning, universal portfolio, multiplicative weights method

I won't strictly follow the above order: I may skip some topics mentioned above and cover some that are not. It is a graduate course.

I will be talking about several applications of ML and optimization in finance, and of course in typical CS areas such as vision, NLP, and social networks as well.

Some knowledge of convex optimization may be useful. See John's class and a previous course of my own, but it will be fine if you didn't take those courses. Basic machine learning knowledge will be very useful. If you don't know any machine learning, I suggest you read Andrew Ng's undergrad lecture notes.

The course is a blend of theory and practice. We will cover both the underlying mathematics and interesting heuristics.

A similar previous course.


  1. Homework (30 pts): one homework every two or three weeks; homeworks may include small coding assignments.
  2. Taking notes (10 pts): each student should take notes for 1 or 2 lectures, using LaTeX (use this template: sample.tex, algorithm2e.sty).
  3. Course project: midterm report 5 pts, final report 45 pts, presentation 10 pts.
  4. No exam. 




Feb 26. A quick review of supervised learning

Additive models


Gradient boosting, xgboost

random forests (not covered in class, but you need to know).
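To make the contrast between the two ensemble methods concrete, here is a minimal sketch using scikit-learn (an assumption: the course does not prescribe a particular library, and xgboost/lightgbm expose similar interfaces). The dataset is synthetic and purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# A synthetic binary classification problem, just for illustration.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Gradient boosting: shallow trees fit sequentially on the residual signal.
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
gb.fit(X_tr, y_tr)

# Random forest: deep trees fit independently on bootstrap samples, then averaged.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_tr, y_tr)

print("boosting acc:", gb.score(X_te, y_te))
print("forest   acc:", rf.score(X_te, y_te))
```

The key design difference: boosting reduces bias by fitting each tree to the previous ensemble's errors, while the forest reduces variance by averaging independently grown trees.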

Slides (based on this page)


Additive models, AdaBoost, gradient boosting: The Elements of Statistical Learning, Ch. 9 and 10

Random forests: The Elements of Statistical Learning, Ch. 15

XGBoost: A Scalable Tree Boosting System

Further reading:

Some learning theory about boosting: Foundations of Machine Learning, Ch. 6.

LightGBM (another popular GBDT implementation)

Awesome Random Forest

Mar 5. Online learning: the expert problem, multiplicative weights method (with application to zero-sum games), gradient descent for online learning, universal portfolio.

Multiplicative weights method: a meta-algorithm and its applications (a survey). Sanjeev Arora, Elad Hazan, and Satyen Kale. [pdf]

The method can be used to approximately solve zero-sum games and linear programs, and is also closely related to AdaBoost.
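A minimal sketch of multiplicative weights for the expert problem (hyperparameters such as the learning rate `eta` and the toy loss matrix are my own illustrative choices, not from the survey):

```python
import numpy as np

rng = np.random.default_rng(0)

def multiplicative_weights(losses, eta=0.1):
    """MW/Hedge over n experts. `losses` is a (T, n) array with entries in [0, 1].
    Returns the learner's total expected loss and the best expert's total loss."""
    T, n = losses.shape
    w = np.ones(n)
    learner_loss = 0.0
    for t in range(T):
        p = w / w.sum()                # play the normalized weight vector
        learner_loss += p @ losses[t]  # expected loss this round
        w *= np.exp(-eta * losses[t])  # down-weight experts by their loss
    return learner_loss, losses.sum(axis=0).min()

# Toy instance: expert 0 is slightly better than the rest on average.
losses = rng.random((1000, 5))
losses[:, 0] *= 0.8
lrn, best = multiplicative_weights(losses)
print(lrn, best)   # the learner's loss stays close to the best expert's
```

The regret guarantee (roughly eta*T + ln(n)/eta for [0,1] losses) is what makes the same meta-algorithm usable for approximate zero-sum games and LPs.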

Online convex programming and generalized infinitesimal gradient ascent. M. Zinkevich.
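Zinkevich's online gradient descent can be sketched in a few lines; the projection domain (a Euclidean ball), the 1/sqrt(t) step size, and the drifting-target losses below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def online_gradient_descent(grad_fn, x0, T, radius=1.0):
    """OGD sketch: x_{t+1} = Proj(x_t - eta_t * g_t) with eta_t = 1/sqrt(t),
    projecting onto a Euclidean ball (an assumed feasible set)."""
    x = np.array(x0, dtype=float)
    history = []
    for t in range(1, T + 1):
        g = grad_fn(x, t)
        x = x - g / np.sqrt(t)
        norm = np.linalg.norm(x)
        if norm > radius:              # Euclidean projection back onto the ball
            x *= radius / norm
        history.append(x.copy())
    return history

# Toy online problem: losses f_t(x) = ||x - z_t||^2 with noisy targets z_t.
rng = np.random.default_rng(1)
targets = 0.5 + 0.05 * rng.standard_normal((500, 2))
hist = online_gradient_descent(lambda x, t: 2 * (x - targets[t - 1]),
                               x0=[0.0, 0.0], T=500)
print(hist[-1])    # iterates settle near the mean target, about (0.5, 0.5)
```

With the 1/sqrt(t) step size the algorithm's regret against the best fixed point grows only as O(sqrt(T)), which is the paper's main guarantee.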

Mar 12. (Cont.) Universal portfolio, online-to-batch conversion, norms, Fenchel conjugate, some basics (strong convexity, Bregman divergence). Proof follows: Universal Portfolios With and Without Transaction Costs

Online to Batch:

Mar 19 Online Mirror Descent, Follow the Regularized Leader

Mirror Descent

(equivalence of FTRL and OMD)
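As a concrete instance of online mirror descent, here is the entropic mirror map on the probability simplex, i.e., exponentiated gradient; this recovers the multiplicative weights update from the Mar 5 lecture (the learning rate and the toy linear losses are my own illustrative choices):

```python
import numpy as np

def exponentiated_gradient(grads, x0, eta=0.1):
    """OMD with the negative-entropy regularizer on the simplex:
    x_{t+1} ∝ x_t * exp(-eta * g_t).  The normalization step plays the role
    of the Bregman projection back onto the simplex."""
    x = np.array(x0, dtype=float)
    for g in grads:
        x = x * np.exp(-eta * np.asarray(g))
        x /= x.sum()
    return x

# Toy linear losses favoring coordinate 2: the iterate drifts toward e_2.
grads = [np.array([1.0, 1.0, 0.2]) for _ in range(200)]
x = exponentiated_gradient(grads, x0=np.ones(3) / 3)
print(x)    # probability mass concentrates on the low-loss coordinate
```

Choosing the entropic regularizer instead of the squared Euclidean norm is exactly what distinguishes this update from plain projected gradient descent, and with a linearized FTRL objective the two views coincide.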


[Book] The elements of statistical learning [ESL]

[Book] Convex Optimization

[Book] Introduction to online convex optimization

[Book] Prediction, Learning, and Games

[Book] All of Statistics: A Concise Course in Statistical Inference

[Book] Optimization Methods in Finance

[Book] Analysis of Financial Time Series

[Book] Statistical Models and Methods for Financial Markets

Ankur Moitra's (MIT) lecture notes (Algorithmic Aspects of Machine Learning)


Python is the default programming language we will use in the course.

If you haven't used it before, don't worry. It is very easy to learn (if you know any other programming language) and is a very efficient language, especially for prototyping things related to scientific computing and numerical optimization. Python code is usually much shorter than C/C++ code (the language does a lot for you). It is also more flexible and generally faster than MATLAB.

A standard combination for this class is Python + numpy (a numerics library for Python) + scipy (a scientific computing library for Python) + matplotlib (for generating nice plots).
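To illustrate the stack, a tiny self-contained example: fitting a least-squares problem with plain gradient descent using only numpy (the problem sizes and step size are arbitrary illustrative choices; matplotlib would only be needed to plot the loss curve):

```python
import numpy as np

# Generate a small synthetic least-squares problem A x ≈ b.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.01 * rng.standard_normal(100)

# Plain gradient descent on f(x) = (1/2m) ||Ax - b||^2.
x = np.zeros(3)
eta = 0.01
for _ in range(2000):
    x -= eta * A.T @ (A @ x - b) / len(b)   # gradient step

print(x)   # recovers something close to x_true
```

The same few lines would be dozens in C/C++, which is the point of using this stack for prototyping.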

Another, somewhat easier, way is to install Anaconda (a free Python distribution that bundles the most popular packages).