About Me
I am a second-year PhD student in Computer Science at Harvard University, advised by Melanie Weber. My research lies at the interface of machine learning, optimization, and statistics, with a focus on Riemannian optimization and high-dimensional learning dynamics. I am also interested in incorporating the feature learning of neural networks into regression methods and in developing novel optimization algorithms for large-scale models. My research has been supported in part by the NSERC Postgraduate Scholarship, the Harvard Dean’s Competitive Fund for Promising Scholarship, the NSF, and FQRNT.
Previously, I was a Master's student at McGill University and the Montreal Institute for Learning Algorithms (MILA), supervised by Courtney Paquette and Elliot Paquette. I completed my Bachelor's degree in Statistics and Computer Science at McGill University.
In 2022 and 2023, I interned as a deep learning researcher at ValenceLabs, where I worked on generative models for drug discovery.
Recent Events
Two workshop papers at the NeurIPS OPTML workshop
Presented results from our papers "High Dimensional First Order Mini-Batch Algorithms on Quadratic Problems" and "Structured Regularization for Constrained Optimization on the SPD Manifold."
Structured Regularization for SPD Optimization with Side Information
Published at the 60th IEEE Allerton Conference on Communication, Control, and Computing. Read here.
Research
2024
Structured Regularization on the Positive Definite Manifold
Authors: Andrew Cheng, Melanie Weber
We developed a framework for structured regularization on the positive definite manifold to induce sparsity in the solution or incorporate side information. Our framework preserves properties that allow provable non-asymptotic convergence to the global optimum of non-convex problems on the SPD manifold.
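As a rough illustration (the notation below is schematic and may differ from the paper's exact formulation), the template problem is a loss over the SPD cone plus a structured penalty,
\[
  \min_{X \in \mathbb{S}^{d}_{++}} \; f(X) + \lambda\, g(X),
\]
where \( f \) is the data-fidelity term, \( g \) is a structured regularizer (e.g., a sparsity-inducing penalty or a term encoding side information), and \( \lambda > 0 \) weights the regularization.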
Under Review
Disciplined Geodesically Convex Programming
Authors: Andrew Cheng, Vaibhav Dixit, Melanie Weber
We extended the theory of disciplined convex programming to the geodesically convex setting, providing a framework for optimization on Cartan-Hadamard manifolds, with a focus on the symmetric positive definite manifold.
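For context, a function \( f \) on a Riemannian manifold is geodesically convex if it is convex along geodesics, i.e.,
\[
  f(\gamma(t)) \le (1-t)\, f(\gamma(0)) + t\, f(\gamma(1)) \quad \text{for all } t \in [0,1]
\]
and all geodesics \( \gamma \). On Cartan-Hadamard manifolds, such as the SPD manifold with the affine-invariant metric, many problems that are non-convex in the Euclidean sense become geodesically convex.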
Under Review
High Dimensional First Order Mini-Batch Algorithms on Quadratic Problem
Authors: Andrew Cheng, Kiwon Lee, Courtney Paquette
We show that, for general mini-batch first-order algorithms (Nesterov acceleration, SGD with momentum, etc.), any quadratic statistic \( \mathcal{R}: \mathbb{R}^d \to \mathbb{R} \) of the iterates concentrates around a deterministic quantity given by the solution of a Volterra integral equation. Examples of quadratic statistics include, but are not limited to, the training loss and excess risk of empirical risk minimization for ridge regression (covering both in-distribution and generalization error) and the random features model. By analyzing the Volterra integral equation, we can characterize the optimal mini-batch size (i.e., the critical batch size) and the optimal hyperparameters (learning rate, momentum, etc.) of these mini-batch algorithms.
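Schematically (the notation below is illustrative rather than the paper's exact formulation), the result states that \( \mathcal{R} \) evaluated along the iterates concentrates around a deterministic function \( \psi \) solving a Volterra integral equation of the form
\[
  \psi(t) = F(t) + \int_0^t K(t, s)\, \psi(s)\, \mathrm{d}s,
\]
where the forcing function \( F \) and the kernel \( K \) depend on the data spectrum, the batch size, and the algorithm's hyperparameters; questions about the critical batch size and optimal hyperparameters then reduce to analyzing \( F \) and \( K \).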
Workshop (Preprint soon)
2022
Trajectory of Mini-Batch Momentum
Authors: Kiwon Lee, Andrew Cheng, Elliot Paquette, Courtney Paquette
This work, presented at NeurIPS 2022, analyzes the dynamics of mini-batch momentum in stochastic optimization algorithms used in machine learning, characterizing the trajectory of the iterates and its convergence properties.
Read the Full Paper (NeurIPS 2022)