
About Me
I am a third-year PhD student in Computer Science at Harvard University, supervised by Yue Lu.
My research lies at the interface of machine learning, optimization, and statistics.
I am interested in designing state-of-the-art interpretable machine learning models for large-scale tabular data with provable guarantees, and I apply these models to problems in national security and healthcare.
I am also interested in Riemannian optimization and high-dimensional learning dynamics.
My research has been supported in part by the NSERC Postgraduate Scholarship, the Harvard Dean’s Competitive Fund for Promising Scholarship, NSF, and FQRNT.
Previously, I was a Master's student at McGill University and the Montreal Institute for Learning Algorithms (Mila), supervised by Courtney Paquette and Elliot Paquette. I completed my Bachelor's degree in Statistics and Computer Science at McGill University.
Since May 2025, I have been working with Lawrence Livermore National Laboratory on the development of interpretable machine learning models for problems in the energy sector.
In 2022 and 2023, I interned as a deep learning researcher at ValenceLabs, where I worked on generative models for drug discovery.
Recent Events
Two workshop papers at the NeurIPS OPTML workshop
Presented results from our papers "High Dimensional First Order Mini-Batch Algorithms on Quadratic Problems" and "Structured Regularization for Constrained Optimization on the SPD Manifold."
Structured Regularization for SPD Optimization with Side Information
Published at the 60th Allerton Conference on Communication, Control, and Computing (IEEE). Read here.
Research
2024
Structured Regularization on the Positive Definite Manifold
Authors: Andrew Cheng, Melanie Weber
We developed a framework for structured regularization on the symmetric positive definite (SPD) manifold that induces sparsity in the solution or incorporates side information. The framework preserves the properties needed for provable non-asymptotic convergence to the global optimum of non-convex problems on the SPD manifold.
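As a rough sketch of the problem class (the notation here is illustrative, not the paper's), one can think of a geodesically smooth loss over the SPD cone combined with a structured penalty:
\[
\min_{X \succ 0} \; f(X) + \lambda\, h(X),
\]
where \( h \) is a regularizer promoting sparsity or encoding side information and \( \lambda > 0 \) controls its strength; roughly, the framework characterizes penalties \( h \) that preserve the structure needed for such convergence guarantees.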
Under Review
Disciplined Geodesically Convex Programming
Authors: Andrew Cheng, Vaibhav Dixit, Melanie Weber
We extended the theory of disciplined convex programming to the geodesically convex setting, providing a framework for optimization on Cartan-Hadamard manifolds, with a focus on the symmetric positive definite (SPD) manifold.
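For intuition, here is one standard example of a geodesically "affine" atom on the SPD manifold (an illustrative fact, not necessarily an atom from the paper): along the geodesic between \( X, Y \succ 0 \),
\[
\gamma(t) = X^{1/2}\bigl(X^{-1/2} Y X^{-1/2}\bigr)^{t} X^{1/2}, \qquad
\log\det \gamma(t) = (1-t)\,\log\det X + t\,\log\det Y,
\]
so \( \log\det \) behaves like an affine function under geodesic composition rules, even though it is concave rather than linear in the Euclidean sense.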
Under Review
High Dimensional First Order Mini-Batch Algorithms on Quadratic Problems
Authors: Andrew Cheng, Kiwon Lee, Courtney Paquette
We show that, for general mini-batch first-order algorithms (Nesterov acceleration, SGD with momentum, etc.), any quadratic statistic \( \mathcal{R}: \mathbb{R}^d \to \mathbb{R} \) evaluated along the iterates concentrates around a deterministic quantity given by the solution of a Volterra integral equation. Examples of quadratic statistics include the training loss and excess risk of empirical risk minimization for ridge regression and the random features model, covering both in-distribution and generalization error. By analyzing the Volterra equation, we characterize the optimal mini-batch size (the critical batch size) and the optimal hyperparameters (learning rate, momentum, etc.) of these algorithms.
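Schematically, the limiting description is a Volterra equation of the second kind (the symbols \( \psi \), \( F \), and \( K \) below are generic placeholders; the paper's forcing function and kernel depend on the data spectrum and the algorithm's hyperparameters):
\[
\psi(t) \;=\; F(t) + \int_0^{t} K(t, s)\, \psi(s)\, \mathrm{d}s,
\]
where \( \psi(t) \) is the deterministic limit of the quadratic statistic \( \mathcal{R} \) along the iterates; roughly, quantities such as the critical batch size are obtained by analyzing how \( K \) depends on the batch size and step size.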
Workshop (Preprint soon)
2022
Trajectory of Mini-Batch Momentum
Authors: Kiwon Lee, Andrew Cheng, Elliot Paquette, Courtney Paquette
This work, presented at NeurIPS 2022, studies the dynamics of mini-batch stochastic gradient descent with momentum (SGD+M). It characterizes the trajectory of the iterates in stochastic optimization and their convergence properties.
Read the Full Paper (NeurIPS 2022)