Xiang Cheng
Massachusetts Institute of Technology
Verified email at berkeley.edu
Title
Cited by
Year
Underdamped Langevin MCMC: A non-asymptotic analysis
X Cheng, NS Chatterji, PL Bartlett, MI Jordan
Conference on learning theory, 300-323, 2018
Cited by: 349 | Year: 2018
Convergence of Langevin MCMC in KL-divergence
X Cheng, P Bartlett
Algorithmic Learning Theory, 186-211, 2018
Cited by: 224 | Year: 2018
Sharp convergence rates for Langevin dynamics in the nonconvex setting
X Cheng, NS Chatterji, Y Abbasi-Yadkori, PL Bartlett, MI Jordan
arXiv preprint arXiv:1805.01648, 2018
Cited by: 188 | Year: 2018
Is there an analog of Nesterov acceleration for gradient-based MCMC?
YA Ma, NS Chatterji, X Cheng, N Flammarion, PL Bartlett, MI Jordan
Cited by: 172 | Year: 2021
Transformers learn to implement preconditioned gradient descent for in-context learning
K Ahn, X Cheng, H Daneshmand, S Sra
Advances in Neural Information Processing Systems 36, 45614-45650, 2023
Cited by: 145 | Year: 2023
Asymptotic behavior of \ell_p-based Laplacian regularization in semi-supervised learning
A El Alaoui, X Cheng, A Ramdas, MJ Wainwright, MI Jordan
Conference on Learning Theory, 879-906, 2016
Cited by: 118 | Year: 2016
Optimal dimension dependence of the Metropolis-adjusted Langevin algorithm
S Chewi, C Lu, K Ahn, X Cheng, T Le Gouic, P Rigollet
Conference on Learning Theory, 1260-1300, 2021
Cited by: 77 | Year: 2021
Stochastic Gradient and Langevin Processes
X Cheng, D Yin, PL Bartlett, MI Jordan
arXiv preprint arXiv:1907.03215, 2019
Cited by: 58* | Year: 2019
Exploiting optimization for local graph clustering
K Fountoulakis, X Cheng, J Shun, F Roosta-Khorasani, MW Mahoney
arXiv preprint arXiv:1602.01886, 2016
Cited by: 56* | Year: 2016
Restart sampling for improving generative processes
Y Xu, M Deng, X Cheng, Y Tian, Z Liu, T Jaakkola
Advances in Neural Information Processing Systems 36, 76806-76838, 2023
Cited by: 42 | Year: 2023
Linear attention is (maybe) all you need (to understand transformer optimization)
K Ahn, X Cheng, M Song, C Yun, A Jadbabaie, S Sra
arXiv preprint arXiv:2310.01082, 2023
Cited by: 34 | Year: 2023
Transformers implement functional gradient descent to learn non-linear functions in context
X Cheng, Y Chen, S Sra
arXiv preprint arXiv:2312.06528, 2023
Cited by: 31 | Year: 2023
Theory and algorithms for diffusion processes on Riemannian manifolds
X Cheng, J Zhang, S Sra
arXiv preprint arXiv:2204.13665, 2022
Cited by: 7 | Year: 2022
Efficient Sampling on Riemannian Manifolds via Langevin MCMC
X Cheng, J Zhang, S Sra
Advances in Neural Information Processing Systems, 2022
Cited by: 7 | Year: 2022
Fast conditional mixing of MCMC algorithms for non-log-concave distributions
X Cheng, B Wang, J Zhang, Y Zhu
Advances in Neural Information Processing Systems 36, 2024
Cited by: 5 | Year: 2024
The Interplay between Sampling and Optimization
X Cheng
University of California, Berkeley, 2020
Cited by: 5 | Year: 2020
Riemannian Bilevel Optimization
S Dutta, X Cheng, S Sra
arXiv preprint arXiv:2405.15816, 2024
Cited by: 1 | Year: 2024
Graph Transformers Dream of Electric Flow
X Cheng, L Carin, S Sra
arXiv preprint arXiv:2410.16699, 2024
Year: 2024
FLAG n’ FLARE: Fast Linearly-Coupled Adaptive Gradient Methods
X Cheng, F Roosta, S Palombo, P Bartlett, M Mahoney
International Conference on Artificial Intelligence and Statistics, 404-414, 2018
Year: 2018