Underdamped Langevin MCMC: A non-asymptotic analysis X Cheng, NS Chatterji, PL Bartlett, MI Jordan Conference on learning theory, 300-323, 2018 | 349 | 2018 |
Convergence of Langevin MCMC in KL-divergence X Cheng, P Bartlett Algorithmic Learning Theory, 186-211, 2018 | 224 | 2018 |
Sharp convergence rates for Langevin dynamics in the nonconvex setting X Cheng, NS Chatterji, Y Abbasi-Yadkori, PL Bartlett, MI Jordan arXiv preprint arXiv:1805.01648, 2018 | 188 | 2018 |
Is there an analog of Nesterov acceleration for gradient-based MCMC? YA Ma, NS Chatterji, X Cheng, N Flammarion, PL Bartlett, MI Jordan | 172 | 2021 |
Transformers learn to implement preconditioned gradient descent for in-context learning K Ahn, X Cheng, H Daneshmand, S Sra Advances in Neural Information Processing Systems 36, 45614-45650, 2023 | 145 | 2023 |
Asymptotic behavior of $\ell_p$-based Laplacian regularization in semi-supervised learning A El Alaoui, X Cheng, A Ramdas, MJ Wainwright, MI Jordan Conference on Learning Theory, 879-906, 2016 | 118 | 2016 |
Optimal dimension dependence of the Metropolis-adjusted Langevin algorithm S Chewi, C Lu, K Ahn, X Cheng, T Le Gouic, P Rigollet Conference on Learning Theory, 1260-1300, 2021 | 77 | 2021 |
Stochastic Gradient and Langevin Processes X Cheng, D Yin, PL Bartlett, MI Jordan arXiv preprint arXiv:1907.03215, 2019 | 58* | 2019 |
Exploiting optimization for local graph clustering K Fountoulakis, X Cheng, J Shun, F Roosta-Khorasani, MW Mahoney arXiv preprint arXiv:1602.01886, 2016 | 56* | 2016 |
Restart sampling for improving generative processes Y Xu, M Deng, X Cheng, Y Tian, Z Liu, T Jaakkola Advances in Neural Information Processing Systems 36, 76806-76838, 2023 | 42 | 2023 |
Linear attention is (maybe) all you need (to understand transformer optimization) K Ahn, X Cheng, M Song, C Yun, A Jadbabaie, S Sra arXiv preprint arXiv:2310.01082, 2023 | 34 | 2023 |
Transformers implement functional gradient descent to learn non-linear functions in context X Cheng, Y Chen, S Sra arXiv preprint arXiv:2312.06528, 2023 | 31 | 2023 |
Theory and algorithms for diffusion processes on Riemannian manifolds X Cheng, J Zhang, S Sra arXiv preprint arXiv:2204.13665, 2022 | 7 | 2022 |
Efficient Sampling on Riemannian Manifolds via Langevin MCMC X Cheng, J Zhang, S Sra Advances in Neural Information Processing Systems, 2022 | 7 | 2022 |
Fast conditional mixing of MCMC algorithms for non-log-concave distributions X Cheng, B Wang, J Zhang, Y Zhu Advances in Neural Information Processing Systems 36, 2024 | 5 | 2024 |
The Interplay between Sampling and Optimization X Cheng University of California, Berkeley, 2020 | 5 | 2020 |
Riemannian Bilevel Optimization S Dutta, X Cheng, S Sra arXiv preprint arXiv:2405.15816, 2024 | 1 | 2024 |
Graph Transformers Dream of Electric Flow X Cheng, L Carin, S Sra arXiv preprint arXiv:2410.16699, 2024 | | 2024 |
FLAG n’FLARE: Fast Linearly-Coupled Adaptive Gradient Methods X Cheng, F Roosta, S Palombo, P Bartlett, M Mahoney International Conference on Artificial Intelligence and Statistics, 404-414, 2018 | | 2018 |