I am an assistant professor in the Department of Biostatistics at the Harvard T.H. Chan School of Public Health.
My research aims to develop a new generation of inference methods and theory for modern statistics and machine learning, especially focusing on:
- Combinatorial functionals such as connectivity, degree, and other topological structures of graphs, ranking, clustering, hypergraphs, etc.;
- Complex data structures such as high dimensionality, heterogeneity, nonlinearity, heavy-tailedness, time dependency, etc.;
- Complicated algorithms like distributed algorithms, nonconvex optimization, kernel methods, etc.
Papers [by Topic]
Lagrangian Inference for Ranking Problems
Operations Research, 2022+. |
StarTrek: Combinatorial Variable Selection with False Discovery Rate Control
Annals of Statistics, Under revision, 2022+ |
Multiview Incomplete Knowledge Graph Integration with Application to Cross-institutional EHR Data Harmonization
(*: co-senior author) Journal of Biomedical Informatics, Under revision, 2022+ |
Progression of traction bronchiectasis/bronchiolectasis in interstitial lung abnormalities is associated with increased all-cause mortality: Age Gene/Environment Susceptibility-Reykjavik Study.
European Journal of Radiology Open, 8, 100334. |
Interstitial lung abnormalities in patients with stage I non-small cell lung cancer are associated with shorter overall survival: the Boston lung cancer study.
Cancer Imaging, 21(1), 1-7. |
Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation
NeurIPS 2020. |
Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model
In International Conference on Machine Learning, pp. 4901-4910 |
Estimating and inferring the maximum degree of stimulus-locked time-varying brain connectivity networks.
Biometrics. Jun;77(2):379-390. |
Combinatorial Inference for Graphical Models
(*: equal contribution) Annals of Statistics, 47(2), pp.795-827. [Arxiv] |
Distributed Testing and Estimation under Sparse High Dimensional Models
(alphabetical order) Annals of Statistics, 46(3), 1352-1382. [Arxiv] |
Post-Regularization Inference for Dynamic Nonparanormal Graphical Models
Journal of Machine Learning Research, to appear [Arxiv] |
Provable Sparse Tensor Decomposition
Journal of the Royal Statistical Society: Series B, 2016 [Arxiv][Link] |
Nonparametric Heterogeneity Testing For Massive Data
Technical report [Arxiv] |
Graphical Fermat's Principle and Triangle-Free Graph Estimation
Technical report [Arxiv] |
Symmetry, Saddle Points, and Global Geometry of Nonconvex Matrix Factorization
IEEE Transactions on Information Theory, 65(6):3489-3514, 2019. [Arxiv] |
Adaptive Inferential Method for Monotone Graph Invariants
Technical report, 2017 [Arxiv] [R package] ICSA 2017 Student Paper Award |
Inter-Subject Analysis: Inferring Sparse Interactions with Dense Intra-Graphs
Journal of the American Statistical Association (2020): 1-57. [Arxiv] ICSA 2017 Student Paper Award |
Kernel Meets Sieve: Post-Regularization Confidence Bands for Sparse Additive Model
Journal of the American Statistical Association, 92:4, pages 875-893. [Arxiv] [PDF] ASA Best Student Paper in Nonparametric Statistics Finalist |
Robust Scatter Matrix Estimation for High Dimensional Distributions with Heavy Tails
IEEE Transactions on Information Theory. vol. 67, no. 8, pp. 5283-5304, Aug. 2021, doi: 10.1109/TIT.2021.3088381. [PDF] |
Application of the Strictly Contractive Peaceman-Rachford Splitting Method to Multi-block Separable Convex Programming
(alphabetical order) Splitting Methods in Communication, Imaging, Science, and Engineering (In Roland Glowinski, Stanley J. Osher, Wotao Yin (Eds.)), Springer, 2017 [Optimization Online] |