I am an assistant professor in the Department of Biostatistics at the Harvard T.H. Chan School of Public Health.
My research aims to develop a new generation of inference methods and theory for modern statistics and machine learning, especially focusing on:
- Combinatorial functionals like connectivity, degree, and other topological structures of graphs, ranking, clustering, hypergraphs, etc.;
- Complex data structures like high dimensionality, heterogeneity, nonlinearity, heavy-tailedness, time-dependency, etc.;
- Complicated algorithms like distributed algorithms, nonconvex optimization, kernel methods, etc.
Papers [by Topic]
ARCH: Large-scale Knowledge Graph via Aggregated Narrative Codified Health Records Analysis
Submitted, 2023. |
Fast Distributed Principal Component Analysis of Large-Scale Federated Data
Submitted, 2023. |
Knowledge Graph Embedding with Electronic Health Records Data via Latent Graphical Block Model
Submitted, 2023. |
Multi-source Learning via Completion of Block-wise Overlapping Noisy Matrices
Under revision, 2023. |
Federated Offline Reinforcement Learning
Under revision, 2023. |
Combinatorial-Probabilistic Trade-Off: Community Properties Test in the Stochastic Block Models
International Conference on Learning Representations (spotlight paper), 2023. |
Inference on the optimal assortment in the multinomial logit model
ACM Conference on Economics and Computation, 2023. |
Lagrangian Inference for Ranking Problems
Operations Research 71.1 (2023): 202-223. |
Graph over-parameterization: Why the graph helps the training of deep graph convolutional network
Neurocomputing 534 (2023): 77-85. |
Multimodal representation learning for predicting molecule–disease relations
Bioinformatics, 2023. |
StarTrek: Combinatorial Variable Selection with False Discovery Rate Control
Under revision, 2023. |
Multiview Incomplete Knowledge Graph Integration with Application to Cross-institutional EHR Data Harmonization
(*: co-senior author) Journal of Biomedical Informatics 133 (2022): 104147. |
Penalized estimation of frailty-based illness–death models for semi-competing risks
Biometrics, 1–13, 2022 |
Progression of traction bronchiectasis/bronchiolectasis in interstitial lung abnormalities is associated with increased all-cause mortality: Age Gene/Environment Susceptibility-Reykjavik Study.
European Journal of Radiology Open 8, 100334, 2022 |
Clinical Knowledge Extraction via Sparse Embedding Regression (KESER) with Multi-Center Large Scale Electronic Health Record Data.
NPJ digital medicine 4, no. 1 (2021): 151. |
Interstitial lung abnormalities in patients with stage I non-small cell lung cancer are associated with shorter overall survival: the Boston lung cancer study.
Cancer Imaging 21, no. 1: 1-7. |
Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation
NeurIPS 2020. |
Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model
International Conference on Machine Learning, pp. 4901-4910. |
Estimating and inferring the maximum degree of stimulus-locked time-varying brain connectivity networks.
Biometrics. Jun;77(2):379-390. |
Combinatorial Inference for Graphical Models
(*: equal contribution) Annals of Statistics, 47(2), pp.795-827. [Arxiv] |
Distributed Testing and Estimation under Sparse High Dimensional Models
(alphabetical order) Annals of Statistics, 46(3), 1352-1382. [Arxiv] |
Post-Regularization Inference for Dynamic Nonparanormal Graphical Models
Journal of Machine Learning Research, 2018 [Arxiv] |
Provable Sparse Tensor Decomposition
Journal of the Royal Statistical Society: Series B, 2016 [Arxiv][Link] |
Nonparametric Heterogeneity Testing For Massive Data
Technical report [Arxiv] |
Graphical Fermat's Principle and Triangle-Free Graph Estimation
Technical report [Arxiv] |
Symmetry, Saddle Points, and Global Geometry of Nonconvex Matrix Factorization
IEEE Transactions on Information Theory, 65(6):3489-3514, 2019. [Arxiv] |
Adaptive Inferential Method for Monotone Graph Invariants
Technical Report, 2017 [Arxiv] [R package] ICSA 2017 Student Paper Award |
Inter-Subject Analysis: Inferring Sparse Interactions with Dense Intra-Graphs
Journal of the American Statistical Association (2020): 1-57. [Arxiv] ICSA 2017 Student Paper Award |
Kernel Meets Sieve: Post-Regularization Confidence Bands for Sparse Additive Model
Journal of the American Statistical Association, 92:4, pages 875-893. [Arxiv] [PDF] ASA Best Student Paper in Nonparametric Statistics Finalist |
Robust Scatter Matrix Estimation for High Dimensional Distributions with Heavy Tails
IEEE Transactions on Information Theory. vol. 67, no. 8, pp. 5283-5304, Aug. 2021, doi: 10.1109/TIT.2021.3088381. [PDF] |
Application of the Strictly Contractive Peaceman-Rachford Splitting Method to Multi-block Separable Convex Programming
(alphabetical order) Splitting Methods in Communication, Imaging, Science, and Engineering (In Roland Glowinski, Stanley J. Osher, Wotao Yin (Eds.)), Springer, 2017 [Optimization Online] |