Parametric and Non-Parametric Methods for Statistical Network Inference

2023, Doctor of Philosophy, Ohio State University, Statistics.
Network data, prevalent in domains such as social networks, brain imaging, and transportation, have attracted significant research interest from statisticians. This dissertation concentrates on emerging challenges in network analysis, including model-free network comparison, computationally efficient methods for large and sparse networks, and conformal prediction in networks with missing values. Our first focus is the β-model, a pivotal degree-driven model. Current solutions do not scale efficiently, and their theoretical foundations are often restricted to dense networks. In this dissertation, we propose a new ℓ2-penalized MLE scheme, coupled with a novel algorithm capable of handling large, sparse networks efficiently. We present new error bounds for β-models under much weaker assumptions than the best known results in the literature; we also establish new lower bounds and new asymptotic normality results; under proper parameter sparsity assumptions, we show the first local rate-optimality result in ℓ2 norm; distinct from the existing literature, our results cover both small and large regularization scenarios and reveal their distinct asymptotic dependency structures. Next, we delve into the intricate task of two-sample hypothesis testing for network comparisons. Major challenges include potentially different sizes and sparsity levels; non-repeated observations of adjacency matrices; computational scalability; and theoretical investigations, especially on finite-sample accuracy and minimax optimality. We propose the first provably higher-order accurate two-sample inference method by comparing network moments. We make weak modeling assumptions and can effectively handle networks of different sizes and sparsity levels. We establish strong finite-sample theoretical guarantees, including rate-optimality properties. Our method is easy to implement and fast to compute.
We also devise a novel nonparametric framework of offline hashing and fast querying that is particularly effective for maintaining and querying very large network databases. We then pivot to U-statistics, cornerstone tools in statistical learning that are beleaguered by scalability challenges. Contrary to the conventional focus on power analysis, we shed light on risk control accuracy. In this dissertation, we present a statistical inference procedure that assures higher-order accurate risk control for incomplete U-statistics. This result reveals the trade-off between risk control accuracy and speed, a first in the literature. The study encompasses both non-degenerate and degenerate U-statistics and network moments, with empirical validation of the theory. Lastly, we focus on conformal entry prediction in row/column-exchangeable matrices. Though conformal prediction is a renowned distribution-free method used in statistical learning, its application in the matrix context is relatively uncharted. We precisely define this problem, differentiate it from related ones, and set boundaries between feasible and infeasible objectives. We introduce two algorithms: one rapidly emulates full conformal prediction, and the other uses algorithmic stability for swift computation. Both ensure coverage validity even with arbitrary missing patterns, and we detail the effects of missing data on prediction accuracy.
Yuan Zhang (Advisor)
Subhadeep Paul (Committee Member)
Yoonkyung Lee (Committee Member)
Dena Asta (Committee Member)
215 p.

Recommended Citations

  • Shao, M. (2023). Parametric and Non-Parametric Methods for Statistical Network Inference [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1699978645515057

    APA Style (7th edition)

  • Shao, Meijia. Parametric and Non-Parametric Methods for Statistical Network Inference. 2023. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1699978645515057.

    MLA Style (8th edition)

  • Shao, Meijia. "Parametric and Non-Parametric Methods for Statistical Network Inference." Doctoral dissertation, Ohio State University, 2023. http://rave.ohiolink.edu/etdc/view?acc_num=osu1699978645515057

    Chicago Manual of Style (17th edition)