PhD, University of Cincinnati, 2022, Business: Business Administration
High-dimensional data analysis has played an essential role in modern scientific discoveries. Identifying important predictors among many candidate features is a challenging yet crucial problem. This dissertation consists of two essays that study inference for high-dimensional linear models and distress risk prediction in finance.
Statistical inference for high-dimensional regression coefficients is challenging because the uncertainty introduced by the model selection procedure is hard to quantify. A critical question remains unsettled: how can model selection uncertainty be embedded into simultaneous inference for the model coefficients, and is it even possible? In Essay I, we propose a new type of simultaneous confidence intervals, called sparsified simultaneous confidence intervals. Our intervals divide the covariates into three groups (unimportant, plausible, and significant), offering more insight into the true model. Specifically, the upper and lower bounds of the intervals for unimportant covariates are shrunk to zero (i.e., [0,0]), meaning these covariates should be excluded from the final model, while the intervals for plausible covariates contain zero (e.g., [-1,1] or [0,1]) and those for significant covariates do not (e.g., [2,3]). The proposed method can be coupled with various selection procedures, making it well suited for comparing their uncertainty. We establish desirable asymptotic properties for the proposed method, develop intuitive graphical tools for visualization, and demonstrate its superior performance through simulation and real data analysis.
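The three-group reading of the intervals can be illustrated with a minimal Python sketch. It only shows how a finished interval would be interpreted, not how the dissertation constructs the intervals; the function name `classify_interval` is a hypothetical helper, and the example bounds are the ones quoted in the abstract.

```python
def classify_interval(lower: float, upper: float) -> str:
    """Label a covariate from its interval, using the three groups in the abstract.

    Hypothetical helper for illustration only; it does not compute the intervals.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower == 0.0 and upper == 0.0:
        return "unimportant"  # interval shrunk to [0, 0]
    if lower <= 0.0 <= upper:
        return "plausible"    # interval contains zero
    return "significant"      # interval excludes zero


# Example bounds taken from the abstract: [0,0], [-1,1], [0,1], [2,3].
intervals = [(0.0, 0.0), (-1.0, 1.0), (0.0, 1.0), (2.0, 3.0)]
labels = [classify_interval(lo, hi) for lo, hi in intervals]
print(labels)
```

Under this reading, [0,0] is labeled unimportant, [-1,1] and [0,1] plausible, and [2,3] significant.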
Essay II studies distress risk prediction, which is vitally important for risk management and asset pricing. In this essay, we distinguish, for the first time, two types of financial distress events: bankruptcy and delisting due to other failures. The two are closely related yet sharply different. Using a state-of-the-art adaptive Lasso ...
Committee: Yan Yu Ph.D. (Committee Member); Chen Xue Ph.D. (Committee Member); Yichen Qin Ph.D. (Committee Member); Dungang Liu Ph.D. (Committee Member)
Subjects: Statistics