Doctor of Philosophy, The Ohio State University, 2023, Computer Science and Engineering
In many biomedical contexts, multiple types of BDMs (e.g., metabolites, genes, proteins, chromatin states, and DNA methylation sites) associate with one another directly or indirectly in groups or chains to impact phenotype or outcome. Significant associations often aid data interpretation and the generation of novel hypotheses, motivating researchers to identify the most impactful groups of BDM associations between multiple types of data. However, many state-of-the-art models either focus on individual BDM associations independently of one another or implement black-box predictors of outcome that are agnostic to BDM associations. Moreover, collecting multiple types of BDMs in a subject (i.e., multi-omics data) is not always feasible, motivating the need to infer one omic type of data from another. This dissertation tackles the related problems of (1) using inter-omics approaches to infer BDM types from other related BDM types in specific contexts, (2) finding groups of multi-omics BDMs associated with outcome through multivariate statistical analysis and graph-based predictive models, and (3) interpreting groups of multi-omics BDMs associated with outcome in a functional context using existing knowledge.
This dissertation addresses the problem of using inter-omics approaches to infer BDM types from other related BDM types in two domains of note: (1) regulatory element annotation, and (2) protein abundance prediction. First, this dissertation introduces the Self Organizing Map with Variable Neighborhoods (SOM-VN), designed to annotate regulatory elements across whole human genomes using shapes found in chromatin accessibility assays. The novelty of SOM-VN is that, while most computational tools for annotating regulatory elements require a suite of resource-intensive experimental assays, SOM-VN uses only a single assay to annotate regulatory elements. SOM-VN is validated on chromatin accessibility assays from multiple H1, HeLa, A549, and GM12878 ce
Committee: Raghu Machiraju (Advisor); Ewy Mathé (Advisor); Andrew Perrault (Committee Member); Rachel Kopec (Committee Member); Rachel Kelly (Committee Member)
Subjects: Applied Mathematics; Artificial Intelligence; Bioinformatics; Biomedical Research; Biostatistics; Computer Science