Doctor of Philosophy (PhD), Ohio University, 2018, Electrical Engineering & Computer Science (Engineering and Technology)
The central dogma of molecular biology states that DNA is transcribed into RNA, which is then translated into proteins. The flow of genetic information in time and space is orchestrated by complex regulatory mechanisms. With the advent of modern biotechnology, our understanding of genomics, transcriptomics, and proteomics has deepened. However, bioinformatic tools for biomarker discovery across the different types of omics are still lacking. To address this gap, we developed novel algorithmic methods for three primary omics levels. Proteins are the main executors of cellular functions. At the proteomic level, we developed machine learning models for early diagnosis of type 2 diabetes based on the abundance of post-translational modifications (PTMs). Our models can interpret mass spectrometry data and perform integrative analysis together with clinical parameters such as HbA1c and fasting plasma glucose. We identified glycated lysine-141 of haptoglobin as a potential biomarker. Gene regulation is conducted by cis-regulatory elements and transcription factors. At the transcriptomic level, we developed the Emotif Alpha bioinformatic pipeline for DNA motif discovery and selection using RNA-seq, ChIP-seq, and gene homology data. We applied this pipeline to multiple species, including human, mouse, plants, and nematodes. The discovered motifs were validated using a Gaussia luciferase (GLuc) reporter. The 3D genome architecture in the nucleus involves the spatial organization of nuclear bodies such as the histone locus body (HLB). At the 3D genomics level, we developed a bioinformatic pipeline for characterizing locus-specific chromatin interactions. Specifically, we integrated Hi-C, GAM, and SPRITE data and identified a complex chromatin organization signature of the Hist1 cluster in mouse embryonic stem cells (mESCs). In addition, we performed network hub analysis and identified hubs with diverse functions.
These hubs contained not only histone genes and other active genes, b…
Committee: Lonnie Welch (Advisor); Razvan Bunescu (Committee Member); Liu Jundong (Committee Member); Frank Drews (Committee Member); Allan Showalter (Committee Member); Shiyong Wu (Committee Member)
Subjects: Bioinformatics; Computer Science