Application and Development of Novel Methods for Pathway Analysis and Visualization of the LINCS L1000 Dataset

2021, PhD, University of Cincinnati, Medicine: Biostatistics (Environmental Health).
The LINCS L1000 dataset is a large-scale compendium of cell line-specific transcriptional responses to cellular perturbation, established to provide mechanistic and circuit-level insights into cancer biology. The undertaking is a scaled-up version of the Connectivity Map (CMap) project, whose goal was to connect transcriptional signatures of the downstream effects of genetic and small-molecule perturbations in a high-throughput yet cost-effective manner. This was accomplished by profiling a reduced representation of the human transcriptome: nearly 1,000 landmark transcripts whose expression is predictive of roughly 80% of non-measured genes. Although the choice to measure a subset of the transcriptome was primarily cost-based, reducing the representation of transcriptional data is a common method for amplifying the signal amid the noisy background of large datasets. It can also make data amenable to a variety of bioinformatics analyses, for example, when lists of genes and their directions of regulation are derived from continuously valued measurements subjected to a significance-based threshold. In the work presented in this document, we subject the records in the L1000 dataset to a thresholding procedure and explore how connections between over 2,000 common genetic perturbations differ across a core set of seven cancer cell lines. Specifically, we frame the connections as edges between nodes in a novel adaptation of pathway-level analysis. We begin by conducting a simulation study to identify the data-generating mechanism that reproduces our data of interest with the least bias. This is followed by a power analysis to assess the appropriate threshold for edge-based measurements in our dataset.
We then demonstrate how these measurements can be incorporated into the topology of cellular signaling pathways and introduce an R Bioconductor package that easily integrates this type of data into pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG), one of the most widely known online repositories for biological pathways. Finally, we conduct an edge set enrichment analysis of our data, which applies the well-known methodology of gene set enrichment analysis to this novel edge data type.
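As a rough illustration of what a significance-based threshold on continuously valued measurements can look like, the sketch below converts per-gene z-scores into direction-of-regulation calls. It is not the procedure used in the dissertation: the gene names, the use of z-scores, the normal approximation, and the cutoff `alpha = 0.05` are all assumptions made for the example.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z-score under a standard normal null (assumed)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def threshold_signature(z_scores: dict, alpha: float = 0.05) -> dict:
    """Map each gene to +1 (up), -1 (down), or 0 (not significant).

    z_scores: hypothetical gene -> z-score mapping for one perturbation.
    alpha:    hypothetical significance cutoff, not taken from the source.
    """
    calls = {}
    for gene, z in z_scores.items():
        if two_sided_p(z) < alpha:
            calls[gene] = 1 if z > 0 else -1
        else:
            calls[gene] = 0
    return calls


# Hypothetical signature with placeholder gene names.
signature = {"GENE_A": 3.1, "GENE_B": -2.4, "GENE_C": 0.7}
calls = threshold_signature(signature)
print(calls)
```

Reducing each signature to signed calls in this way discards magnitude information, which is the trade-off the abstract alludes to when it describes reduced representations as a way of amplifying signal against a noisy background.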
Mario Medvedovic, Ph.D. (Committee Chair)
Marepalli Rao, Ph.D. (Committee Member)
John Reichard, PharmD Ph.D. (Committee Member)
Heidi Sucharew, Ph.D. (Committee Member)
200 p.

Recommended Citations

  • White, S. (2021). Application and Development of Novel Methods for Pathway Analysis and Visualization of the LINCS L1000 Dataset [Doctoral dissertation, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1623241379918016

    APA Style (7th edition)

  • White, Shana. Application and Development of Novel Methods for Pathway Analysis and Visualization of the LINCS L1000 Dataset. 2021. University of Cincinnati, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1623241379918016.

    MLA Style (8th edition)

  • White, Shana. "Application and Development of Novel Methods for Pathway Analysis and Visualization of the LINCS L1000 Dataset." Doctoral dissertation, University of Cincinnati, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1623241379918016

    Chicago Manual of Style (17th edition)