Doctor of Philosophy, The Ohio State University, 2005, Statistics
Microarray technology, a biotechnology that allows simultaneous monitoring of the expression of thousands of genes, has revolutionized biological and genomic research and holds promise for many practical applications, such as drug targeting, gene profiling, disease diagnosis and prognosis, and pharmacogenomics. Along with its unprecedented potential, microarray technology presents numerous challenges for the statistical analysis of gene expression data. Many sources of extraneous variation are present in a microarray experiment, and adjusting for them is critical to separating biological signals from artifacts. Moreover, microarray gene expression data are typically of extremely large dimension, consisting of tens of thousands of observations, so computational efficiency in statistical analysis is crucial. For testing the significance of biological signals, multiplicity adjustment is indispensable. We propose a modeling approach that allows flexible experimental design while providing accurate estimation and easy multiplicity-adjusted inference. This modeling approach is suitable for various types of microarrays, including both cDNA and oligonucleotide microarrays. The statistical modeling and multiplicity-adjusted inference are integrated into an R package, MultiArray, which provides a computationally efficient environment. Examples from real microarray experiments show that our modeling approach and MultiArray outperform other popular packages in both detecting differences and establishing equivalence in gene expression.
Committee: Jason Hsu (Advisor)
Subjects: Statistics