Moler, James C. Optimizing Approaches for Sensitive, High Performance Clustering of Gene Expressions
Master of Science, Miami University, 2011, Computer Science and Systems Analysis
This thesis presents several new algorithmic approaches to the problem of clustering conventional expressed sequence tags (ESTs) and high throughput gene expression data, which are implemented in the software tool PEACE. The d2 algorithm for sequence comparison is improved and enhanced with a novel two-pass extension, and a minimum spanning tree-based algorithm is used to cluster ESTs, providing an efficient and accurate solution. Furthermore, in order to address the unique challenges of high throughput sequencing technologies such as 454, Illumina, and SOLiD sequencing, an adaptive d2 algorithm is introduced to handle variations in fragment length. The resulting tool compares favorably with other leading tools in the literature, including WCD, CAP3, and TGICL, on both EST and next-generation sequencing (NGS) data.


John Karro, PhD (Advisor); Dhananjai Rao, PhD (Committee Member); Mufit Ozden, PhD (Committee Member)


Bioinformatics; Computer Science


clustering gene expressions; EST