Search ETDs:
Smith-Waterman Sequence Alignment For Massively Parallel High-Performance Computing Architectures
Steinfadt, Shannon Irene

2010, PHD, Kent State University, College of Arts and Sciences / Department of Computer Science.

This research addresses one of the most often used tools in bioinformatics, sequence alignment. The increasing growth and complexity of high-performance computing as well as the stellar data growth in the bioinformatics field are motivating factors. An associative algorithm for performing quality sequence alignments more efficiently and faster is at the center of the dissertation. SWAMP or Smith-Waterman using Associative Massive Parallelism is the parallel algorithm designed and implemented for the ASC associative SIMD computing model. The theoretical parallel speedup for the algorithm is optimal, and reduces the compute time from O(mn) to O(m+n), where m and n are the length of the input sequences. When m = n, the running time becomes O(n) with a very small constant.

Using the capabilities of ASC, innovative new algorithms that are extensions of the above SWAMP algorithm increase the information returned by the alignment algorithms without decreasing the accuracy of those alignments. Known as SWAMP+, these algorithms that provide a highly sensitive parallelized approach extending traditional pairwise sequence alignment. They are useful for in-depth exploration of sequences, including research in expressed sequence tags, regulatory regions, and evolutionary relationships.


Although the SWAMP suite of algorithms was designed for the associative computing platform, they were implemented on the ClearSpeed CSX 620 parallel SIMD accelerator to obtain realistic metrics. The performance for the compute intensive matrix calculation displayed a speedup of roughly 96 using ClearSpeed's 96 processing elements, thus verifying the possibility of achieving the theoretical speedup mentioned above.

Additional parallel hardware implementations were explored and a cluster-based approach to test the memory-intensive Smith-Waterman across multiple nodes within a cluster was used. This work utilized a tool called JumboMem. It allowed us to store the huge matrix of computations completely in memory for what we believe to be one of the largest instances of Smith-Waterman ever run.


Johnnie W. Baker, PhD (Advisor)
Kenneth Batcher, PhD (Committee Member)
Paul Farrell, PhD (Committee Member)
James Blank, PhD (Committee Member)
134 p.

Recommended Citations

Hide/Show APA Citation

Steinfadt, S. (2010). Smith-Waterman Sequence Alignment For Massively Parallel High-Performance Computing Architectures. (Electronic Thesis or Dissertation). Retrieved from https://etd.ohiolink.edu/

Hide/Show MLA Citation

Steinfadt, Shannon. "Smith-Waterman Sequence Alignment For Massively Parallel High-Performance Computing Architectures." Electronic Thesis or Dissertation. Kent State University, 2010. OhioLINK Electronic Theses and Dissertations Center. 30 May 2015.

Hide/Show Chicago Citation

Steinfadt, Shannon "Smith-Waterman Sequence Alignment For Massively Parallel High-Performance Computing Architectures." Electronic Thesis or Dissertation. Kent State University, 2010. https://etd.ohiolink.edu/

Files

kent1271656353.pdf (3.51 MB) View|Download