Parallel Algorithms for Machine Learning

Moon, Gordon Euhyun

2019, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Machine learning is becoming an integral part of everyday life, so the development of high-performance machine learning algorithms is increasingly significant from the perspectives of performance, efficiency, and optimization. The current solution is to use machine learning frameworks such as TensorFlow, PyTorch, and CNTK, which enable us to utilize specialized architectures such as multi-core CPUs, GPUs, TPUs, and FPGAs. However, while these frameworks offer high productivity, they are not designed for high performance: there is a significant gap between the performance achievable by these frameworks and the peak compute capability of current architectures. For machine learning algorithms to be accelerated on large-scale data, it is essential to develop architecture-aware algorithms. Since many machine learning algorithms are computationally demanding, parallelization has garnered considerable interest. To achieve high performance, data locality optimization is critical, since the cost of data movement from memory is significantly higher than the cost of performing arithmetic/logic operations on current processors. However, the design and implementation of new machine learning algorithms has been largely driven by a focus on computational complexity.

In this dissertation, the parallelization of three extensively used machine learning algorithms, Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and Word2Vec, is addressed with a focus on minimizing data movement overhead through the memory hierarchy, using techniques such as 2D-tiling and rearrangement of data and computation. For each parallel algorithm, a systematic analysis of the algorithm's data access patterns and data movements is performed, and suitable algorithmic adaptations and parallelization strategies are developed for both multi-core CPU and GPU platforms.

Experimental results on large-scale datasets demonstrate that our new parallel algorithms achieve a significant reduction in data movement to and from main memory and improved performance over existing state-of-the-art algorithms.
P. Sadayappan (Advisor)
Srinivasan Parthasarathy (Committee Member)
Eric Fosler-Lussier (Committee Member)
159 p.
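
To illustrate the 2D-tiling idea the abstract refers to, here is a generic sketch (not the dissertation's actual kernels) of a blocked matrix multiply in Python/NumPy: by operating on small sub-blocks, each tile's working set stays resident in cache, reducing traffic to main memory relative to a naive triple loop. The function name and tile size below are illustrative choices, not taken from the dissertation.

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    """Blocked (2D-tiled) matrix multiply.

    Processing tile x tile sub-blocks improves data locality: each
    block of A, B, and C is reused while it is still in cache, so
    far fewer loads go all the way out to main memory.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m), dtype=A.dtype)
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for l0 in range(0, k, tile):
                # Accumulate the contribution of one tile pair into
                # the corresponding output tile (slices clip at edges,
                # so ragged boundary tiles are handled automatically).
                C[i0:i0 + tile, j0:j0 + tile] += (
                    A[i0:i0 + tile, l0:l0 + tile]
                    @ B[l0:l0 + tile, j0:j0 + tile]
                )
    return C
```

The same blocking pattern underlies cache-aware CPU kernels and GPU shared-memory tiling; choosing the tile size so that three tiles fit in the target level of the memory hierarchy is the key tuning decision.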

Recommended Citations

  • Moon, G. E. (2019). Parallel Algorithms for Machine Learning [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1561980674706558

    APA Style (7th edition)

  • Moon, Gordon. Parallel Algorithms for Machine Learning. 2019. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1561980674706558.

    MLA Style (8th edition)

  • Moon, Gordon. "Parallel Algorithms for Machine Learning." Doctoral dissertation, Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1561980674706558

    Chicago Manual of Style (17th edition)