Files
MS_Thesis_Radha_Gulhane.pdf (2.05 MB)
Accelerated and Memory-Efficient Distributed Deep Learning: Leveraging Quantization, Parallelism Techniques, and Mix-Match Runtime Communication
Author Info
Gulhane, Radha
ORCID® Identifier
http://orcid.org/0009-0009-2591-1082
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1713381834648517
Abstract Details
Year and Degree
2024, Master of Science, Ohio State University, Computer Science and Engineering.
Abstract
In recent years, Deep Learning (DL) has seen significant research and development due to its efficiency and broad applicability across diverse domains, including Computer Vision and Large Language Models. However, the architecture of large DL models, with their dense layers, makes them compute- and memory-intensive. Distributed Deep Learning (Distributed DL) has emerged as the successful adaptation for accelerating and enabling training and inference of large-scale DL models, encompassing various parallelization approaches, inference and training techniques, and communication optimization strategies to enhance performance. In this thesis, we focus on accelerated and memory-efficient techniques to optimize distributed training and inference, broadly categorized into three approaches:

1. Inference for scaled images using quantization, achieving a speedup of 6.5x with integer-only quantization and 1.58x with half-precision, with less than 1% accuracy degradation.

2. MPI4DL: a Distributed DL parallelism framework encompassing various parallelism techniques, with integral components such as Spatial Parallelism, Bidirectional Parallelism, and Hybrid Parallelism.

3. Communication optimization by leveraging MCR-DL: a distributed module for DL frameworks with support for mixed-backend communication, dynamic selection of the optimal backend, and communication optimization enhancements such as compression and tensor fusion.
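To make the half-precision idea above concrete, here is a minimal PyTorch timing sketch comparing FP32 and FP16 inference on a large input image. It is only an illustration under stated assumptions, not the thesis's MPI4DL or MCR-DL code: the model (torchvision ResNet-50), the 1024x1024 input size, and the availability of a CUDA GPU are assumptions made for the sketch.

# Minimal sketch (assumptions: torchvision ResNet-50, 1024x1024 input, CUDA GPU).
# Illustrates FP32 vs. FP16 inference timing; not the thesis's actual code.
import time
import torch
import torchvision.models as models

def time_inference(model, x, iters=20):
    # Warm up, then average the latency of repeated forward passes.
    with torch.no_grad():
        for _ in range(3):
            model(x)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()
    return (time.time() - start) / iters

device = "cuda"  # half-precision gains assume GPU execution
x = torch.randn(1, 3, 1024, 1024, device=device)  # large ("scaled") input image, size assumed
model = models.resnet50(weights=None).to(device).eval()

t_fp32 = time_inference(model, x)
t_fp16 = time_inference(model.half(), x.half())   # cast weights and input to FP16
print(f"FP32: {t_fp32 * 1e3:.1f} ms  FP16: {t_fp16 * 1e3:.1f} ms  "
      f"speedup: {t_fp32 / t_fp16:.2f}x")

On GPUs with tensor cores, the FP16 path typically shows a speedup in the same spirit as the 1.58x reported in the abstract; integer-only (INT8) quantization additionally requires a calibration or conversion step that this sketch does not cover.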
Committee
Prof. Dhabaleswar K. Panda (Advisor)
Dr. Aamir Shafi (Committee Member)
Prof. Hari Subramoni (Committee Member)
Pages
63 p.
Subject Headings
Computer Science
Keywords
Deep Learning, Distributed Deep Learning, Quantization, Inference, Parallelism Techniques, Communication Optimization
Recommended Citations
APA Style (7th edition)
Gulhane, R. (2024). Accelerated and Memory-Efficient Distributed Deep Learning: Leveraging Quantization, Parallelism Techniques, and Mix-Match Runtime Communication [Master's thesis, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1713381834648517
MLA Style (8th edition)
Gulhane, Radha. Accelerated and Memory-Efficient Distributed Deep Learning: Leveraging Quantization, Parallelism Techniques, and Mix-Match Runtime Communication. 2024. Ohio State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1713381834648517.
Gulhane, Radha. "Accelerated and Memory-Efficient Distributed Deep Learning: Leveraging Quantization, Parallelism Techniques, and Mix-Match Runtime Communication." Master's thesis, Ohio State University, 2024. http://rave.ohiolink.edu/etdc/view?acc_num=osu1713381834648517
Chicago Manual of Style (17th edition)
Document number:
osu1713381834648517
Download Count:
212
Copyright Info
© 2024, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.