Doctor of Philosophy, The Ohio State University, 2015, Computer Science and Engineering
The rate of improvement in data access costs continues to lag behind
the improvement in computational rates. Therefore, characterizing and
enhancing data locality in applications is extremely important. In
addition, load balancing plays a significant role in parallel
application performance.
This is particularly challenging for irregular and
unstructured applications. In this dissertation, we address both the
efficient parallel characterization of the data locality of programs
and the development of parallel applications with enhanced data
locality and load balancing.
First, we speed up reuse distance analysis through
parallelization. Reuse distance can directly predict the cache hit
ratio for a fully associative cache and be used in various program
optimization techniques like loop tiling, code reordering, cache
sharing and cache partitioning to improve locality. Though reuse
distance analysis is very useful, it is also costly. We develop the
first parallel reuse distance analysis algorithm (Parda). Parda
achieves speedups of 13x to 50x on various SPEC CPU2006 benchmarks
compared to a state-of-the-art accurate sequential reuse distance
analysis algorithm.
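To make the notion of reuse distance concrete, the following is a minimal sequential sketch, not Parda itself. It assumes the usual definition (the number of distinct addresses accessed between two consecutive accesses to the same address) and uses a naive LRU stack; the function names reuse_distances and hit_ratio, and the parameter cache_lines, are hypothetical and chosen only for illustration.

```python
import math


def reuse_distances(trace):
    """Return the reuse distance of every access in `trace`.

    The reuse distance of an access is the number of distinct addresses
    touched since the previous access to the same address. First-time
    accesses are assigned an infinite distance.
    """
    stack = []       # naive LRU stack: most recently used address is last
    distances = []
    for addr in trace:
        if addr in stack:
            pos = stack.index(addr)
            # Every address above `pos` was touched since the last access.
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(math.inf)
        stack.append(addr)
    return distances


def hit_ratio(distances, cache_lines):
    """Predicted hit ratio of a fully associative LRU cache.

    An access hits exactly when its reuse distance is smaller than the
    number of lines the cache can hold (an assumed unit of capacity).
    """
    if not distances:
        return 0.0
    hits = sum(1 for d in distances if d < cache_lines)
    return hits / len(distances)


if __name__ == "__main__":
    trace = list("abcabda")
    dists = reuse_distances(trace)
    print(dists)
    print(hit_ratio(dists, cache_lines=3))
```

The list-based stack makes each access cost time proportional to the number of distinct addresses seen so far, which illustrates why accurate reuse distance analysis is expensive on long traces and why faster, parallel algorithms are of interest.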
Second, we use reuse distance analysis to construct a locality-based
performance model to analyze and enhance the performance of two
production scientific applications, QMCPACK and QWalk. These quantum
Monte Carlo (QMC) applications use a very large read-only table to store
spline interpolation coefficients, and accesses to the table are
generated at random based on the state of the Monte Carlo
simulation. Currently, QMC applications such as QWalk and QMCPACK
replicate this table for every process or node, which limits
scalability because increasing the number of processors does not
enable larger systems to be run. We present a partitioned global
address space (PGAS) approach to transparently managing this data
using Global Arrays in a manner that allows ...
Committee: Dr. P. Sadayappan (Advisor); Dr. Srinivasan Parthasarathy (Committee Member); Dr. Atanas Rountev (Committee Member)
Subjects: Computer Science