Doctor of Philosophy, The Ohio State University, 2019, Computer Science and Engineering
Recent advances in computational science and high-performance computing have enabled scientists to design high-resolution computational models that simulate a variety of real-world physical phenomena. To gain key scientific insights into the underlying phenomena, it is important to analyze and visualize the output data produced by such simulations. However, large-scale scientific simulations often produce output data ranging in size from a few hundred gigabytes to terabytes or even petabytes. Analyzing and visualizing such large-scale simulation data is not trivial. Moreover, scientific datasets are often multifaceted (multivariate, multi-run, multi-resolution, etc.), which can introduce additional complexity into analysis and visualization.
This dissertation addresses three broad categories of data analysis and visualization challenges: (i) multivariate distribution-based data summarization, (ii) uncertainty analysis in ensemble simulation data, and (iii) simulation parameter analysis and exploration. We proposed statistical and machine learning-based approaches to overcome these challenges.
A common strategy for dealing with large-scale simulation data is to partition the simulation domain and create data summaries in the form of statistical probability distributions. Storing compact statistical data summaries instead of high-resolution raw data reduces storage overhead and alleviates I/O bottlenecks. However, for multivariate simulation data, using standard multivariate distributions to create data summaries is not feasible. Therefore, we proposed a flexible copula-based multivariate distribution modeling strategy to create multivariate data summaries during simulation execution time (i.e., in-situ data modeling). The resulting data summaries can subsequently be used to perform scalable post-hoc analysis and visualization.
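As a rough illustration of the idea of a copula-based summary, the sketch below stores per-variable marginals separately from the dependence structure for one partition of the domain. It is a minimal sketch under stated assumptions, not the dissertation's implementation: it assumes a Gaussian copula with empirical quantile marginals, and the names `summarize_block` and `sample_from_summary`, the quantile count, and the synthetic test data are all hypothetical.

```python
import numpy as np
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def summarize_block(block, n_quantiles=33):
    """Summarize one partition (n_points x n_vars) as marginals + correlation.

    Assumption: Gaussian copula; marginals kept as a fixed set of quantiles.
    """
    n, _ = block.shape
    probs = np.linspace(0.0, 1.0, n_quantiles)
    # Marginals: quantiles per variable, shape (n_quantiles, n_vars).
    quantiles = np.quantile(block, probs, axis=0)
    # Dependence: rank-transform each variable into (0, 1), map the ranks
    # to normal scores, and keep only their correlation matrix.
    ranks = block.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n
    u = ranks / (n + 1.0)
    z = np.vectorize(_STD_NORMAL.inv_cdf)(u)
    corr = np.corrcoef(z, rowvar=False)
    return {"probs": probs, "quantiles": quantiles, "corr": corr}


def sample_from_summary(summary, n_samples, rng):
    """Draw multivariate samples post hoc from a stored summary."""
    d = summary["corr"].shape[0]
    # Correlated standard normals -> uniforms via the normal CDF.
    z = rng.multivariate_normal(np.zeros(d), summary["corr"], size=n_samples)
    u = np.vectorize(_STD_NORMAL.cdf)(z)
    # Uniforms -> original units via each variable's inverse empirical CDF.
    out = np.empty_like(u)
    for j in range(d):
        out[:, j] = np.interp(u[:, j], summary["probs"], summary["quantiles"][:, j])
    return out


# Usage on synthetic two-variable data (hypothetical stand-in for a block).
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = np.exp(x + 0.3 * rng.normal(size=5000))  # non-Gaussian, dependent on x
block = np.column_stack([x, y])

summary = summarize_block(block)
samples = sample_from_summary(summary, 1000, rng)
```

In this sketch the summary holds 33 quantiles per variable plus one small correlation matrix, regardless of how many points the block contains, which is the kind of storage reduction the paragraph above describes. Separating marginals from dependence is what lets each variable keep its own, possibly non-Gaussian, distribution.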
In many cases, scientists execute their simulations mu...
Committee: Han-Wei Shen (Advisor); Rephael Wenger (Committee Member); Yusu Wang (Committee Member)
Subjects: Computer Science; Statistics