Files
dissertation.pdf (24.73 MB)
In Situ Summarization and Visual Exploration of Large-scale Simulation Data Sets
Author Info
Dutta, Soumya
ORCID® Identifier: http://orcid.org/0000-0001-5030-9979
Permalink: http://rave.ohiolink.edu/etdc/view?acc_num=osu1524070976058567
Year and Degree
2018, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Abstract
Recent advances in computing power have enabled application scientists to design simulation studies using very high-resolution computational models. The output data from such simulations provide a plethora of information that needs to be explored for an enhanced understanding of the underlying phenomena. Large-scale simulations now produce multivariate, time-varying data sets on the order of petabytes and beyond. Traditional post-processing analysis of raw data is not readily applicable, since storing all the data is becoming prohibitively expensive. This is because output data size and I/O have become a bottleneck relative to ever-increasing computing speed. Hence, exploration and visualization of such extreme-scale simulation outputs pose significant challenges. This dissertation addresses these issues and suggests an alternative pathway by enabling in situ analysis, i.e., in-place analysis of data while it still resides in supercomputer memory. We embrace in situ technology and adopt simulation-time data analysis, triage, and summarization using various data transformation techniques. The proposed methods process data as the simulation generates it and employ different analysis techniques to extract important data properties efficiently. However, the amount of work that can be done in situ is often limited in terms of time and storage, since overburdening the simulation with additional computation is undesirable. Furthermore, while some application-domain-driven analyses fit well in an in situ environment, a wide range of visual-analytics tasks require more time and involve iterative exploration during post-processing. To this end, we conduct in situ statistical data summarization in the form of compact probability distribution functions, which preserve essential statistical data properties and facilitate flexible and scalable post-hoc exploration.
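The idea of replacing raw data with compact distribution summaries can be illustrated with a minimal sketch. It assumes the simplest possible summary, a fixed-range histogram per block of a one-dimensional field; the abstract does not specify the distribution models actually used, and the function name `block_histograms` and all of its parameters are hypothetical.

```python
from typing import List, Sequence


def block_histograms(values: Sequence[float], block_size: int, num_bins: int,
                     lo: float, hi: float) -> List[List[int]]:
    """Summarize a 1D field as one fixed-range histogram per block.

    Illustrative assumption only: the dissertation's own distribution
    models are not described in the abstract.
    """
    if block_size <= 0 or num_bins <= 0 or hi <= lo:
        raise ValueError("block_size and num_bins must be positive and hi > lo")
    width = (hi - lo) / num_bins
    summaries: List[List[int]] = []
    for start in range(0, len(values), block_size):
        counts = [0] * num_bins
        for v in values[start:start + block_size]:
            # Clamp so that v == hi and out-of-range values land in an edge bin.
            idx = min(max(int((v - lo) / width), 0), num_bins - 1)
            counts[idx] += 1
        summaries.append(counts)
    return summaries


# Hypothetical data standing in for one time step of a simulation variable.
field = [0.0, 0.1, 0.9, 1.0, 0.5, 0.5, 0.2, 0.8]
summary = block_histograms(field, block_size=4, num_bins=2, lo=0.0, hi=1.0)
```

In this sketch only `summary` would be written to disk, so storage scales with the number of blocks and bins rather than with the number of grid points.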
We show that the reduced statistical data summaries can serve as a replacement for the raw data and support important tasks such as feature detection, extraction, and tracking. To study the potential of the proposed data summaries, in one application we demonstrate that, by using the statistical data summaries, complex features such as vortices and the eye of a hurricane can be extracted and tracked over time. In another scenario, we validate that, by employing statistical anomaly-based analysis on the summary data, we can detect the development of flow instabilities in a high-resolution computational fluid dynamics (CFD) simulation. We demonstrate that, using the statistical distribution-based data summaries, various information-theoretic measures can be estimated that enhance multivariate time-varying feature exploration by quantifying the importance of scalar values and scalar value combinations. Furthermore, to reduce the workload of post-hoc analysis when studying the parameter sensitivity of high-resolution simulations under various input parameter settings, we devise techniques for in situ feature classification that significantly reduce the memory footprint of the post-hoc analysis. The proposed technique learns feature-specific properties using a fuzzy rule-based algorithm in an initial off-line analysis stage on known simulation data and performs feature classification in situ for other, unknown simulation runs. For each unknown simulation, we store only a small amount of feature-specific summarized data, which is visualized, compared, and analyzed interactively in the post-hoc exploration.
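As a rough illustration of how information-theoretic measures can be estimated from distribution summaries alone, the sketch below computes Shannon entropy from a histogram and mutual information from a joint histogram of two variables. These are standard textbook definitions chosen as an assumption; the abstract does not name the specific measures used in the dissertation, and the function names are hypothetical.

```python
import math
from typing import Sequence


def entropy(counts: Sequence[int]) -> float:
    """Shannon entropy in bits of the distribution given by histogram counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def mutual_information(joint: Sequence[Sequence[int]]) -> float:
    """Mutual information in bits of two variables from a joint histogram."""
    total = sum(sum(row) for row in joint)
    if total == 0:
        return 0.0
    row_sums = [sum(row) for row in joint]
    col_sums = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, c in enumerate(row):
            if c > 0:
                # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts.
                mi += (c / total) * math.log2(c * total / (row_sums[i] * col_sums[j]))
    return mi


# Hypothetical summaries: a uniform two-bin histogram and two joint histograms.
h_uniform = entropy([2, 2])
mi_dependent = mutual_information([[2, 0], [0, 2]])
mi_independent = mutual_information([[1, 1], [1, 1]])
```

Both functions read only bin counts, so they can run during post-hoc exploration without access to the raw simulation output.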
Committee
Han-Wei Shen (Advisor)
Pages
217 p.
Subject Headings
Computer Engineering; Computer Science
Keywords
In situ analysis; big data visualization; time-varying data analysis; feature analysis and tracking; probabilistic data modeling and summarization; high performance computing; visual data exploration; computer graphics
Recommended Citations
Citations
Dutta, S. (2018). In Situ Summarization and Visual Exploration of Large-scale Simulation Data Sets [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1524070976058567
APA Style (7th edition)
Dutta, Soumya. In Situ Summarization and Visual Exploration of Large-scale Simulation Data Sets. 2018. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1524070976058567.
MLA Style (8th edition)
Dutta, Soumya. "In Situ Summarization and Visual Exploration of Large-scale Simulation Data Sets." Doctoral dissertation, Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1524070976058567
Chicago Manual of Style (17th edition)
Document number: osu1524070976058567
Download Count: 485
Copyright Info
© 2018, some rights reserved.
In Situ Summarization and Visual Exploration of Large-scale Simulation Data Sets by Soumya Dutta is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Based on a work at etd.ohiolink.edu.
This open access ETD is published by The Ohio State University and OhioLINK.