
Search Results

(Total results 28)


  • 1. Thapa, Badal Reliability Analysis of Linear Dynamic Systems by Importance Sampling-Separable Monte Carlo Technique

    Master of Science, University of Toledo, 2020, Mechanical Engineering

    For many problems, especially nonlinear systems, reliability assessment must be done in the time domain. Monte Carlo simulation (MCS) can accurately assess the reliability of a system, but its computational cost is prohibitive for complex dynamic systems. The Importance Sampling (IS) method is more efficient than standard MCS for reliability assessment. It has been applied to dynamic systems whose excitation is defined by a Power Spectral Density (PSD) function. The central idea of IS is to generate sample time histories using a sampling PSD and to attach a likelihood ratio to each replication, yielding an unbiased estimator of the probability of failure. Another method more efficient than MCS for the reliability assessment of dynamic systems is the Separable Monte Carlo (SMC) method, which has so far been applied to linear dynamic systems as follows. It starts by drawing frequencies from the PSD of the excitation, calculating the system response to each frequency, and storing the results in a database. The stored frequencies and their respective responses are then chosen randomly with replacement for each replication to obtain the system response to the linear combination of the corresponding sinusoidal functions. SMC can therefore assess the reliability of the system given a proper database, whose size depends on the shape of the PSD function and the complexity of the system. This research proposes a new method that combines IS with SMC to assess the reliability of linear dynamic systems. In this method, a database formed using a sampling PSD is used to estimate the reliability of the system for the true spectrum. The proposed method is more efficient than either the IS or the SMC method individually in terms of both computational time and accuracy, and is demonstrated using a 10-bar truss.

    Committee: Mohammad Elahinia (Committee Chair); Mahdi Norouzi (Committee Co-Chair); Shawn P. Capser (Committee Member) Subjects: Mechanical Engineering
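
    The likelihood-ratio reweighting at the core of the IS estimator summarized above can be illustrated with a toy example. The sketch below is not the thesis' IS-SMC implementation; the limit-state function, the true and sampling distributions, and the sample size are all hypothetical, and it only shows how replications drawn from a sampling density are reweighted to give an unbiased estimate of the failure probability.

```python
# Minimal sketch (hypothetical limit state and distributions): importance sampling
# for a failure probability P[g(X) < 0]. Samples are drawn from a shifted "sampling"
# density q and reweighted back to the true density p via the likelihood ratio p/q.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def g(x):
    # Toy limit-state function: failure when g(x) < 0, i.e. when x > 4.
    return 4.0 - x

p = stats.norm(loc=0.0, scale=1.0)   # true input distribution
q = stats.norm(loc=3.0, scale=1.5)   # sampling distribution shifted toward the failure region

n = 50_000
x = q.rvs(size=n, random_state=rng)
w = p.pdf(x) / q.pdf(x)              # likelihood ratio attached to each replication
pf_is = np.mean((g(x) < 0) * w)      # unbiased importance-sampling estimate

print(f"IS estimate: {pf_is:.3e}, exact: {p.sf(4.0):.3e}")
```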
  • 2. Duan, Li Bayesian Nonparametric Methods with Applications in Longitudinal, Heterogeneous and Spatiotemporal Data

    PhD, University of Cincinnati, 2015, Arts and Sciences: Mathematics (Statistics)

    Nonparametric methods provide flexible accommodation of the different structures in the data without imposing strong assumptions. Bayesian treatment of nonparametric models such as the Gaussian process, the Dirichlet process, and decision trees enables straightforward computation and automatic regularization. In this dissertation, we developed three novel nonparametric methods for handling different types of data. For longitudinal and time-to-event data, we utilized the multiple-subject composition and repeated measurements, and designed a hierarchical Gaussian process that enables extrapolation for personalized forecasts. For the regression of heterogeneous data, we combined the clustering properties of the Dirichlet process with the nonlinear incorporation of predictors in decision trees, and developed an efficient method for handling heterogeneity and ensemble estimation. For spatiotemporal data, we first designed a performant algorithm for the stationary Gaussian process, and then extended it to allow for the non-stationarity and non-Gaussianity of complex data. We demonstrate the advantages of Bayesian modeling in latent variable sampling, missing data handling, algorithm acceleration, accurate prediction, and probabilistic interpretation.

    Committee: Xia Wang Ph.D. (Committee Chair); Emily Kang Ph.D. (Committee Member); Siva Sivaganesan Ph.D. (Committee Member); Seongho Song Ph.D. (Committee Member); Rhonda VanDyke Ph.D. (Committee Member) Subjects: Statistics
  • 3. Erich, Roger Regression Modeling of Time to Event Data Using the Ornstein-Uhlenbeck Process

    Doctor of Philosophy, The Ohio State University, 2012, Biostatistics

    In this research, we develop innovative regression models for survival analysis that model time-to-event data using a latent health process which stabilizes around an equilibrium point, a characteristic often observed in biological systems. Regression modeling in survival analysis is typically accomplished using Cox regression, which requires the assumption of proportional hazards. An alternative model, which does not require proportional hazards, is the First Hitting Time (FHT) model, in which a subject's health is modeled using a latent stochastic process. In this modeling framework, an event occurs once the process hits a predetermined boundary. The parameters of the process are related to covariates through generalized link functions, thereby providing regression coefficients with clinically meaningful interpretations. In this dissertation, we present an FHT model based on the Ornstein-Uhlenbeck (OU) process, a modified Wiener process that drifts from its starting value toward a state of equilibrium or homeostasis, as is present in many biological applications. We extend previous OU process models to allow the process to change according to covariate values. We also discuss extensions of our methodology to include random effects accounting for unmeasured covariates. In addition, we present a mixture model with a cure rate that uses the OU process to model the latent health status of those subjects susceptible to experiencing the event under study. We apply these methods to survival data collected on melanoma patients and to another survival data set pertaining to carcinoma of the oropharynx.

    Committee: Michael Pennell PhD (Advisor); Thomas Santner PhD (Committee Member); Dennis Pearl PhD (Committee Member) Subjects: Biostatistics; Statistics
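
    As a rough illustration of the first-hitting-time idea described above, the sketch below simulates an Ornstein-Uhlenbeck latent health process and records when it first crosses a failure boundary. All parameter values are made up, and the regression structure of the dissertation (covariates entering the OU parameters through link functions) is omitted.

```python
# Minimal sketch (made-up parameters): simulate an Ornstein-Uhlenbeck latent health
# process and record the first time it hits a failure boundary (the "event time").
import numpy as np

rng = np.random.default_rng(1)

def ou_first_hitting_time(x0=3.0, mu=3.0, theta=0.5, sigma=2.0,
                          boundary=0.0, dt=0.01, t_max=30.0):
    """Euler-Maruyama simulation of dX = theta*(mu - X) dt + sigma dW.
    Returns the first time X drops to `boundary`, or np.inf if censored at t_max."""
    x, t = x0, 0.0
    while t < t_max:
        x += theta * (mu - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        t += dt
        if x <= boundary:
            return t
    return np.inf  # censored: no event within follow-up

times = np.array([ou_first_hitting_time() for _ in range(300)])
observed = np.isfinite(times)
print(f"event rate: {observed.mean():.2f}, "
      f"median observed event time: {np.median(times[observed]):.2f}")
```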
  • 4. Li, Haoyu Efficient Visualization for Machine-Learning-Represented Scientific Data

    Doctor of Philosophy, The Ohio State University, 2024, Computer Science and Engineering

    Recent progress in high-performance computing allows researchers to run extremely high-resolution computational models, simulating detailed physical phenomena. Yet, efficiently analyzing and visualizing the extensive data from these simulations is challenging. Adopting machine learning models to reduce the storage cost of, or to extract salient features from, large scientific data has proven to be a successful approach to analyzing and visualizing these datasets effectively. Machine learning (ML) models like neural networks and Gaussian process models are powerful tools for data representation. They can capture the internal structures or "features" of the dataset, which is useful for compressing the data or exploring the subset of data that is of interest. However, applying machine learning models to scientific data brings new challenges to visualization. Machine learning models are usually computationally expensive: neural networks are expensive to reconstruct on a dense grid representing a high-resolution scalar field, and Gaussian processes are notorious for their cubic time complexity in the number of data points. If we consider other variables in the data modeling, for example the time dimension and the simulation parameters in ensemble data, the curse of dimensionality makes the computation cost even higher. The long inference time of machine learning models puts us in a dilemma between the high storage cost of the original data representation and the high computation cost of the machine learning representation. The above challenges demonstrate a great need for techniques and algorithms that increase the speed of ML model inference. Despite many generic efforts to increase ML efficiency, for example using better hardware acceleration or designing more efficient architectures, we tackle the more specific problem of how to query the ML model more efficiently for a specific scientific visualization task. In this dissertation, we c (open full item for complete abstract)

    Committee: Han-Wei Shen (Advisor); Hanqi Guo (Committee Member); Raphael Wenger (Committee Member) Subjects: Computer Engineering; Computer Science
  • 5. Li, Youjun Semiparametric and Nonparametric Model Approaches to Causal Mediation Analysis for Longitudinal Data

    Doctor of Philosophy, Case Western Reserve University, 2024, Epidemiology and Biostatistics

    There has been a lack of causal mediation analysis methods developed for complex longitudinal data. Most existing work focuses on extensions of parametric models that have been well developed for causal mediation analysis of cross-sectional data. To better handle complex, including irregular, longitudinal data, our approach takes advantage of the flexibility of penalized splines and performs causal mediation analysis under the structural equation model framework. The incorporation of penalized splines allows us to deal with repeated measures of the mediator and the outcome that are not all recorded at the same time points. The penalization avoids otherwise difficult choices in selecting knots and prevents the splines from overfitting, so that predictions for future time points are more reasonable. We also provide the formula for identifying the natural direct and indirect effects based on our semiparametric models, whose inference is carried out by the delta method and Monte Carlo approximation. This frequentist approach can be straightforward and efficient when implemented under the linear mixed model (LMM) framework, but it sometimes faces convergence problems, as the random-effects components introduce complications for the optimization algorithms commonly used in statistical software. Although Bayesian modeling under the LMM is less likely to face convergence problems with the help of Markov chain Monte Carlo (MCMC) sampling, it can be computationally expensive compared to the frequentist approach due to the nature of the MCMC algorithm. As an alternative Bayesian approach, Gaussian process regression (GPR) also has the flexibility to fit various data patterns and is more efficient than Bayesian modeling using MCMC, since the posterior distribution in GPR has a known form from which posterior samples can be drawn directly. We thus attempt to extend the standard GPR approach to allow multiple covariates of both continuous and categorical (open full item for complete abstract)

    Committee: Pingfu Fu (Committee Chair); David Aron (Committee Member); Mark Schluchter (Committee Member); Jeffrey Albert (Advisor) Subjects: Biostatistics; Statistics
  • 6. Flory, Joseph Practical Methods for Bayesian Optimization with Input-Dependent Noise

    Master of Science, The Ohio State University, 2023, Chemical Engineering

    Decision making and optimization are core aspects of many real-world engineering problems, ranging from process optimization to experimental design. Many of these systems are black-box and expensive to evaluate, which makes many traditional optimization methods difficult to apply. Additionally, many systems exhibit heteroskedastic noise when collecting data, which many optimization strategies cannot account for. Bayesian optimization has successfully used surrogate models to create an easier-to-optimize representation of the system that can be updated by introducing new samples. Bayesian optimization requires surrogate models, most commonly Gaussian processes, to model the sampled data, and acquisition functions to find optimal locations for sampling new points. Traditional Gaussian processes are unable to model heteroskedastic noise in data, which led to the development of heteroskedastic Gaussian processes (HGPs). These HGPs are capable of properly accounting for noise in the data and can make more accurate predictions in regions without samples. Acquisition functions, however, have difficulty handling noise, and the one most capable of handling it, the knowledge gradient, is difficult to optimize and evaluate. This thesis focuses on a new method for implementing the knowledge gradient and on using the knowledge gradient for enhanced decision making. To ensure global optimality of the knowledge gradient function, grid-based methods generally must be used, which are inefficient and leave gaps in the sampling space. A new method, the neural network knowledge gradient (NNKG), uses randomly generated initial sampling data to explore the sample space more efficiently and interpolate between samples. Compared to the traditional method, this approach also allows for enhanced visualization of the knowledge gradient surface, which allows for greater understanding of regions of value and enhanced decision making on where to sample next (open full item for complete abstract)

    Committee: Bhavik Bakshi (Committee Member); Joel Paulson (Advisor) Subjects: Chemical Engineering
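
    The surrogate-plus-acquisition loop sketched in this abstract can be written compactly. The example below is a simplified stand-in rather than the thesis' NNKG method: it uses expected improvement instead of the knowledge gradient, mimics input-dependent noise only through per-sample noise variances passed to a standard GP, and the objective function and all settings are invented.

```python
# Simplified sketch (not the thesis' method): a Bayesian-optimization loop with a GP
# surrogate given per-sample noise variances (a crude stand-in for heteroskedastic
# noise), using expected improvement as the acquisition function for minimization.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(2)

def objective(x):
    # Hypothetical expensive black box whose noise grows with x.
    noise_sd = 0.05 + 0.3 * x
    y = np.sin(3 * x) + x**2 - 0.7 * x + noise_sd * rng.standard_normal(x.shape)
    return y, noise_sd

X = rng.uniform(0, 1, size=(5, 1))            # initial design
y, sd = objective(X[:, 0])
grid = np.linspace(0, 1, 200).reshape(-1, 1)  # candidate sampling locations

for _ in range(15):
    gp = GaussianProcessRegressor(kernel=RBF(0.2), alpha=sd**2, normalize_y=True)
    gp.fit(X, y)
    mu, s = gp.predict(grid, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(s, 1e-9)
    ei = (best - mu) * norm.cdf(z) + s * norm.pdf(z)   # expected improvement
    x_next = grid[np.argmax(ei)]                       # next point to evaluate
    y_next, sd_next = objective(x_next)
    X = np.vstack([X, x_next.reshape(1, 1)])
    y = np.append(y, y_next)
    sd = np.append(sd, sd_next)

print(f"best observed value: {y.min():.3f} at x = {X[np.argmin(y), 0]:.3f}")
```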
  • 7. Yang, Gang Emulators and Uncertainty Quantification for High Dimensional Complex Models with Applications in Remote Sensing

    PhD, University of Cincinnati, 2022, Arts and Sciences: Mathematical Sciences

    The retrieval algorithms in remote sensing generally involve complex physical forward models that are nonlinear and computationally expensive to evaluate. Statistical emulation provides a computationally cheap alternative that can be used to quantify uncertainty, calibrate model parameters, and improve the computational efficiency of the retrieval algorithms. Motivated by this, in this thesis we introduce a framework for building statistical emulators by combining dimension reduction of the input and output spaces with Gaussian process modeling. Functional principal component analysis (FPCA) via a conditional expectation method is chosen to reduce the dimension of the output space of the forward model. In addition, the gradient-based kernel dimension reduction (gKDR) method is applied to reduce the dimension of the input space when the gradients of the complex forward model are unavailable or computationally prohibitive. A Gaussian process with feasible computation is then constructed on the low-dimensional input and output spaces. Theoretical properties of the resulting statistical emulator are explored, and the proposed method is illustrated by application to NASA's Orbiting Carbon Observatory-2 (OCO-2) data. Though the Gaussian process emulator provides accurate prediction, it loses computational efficiency when the dataset used to train the emulator is large and/or the inputs and outputs of the model are high-dimensional. Dimension reduction of the input is often required. In satellite remote sensing, the quantity of interest (QOI) (e.g., the atmospheric state) is inferred from observable radiance spectra. OCO-2's primary QOI is the column-averaged dry-air mole fraction of CO2, which is among the key state variables included in the input state vector. Lowering the dimension of the input when building the emulator can lead to a loss of information and interpretability, and can negatively affect the estimation of atmospheric CO2 in the OCO-2 application. To avoid dimension reduction in i (open full item for complete abstract)

    Committee: Emily Kang Ph.D. (Committee Member); Seongho Song Ph.D. (Committee Member); Bledar Konomi Ph.D. (Committee Member); Won Chang Ph.D. (Committee Member) Subjects: Statistics
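
    The dimension-reduction-plus-GP emulation strategy can be illustrated with ordinary PCA standing in for the FPCA and gKDR steps used in the thesis. The forward model, parameter ranges, and number of retained components below are hypothetical; the point is only the pattern of fitting one GP per low-dimensional output score and mapping predictions back to the full output.

```python
# Minimal sketch (hypothetical forward model): emulate a multi-output simulator by
# reducing its output curves with PCA and fitting an independent GP to each retained
# component score, then reconstructing the full output for a new input.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)

def forward_model(theta, wavelengths):
    # Stand-in for an expensive simulator: returns a "spectrum" for each input theta.
    return np.exp(-np.outer(theta[:, 0], wavelengths)) + theta[:, 1:2] * np.sin(wavelengths)

wavelengths = np.linspace(0.1, 5.0, 300)
theta_train = rng.uniform(0, 1, size=(80, 2))
Y_train = forward_model(theta_train, wavelengths)      # (80, 300) output curves

pca = PCA(n_components=3).fit(Y_train)
scores = pca.transform(Y_train)                        # low-dimensional output space

kernel = RBF(length_scale=[0.3, 0.3]) + WhiteKernel(1e-6)
gps = [GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(theta_train, scores[:, j])
       for j in range(scores.shape[1])]                # one GP per component score

theta_new = np.array([[0.4, 0.7]])
pred_scores = np.array([gp.predict(theta_new)[0] for gp in gps])
spectrum_hat = pca.inverse_transform(pred_scores.reshape(1, -1))  # back to full output
truth = forward_model(theta_new, wavelengths)
print(f"max abs emulation error: {np.max(np.abs(spectrum_hat - truth)):.3e}")
```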
  • 8. Cheng, Si Hierarchical Nearest Neighbor Co-kriging Gaussian Process For Large And Multi-Fidelity Spatial Dataset

    PhD, University of Cincinnati, 2021, Arts and Sciences: Mathematical Sciences

    Spatial datasets with varying fidelity are often obtained from different platforms in remote sensing. A single composite feature that includes adequate information from multiple data sources is preferred for statistical inference. Modeling multi-fidelity datasets usually encounters the challenges of computational complexity and complicated correlation structure, due to the large number of observations over spatial areas that may or may not overlap or share the same spatial footprints. Few previous studies provide a comprehensive solution to these challenges. This dissertation develops a hierarchical nearest neighbor co-kriging Gaussian process (NNCGP) for the analysis of large, irregularly spaced, multi-fidelity spatial datasets. The dissertation also proposes efficient algorithms for the NNCGP method that improve performance and further reduce computation. Simulation studies demonstrate that the proposed NNCGP models are capable of providing more reliable statistical inference, improved prediction performance, and reduced running time compared to classical models. The methods are also applied to high-resolution infrared radiation sounder (HIRS) datasets gathered daily from two polar orbiting satellite series (POES) of the National Oceanic and Atmospheric Administration (NOAA).

    Committee: Bledar Konomi Ph.D. (Committee Chair); Won Chang Ph.D. (Committee Member); Emily Kang Ph.D. (Committee Member); Siva Sivaganesan Ph.D. (Committee Member) Subjects: Statistics
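
    The computational idea behind nearest-neighbor GP methods, conditioning each observation on a small set of nearby points instead of on the full dataset, can be sketched as follows. This is a generic Vecchia-type approximation, not the NNCGP model itself; the covariance function, field simulation, and neighbor count are purely illustrative.

```python
# Minimal sketch (illustrative only): a Vecchia / nearest-neighbor GP log-likelihood,
# where each observation is conditioned on its m nearest previously-ordered neighbors
# instead of on all other points, giving roughly O(n m^3) work instead of O(n^3).
import numpy as np

rng = np.random.default_rng(4)

def exp_cov(d, sill=1.0, range_par=0.3, nugget=1e-6):
    return sill * np.exp(-d / range_par) + nugget * (d == 0)

# Simulate a small spatial field so the sketch is self-contained.
n = 500
coords = rng.uniform(0, 1, size=(n, 2))
D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
z = np.linalg.cholesky(exp_cov(D)) @ rng.standard_normal(n)

def nn_gp_loglik(z, D, m=10):
    ll = 0.0
    for i in range(len(z)):
        nb = np.argsort(D[i, :i])[:m]            # nearest earlier-ordered neighbors
        c_ii = exp_cov(0.0)
        if nb.size:
            C_nn = exp_cov(D[np.ix_(nb, nb)])
            c_in = exp_cov(D[i, nb])
            w = np.linalg.solve(C_nn, c_in)
            mean, var = w @ z[nb], c_ii - w @ c_in
        else:
            mean, var = 0.0, c_ii
        ll += -0.5 * (np.log(2 * np.pi * var) + (z[i] - mean) ** 2 / var)
    return ll

print(f"nearest-neighbor GP log-likelihood: {nn_gp_loglik(z, D):.1f}")
```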
  • 9. Su, Weiji Flexible Joint Hierarchical Gaussian Process Model for Longitudinal and Recurrent Event Data

    PhD, University of Cincinnati, 2020, Arts and Sciences: Mathematical Sciences

    Jointly modeling two types of longitudinal markers makes optimal use of the available information and serves to investigate the joint evolution of the two processes, to examine the underlying association and to evaluate the surrogate markers simultaneously. In this dissertation, we develop a series of joint models for the longitudinal repeated measurements, including continuous, repeated binary, and time to recurrent event data. Our goal is to extend the joint model with more flexibility via parametric, semi-parametric and nonparametric methods to capture various features in the data. A hierarchical Gaussian process is incorporated into the proposed joint model framework to explain the characteristics of both population and personalized variation and provide dynamic predictions. In analyzing the longitudinal continuous and repeated binary data, we incorporate a family of parametric link functions into the proposed joint model to obtain flexibility in handling skewness in the probability response curves. In jointly modeling the longitudinal and recurrent time-to-event data, we utilize both semi-parametric and non-parametric methods to monitor the non-linearity in population evolution and heterogeneity. Furthermore, we exhibit the application of the proposed joint model in examining the impact of various risk factors. We employ Bayesian approaches in model construction and estimation. The proposed models are compared with existing joint modeling approaches. Particularly, we incorporate the idea of likelihood decomposition and develop the model comparison criterion to facilitate performance assessment of each submodel separately. We carry out extensive simulation studies in each of the models we proposed. The purpose of the simulation studies is to show the properties, implementation, performance as well as potential problems of the proposed models compared with existing methods. Joint modeling is of particular importance in clinical studies. Our real data (open full item for complete abstract)

    Committee: Xia Wang Ph.D. (Committee Chair); Xuan Cao Ph.D. (Committee Member); Won Chang Ph.D. (Committee Member); Siva Sivaganesan Ph.D. (Committee Member); Rhonda Szczesniak Ph.D. (Committee Member) Subjects: Statistics
  • 10. Shende, Sourabh Bayesian Topology Optimization for Efficient Design of Origami Folding Structures

    MS, University of Cincinnati, 2020, Engineering and Applied Science: Mechanical Engineering

    Bayesian optimization (BO) is a popular method for solving optimization problems involving expensive objective functions. Although BO has been applied across various fields, its use in structural optimization is in its early stages. Origami folding structures provide a complex design space where the use of an efficient optimizer is critical. In this research work, the ability of BO to solve origami-inspired design problems is demonstrated for the first time. A Gaussian process (GP) is used as the surrogate model, trained to mimic the response of the expensive finite element (FE) objective function. The ability of this BO-FE framework to find optimal designs is verified by applying it to two well-known origami design problems: chomper and twist chomper. The performance of the proposed approach is compared to traditional gradient-based optimization techniques and genetic algorithm methods in terms of ability to discover designs, computational efficiency, and robustness. BO has many user-defined components and parameters, and intuition for these in structural optimization is currently limited. In this work, the role of hyperparameter tuning and the sensitivity of Bayesian optimization to the quality and size of the initial training set are studied. Taking a holistic view of the computational expense, various heuristic approaches are proposed to reduce the overall cost of optimization. A methodology to include derivative information of the objective function in the formulation of the GP surrogate is described, and its advantages and disadvantages are discussed. Additionally, an anisotropic GP surrogate model with independent length scales for each design variable is studied. A procedure to reduce the overall dimension of the problem using information from anisotropic models is proposed. The results show that Bayesian optimization is an efficient and robust alternative to traditional methods. It allows for the discovery of optimal designs using f (open full item for complete abstract)

    Committee: Kumar Vemaganti Ph.D. (Committee Chair); Philip Buskohl Ph.D. (Committee Member); Manish Kumar Ph.D. (Committee Member) Subjects: Mechanical Engineering
  • 11. Anil, Vijay Sankar Mission-based Design Space Exploration and Traffic-in-the-Loop Simulation for a Range-Extended Plug-in Hybrid Delivery Vehicle

    Master of Science, The Ohio State University, 2020, Mechanical Engineering

    With the ongoing electrification and data-intelligence trends in the logistics industry, enabled by advances in powertrain electrification and in connected and autonomous vehicle technologies, the traditional way vehicles are designed, based on engineering experience and sales data, is being updated with a design-for-operation notion that relies intensively on operational data collection and large-scale simulations. In this work, this design-for-operation notion is revisited with a specific combination of optimization and control techniques that promises accurate results with relatively fast computational time. The specific application explored here is a Class 6 pick-up and delivery truck that is limited to a given driving mission. A Gaussian Process (GP) based statistical learning approach is used to refine the search for the most accurate, optimal designs. Five hybrid powertrain architectures are explored, and a set of Pareto-optimal designs is found for a specific driving mission that represents the variations in a hypothetical operational scenario. A cross-architecture performance and cost comparison is performed, and the selected architecture is developed further in the form of a forward simulator with a dedicated ECMS controller. In the end, a traffic-in-the-loop simulation is performed by integrating the selected powertrain architecture with a SUMO traffic simulator to evaluate the performance of the developed controller against varying driving conditions.

    Committee: Giorgio Rizzoni (Advisor); Qadeer Ahmed (Committee Member) Subjects: Automotive Engineering; Engineering; Mechanical Engineering; Sustainability; Systems Design; Transportation
  • 12. Ricciardi, Denielle Uncertainty Quantification and Propagation in Materials Modeling Using a Bayesian Inferential Framework

    Doctor of Philosophy, The Ohio State University, 2020, Materials Science and Engineering

    In the past several decades, there has been an unprecedented demand for the discovery and design of new materials to support rapidly advancing technology. This demand has fueled a push for Integrated Computational Materials Engineering (ICME), an engineering approach whereby model linkages as well as experimental and computational integration are exploited in order to efficiently explore materials processing-to-performance relationships. Tailored simulations allow for the reduction of expensive and lengthy experiments, emphasizing the need to establish, in a principled way, statistical confidence in component designs and manufacturing processes from the simulations rather than from experiments. Since many materials models and simulations are deterministic in nature, the use of sophisticated tools and techniques is required. Achieving statistical confidence in a simulation output requires, first, the identification of the various sources of error and uncertainty affecting the simulation results. These sources include machine and user error in collecting calibration data, uncertain model parameters, random error from natural processes, and model inadequacy in capturing the true material property or behavior. Statistical inference can then be used to recover information about unknown model parameters by conditioning on available data while taking into account the various sources of uncertainty. In this work, Bayesian inference is used to quantify and propagate uncertainty in simulations of material behavior. More specifically, a random-effects hierarchical framework is used, since it provides a way to account for uncertainty stemming from random natural processes or conditions. This is especially important in many materials modeling applications where the random microstructure plays an important role in dictating material behavior. In addition to this, in many cases experiments are quite costly, so in order to obtain sufficient data for calibration, a compilation (open full item for complete abstract)

    Committee: Stephen Niezgoda (Advisor); Oksana Chkrebtii (Committee Member); Yunzhi Wang (Committee Member); Alan Luo (Committee Member) Subjects: Materials Science; Statistics
  • 13. Melendez, Jordan Effective Field Theory Truncation Errors and Why They Matter

    Doctor of Philosophy, The Ohio State University, 2020, Physics

    All models are wrong, but some are useful. Still fewer are wrong in a way that is useful. This thesis shows that effective field theories (EFTs) are not only powerful predictive tools, but also contain the ingredients necessary to estimate their own imperfection. By developing a Bayesian model of EFT discrepancy, we provide the first statistically rigorous accounting of chiral EFT uncertainties in nuclear systems. These physics-based uncertainty estimates are shown to reduce bias in the extraction of nuclear parameters known as low-energy constants, and even to predict regimes where the EFT falls apart completely: its breakdown scale. This formalism is shown to be applicable in a wide range of circumstances; it can provide uncertainties for neutrinoless double beta decay, model the interior of neutron stars, uncover the constraints on chiral 3-body forces, extract scattering information from bound states via an improved Busch formula, and can even provide guidance on the intelligent design of experiments informed by EFTs.

    Committee: Dick Furnstahl (Advisor) Subjects: Physics
  • 14. Kacker, Shubhra The Role of Constitutive Model in Traumatic Brain Injury Prediction

    MS, University of Cincinnati, 2019, Engineering and Applied Science: Mechanical Engineering

    Traumatic brain injury is a major cause of fatalities in the United States. To understand the underlying mechanics and develop protective gear, computational methods such as finite element analysis of the human head have been used extensively in the past. In modeling such a complex biomechanical phenomenon, constitutive equations describing the material behavior are used. Due to inevitable variability in soft tissue experimental data, uncertainty in the material parameters is observed. To model this variability accurately, the uncertainty in the material parameters needs to be propagated to quantities of interest in the simulation, such as injury criteria. This in turn requires the proper selection of the constitutive models. This work gives insight into the role of these constitutive models in traumatic brain injury prediction. In this thesis, a Bayesian framework is used for the estimation of material model parameters based on the nested sampling algorithm MULTINEST. A non-linear visco-hyperelastic material model is considered for brain tissue and is implemented in the finite element software LS-DYNA. Various hyperelastic models are considered to understand the role of these models in traumatic brain injury prediction. Finite element analysis of the SIMon human head model is performed to simulate an impact loading causing traumatic brain injury, and a maximum-principal-strain-based injury criterion is considered to quantify the severity of the injury. To non-intrusively propagate the uncertainty in the material parameters, a Gaussian process surrogate model is used in order to avoid high computational expense. Based on this, the distribution of the injury criterion is obtained for all the material models considered. The results obtained show that the constitutive model plays a major role in propagating the uncertainty to the injury criterion. Some material models are very sensitive to the material parameters compared to others. The analyst should be extra cautious w (open full item for complete abstract)

    Committee: Kumar Vemaganti Ph.D. (Committee Chair); Woo Kyun Kim Ph.D. (Committee Member); Sandeep Madireddy Ph.D. (Committee Member) Subjects: Mechanical Engineering
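
    The non-intrusive propagation step described above can be sketched generically: fit a GP surrogate on a small number of expensive runs, then push many draws of the material parameters through the surrogate to obtain the distribution of the quantity of interest. The stand-in "simulation", parameter names, and distributions below are hypothetical placeholders, not the LS-DYNA/SIMon setup of the thesis.

```python
# Minimal sketch (hypothetical names and values): propagate material-parameter
# uncertainty to an injury-criterion-like quantity by replacing the expensive
# simulation with a cheap GP surrogate trained on a small design of runs.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(5)

def expensive_simulation(params):
    # Stand-in for a finite-element run: maps (stiffness-like, viscosity-like)
    # parameters to a scalar strain-based criterion.
    mu, eta = params[:, 0], params[:, 1]
    return 0.3 / np.sqrt(mu) + 0.05 * np.log1p(eta)

# A small design of "expensive" runs trains the surrogate...
design = rng.uniform([0.5, 1.0], [3.0, 10.0], size=(30, 2))
qoi = expensive_simulation(design)
surrogate = GaussianProcessRegressor(kernel=RBF([1.0, 3.0]), normalize_y=True).fit(design, qoi)

# ...then many draws of the uncertain parameters are propagated cheaply.
draws = np.column_stack([rng.normal(1.5, 0.2, 10_000).clip(0.5, 3.0),
                         rng.normal(5.0, 1.0, 10_000).clip(1.0, 10.0)])
qoi_draws = surrogate.predict(draws)
print(f"criterion summary: mean={qoi_draws.mean():.3f}, 95% interval="
      f"({np.quantile(qoi_draws, 0.025):.3f}, {np.quantile(qoi_draws, 0.975):.3f})")
```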
  • 15. Ma, Pulong Hierarchical Additive Spatial and Spatio-Temporal Process Models for Massive Datasets

    PhD, University of Cincinnati, 2018, Arts and Sciences: Mathematical Sciences

    Many geophysical processes evolve in space and time, resulting in complicated data with nonstationary and nonseparable covariance structures and highly complex dynamics. With the advance of new remote-sensing technologies, massive amounts of such data can be collected each day at very high spatial resolutions from satellite instruments. These data are often noisy and irregularly observed, with incompatible supports as well. These challenges require new statistical methods that account for both model flexibility and computational efficiency. In this dissertation, three novel approaches are proposed: 1) a covariance function model that incorporates a low-rank representation and a spatial graphical model, to allow nonstationarity and robust predictive performance, leading to a kriging methodology called the fused Gaussian process (FGP); 2) a downscaling framework based on FGP that carries out conditional simulation to generate high-resolution fields; 3) a low-cost Bayesian inference framework for an additive covariance function model that combines any computational-complexity-reduction method (e.g., low-rank representation) with a separable covariance structure, to allow nonseparability and good predictive performance, leading to another kriging methodology called the additive approximate Gaussian process (AAGP). The methodology in FGP relies on a small set of fixed spatial basis functions and random weights to model the large-scale variation of a nonstationary process, and a Gaussian graphical model to capture the remaining variation. This method is applied to analyze a massive amount of remotely sensed sea surface temperature data. Another important application based on FGP is to generate high-resolution nature runs in global observing system simulation experiments, which have been widely used to guide the development of new observing systems and to evaluate the performance of new data assimilation algorithms. The change-of-support problem is handled exp (open full item for complete abstract)

    Committee: Emily Kang Ph.D. (Committee Chair); Bledar Konomi Ph.D. (Committee Chair); Shan Ba Ph.D. (Committee Member); Won Chang Ph.D. (Committee Member); Siva Sivaganesan Ph.D. (Committee Member) Subjects: Statistics
  • 16. Nguyen, Huong Near-optimal designs for Gaussian Process regression models

    Doctor of Philosophy, The Ohio State University, 2018, Statistics

    The Gaussian process (GP) regression model is a popular modeling framework in both spatial statistics and computer experiments. Our main goal is to find suitable designs for GP model predictions of outputs at unobserved locations; this goal can be interpreted as finding optimal designs that minimize the integrated mean squared prediction error (iMSPE) criterion among all feasible designs. For most problems, there is no analytic solution to the minimization problem, and the minimization step is done by stochastic optimization algorithms configured to search through the set of all feasible designs. At the conclusion of the search, the best design found is recommended. Although the recommended design is unlikely to be a true optimal design, it is expected to be close to one. Nevertheless, the current interpretation of the designs recommended by these algorithms does not include the uncertainty or the risks associated with this assumption (how close the recommended design is to a true optimal design). Most critically, we do not have a direct answer to important questions regarding the quality of the recommended designs (for example, whether the recommended design is significantly better than other feasible designs). In some cases, we even find that, depending on the specification of the problem and the optimization algorithm employed, the recommended designs can be undesirable. In this dissertation, we propose a new design generation and selection framework centered on a near-optimal design (NOD) concept. This new framework considers the natural range of iMSPE among all feasible designs and, subsequently, reports the uncertainty about the quality of the recommended designs. Furthermore, by recognizing the sub-optimality of the majority of designs recommended by a number of stochastic optimization algorithms, we can implement more specific requirements about the minimal acceptable design quality, alleviating (open full item for complete abstract)

    Committee: Peter Craigmile (Advisor); Matthew Pratola (Advisor); Oksana Chkrebtii (Committee Member); Notz Bill (Committee Member) Subjects: Statistics
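
    The iMSPE criterion referred to above can be approximated numerically by averaging the GP's kriging variance over a dense set of prediction locations, which is enough to compare candidate designs. The sketch below uses a squared-exponential kernel with made-up hyperparameters and two arbitrary 20-point designs; it is illustrative and not the dissertation's design-search framework.

```python
# Minimal sketch (made-up kernel hyperparameters): score candidate designs by the
# integrated mean squared prediction error (iMSPE) of a GP, approximated by the
# average kriging variance over a dense grid of prediction locations.
import numpy as np

def kriging_variance(design, grid, length_scale=0.2, sill=1.0, nugget=1e-6):
    """Predictive variance of a zero-mean GP with a squared-exponential kernel."""
    def k(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return sill * np.exp(-0.5 * d2 / length_scale**2)
    K = k(design, design) + nugget * np.eye(len(design))
    Ks = k(grid, design)
    return sill - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)

def imspe(design, grid):
    return kriging_variance(design, grid).mean()

rng = np.random.default_rng(6)
grid = rng.uniform(0, 1, size=(2000, 2))          # integration points over [0,1]^2
random_design = rng.uniform(0, 1, size=(20, 2))   # one candidate: 20 random points
grid_design = np.stack(np.meshgrid(np.linspace(0.1, 0.9, 5),
                                   np.linspace(0.1, 0.9, 4)), -1).reshape(-1, 2)
print(f"iMSPE, random 20-point design: {imspe(random_design, grid):.4f}")
print(f"iMSPE, 5x4 grid design:        {imspe(grid_design, grid):.4f}")
```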
  • 17. Liu, Jinzhong Bayesian Inference for Treatment Effect

    PhD, University of Cincinnati, 2017, Arts and Sciences: Mathematical Sciences

    Evaluation of the overall treatment effect and of heterogeneity in treatment effect is of interest in both randomized clinical trials and observational studies. In this thesis, we first develop a Bayesian approach to subgroup analysis using ANOVA models with multiple covariates. We assume a two-arm clinical trial with a normally distributed response variable. The covariates are assumed categorical and a priori specified. The subgroups of interest are represented by a collection of models, and we use a model selection approach to find subgroups with heterogeneous effects. We then propose a Bayesian semiparametric approach for estimating the population mean treatment effect with observational data using a Gaussian process (GP), which accomplishes matching and modeling of the outcome mechanism in a single step. We demonstrate a close link between the matching method and GP regression for estimating the average treatment effect. The proposed method utilizes a distance similar to the Mahalanobis distance but determines the range of matching automatically, without imposing a caliper arbitrarily. Finally, we propose a Bayesian semiparametric approach for predicting the heterogeneous treatment effect for new patients using two conditionally independent Gaussian processes (GPs), one for the response surface of the control group and the other for the treatment effect. The prediction can be used to visualize the treatment effect and help researchers investigate its pattern across different patient baseline characteristics, and hence decide whether the treatment is effective for patients with certain characteristics and possibly define a subgroup for which the treatment is significantly effective. We also illustrate the proposed methods using real data obtained from different studies.

    Committee: Siva Sivaganesan Ph.D. (Committee Chair); Bin Huang (Committee Member); Seongho Song Ph.D. (Committee Member); Xia Wang Ph.D. (Committee Member) Subjects: Statistics
  • 18. Shi, Hongxiang Hierarchical Statistical Models for Large Spatial Data in Uncertainty Quantification and Data Fusion

    PhD, University of Cincinnati, 2017, Arts and Sciences: Mathematical Sciences

    Modeling of spatial data often encounters a computational bottleneck for large datasets and a change-of-support effect for data at different resolutions. There is a rich literature on how to tackle these two problems, but few works give a comprehensive solution that addresses them together. This dissertation aims to develop hierarchical models that can alleviate these two problems jointly in uncertainty quantification and data fusion. For uncertainty quantification, a fully Bayesian hierarchical model combined with the nearest neighbor Gaussian process is proposed to produce consistent parameter inferences at different resolutions for a large spatial surface. Simulation studies demonstrate the ability of the proposed model to provide consistent parameter inferences at different resolutions with only a fraction of the computing time of the traditional method. The method is then applied to real surface data. For data fusion, we propose a hierarchical model that can fuse two or more large spatial datasets from the exponential family of distributions. The change-of-support problem is handled, along with the computational bottleneck, by using a spatial random effect model for the underlying process. Through simulated and real data illustrations, the proposed data fusion method is demonstrated to possess a predictive advantage over the univariate-process modeling approach by borrowing strength across processes.

    Committee: Emily Kang Ph.D. (Committee Chair); Hang Joon Kim Ph.D. (Committee Member); Bledar Konomi Ph.D. (Committee Member); Siva Sivaganesan Ph.D. (Committee Member); Xia Wang Ph.D. (Committee Member) Subjects: Statistics
  • 19. Hanandeh, Ahmad Nonstationary Nearest Neighbors Gaussian Process Models

    PhD, University of Cincinnati, 2017, Arts and Sciences: Mathematical Sciences

    Modeling is an essential part of research and development in almost every sphere of modern life. Computer models are frequently used to explore physical systems, but they can be computationally expensive to evaluate (a single simulation at one input value may take days, weeks, or possibly months to run). In such settings, an emulator is used as a surrogate. The Gaussian process (GP) is a common and very useful way to develop emulators that describe the output of computer experiments and computationally expensive simulations in uncertainty quantification. Recently, much attention has been paid to dealing with large datasets, which arise in various fields of the natural and social sciences and from modern instruments. This has resulted in an increasing need for methods to analyze large datasets. However, the GP is nonparametric, meaning that the complexity of the model grows as more data points are received; as a result, it faces several computational challenges when modeling large datasets because of the need to calculate the inverse and determinant of a large, dense, unstructured matrix. We therefore need alternative methods to analyze such large datasets. Various methods have been developed to deal with this problem, including reduced-rank approaches and sparse matrix approximations. However, most of them rely on unrealistic assumptions about the underlying process, such as stationarity. We develop a new approximation

    Committee: Bledar Konomi Ph.D. (Committee Chair); Emily Kang Ph.D. (Committee Member); Hang Joon Kim Ph.D. (Committee Member); Siva Sivaganesan Ph.D. (Committee Member) Subjects: Statistics
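
    The bottleneck named in this abstract, the inverse and determinant of a large, dense covariance matrix, is exactly what an exact GP log-likelihood evaluation requires. The sketch below makes the O(n^3) Cholesky-based cost explicit on synthetic data; the kernel and its hyperparameters are arbitrary choices, not the dissertation's model.

```python
# Minimal sketch: the exact GP log-likelihood needs the inverse and log-determinant
# of the dense n x n covariance matrix. The Cholesky route below costs O(n^3) time
# and O(n^2) memory, which is why large datasets force low-rank, sparse, or
# nearest-neighbor approximations.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(7)

n = 2000
x = np.sort(rng.uniform(0, 10, n))
y = np.sin(x) + 0.1 * rng.standard_normal(n)

# Dense squared-exponential covariance with a nugget on the diagonal.
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.5**2) + 0.01 * np.eye(n)

L, lower = cho_factor(K, lower=True)       # O(n^3) factorization
alpha = cho_solve((L, lower), y)           # solves K alpha = y
logdet = 2.0 * np.sum(np.log(np.diag(L)))  # log|K| from the Cholesky diagonal
loglik = -0.5 * (y @ alpha + logdet + n * np.log(2 * np.pi))
print(f"exact GP log-likelihood for n={n}: {loglik:.1f}")
```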
  • 20. Turner, Jacob Improving the Sensitivity of a Pulsar Timing Array: Correcting for Interstellar Scattering Delays

    BA, Oberlin College, 2017, Physics and Astronomy

    The NANOGrav collaboration aims to detect low-frequency gravitational waves by measuring the arrival times of radio signals from pulsars. A confirmation of such a gravitational wave signal requires timing tens of pulsars with a precision of better than 100 nanoseconds for around 10–25 years. A crucial component of the success of pulsar timing is understanding how the interstellar medium (ISM) affects timing accuracy. Current pulsar timing models account only for the large-scale dispersion delays from the ISM. As a result, the relatively small-scale propagation effects caused by scattering are partially absorbed into the dispersion delay component of the model. In this thesis we developed a model that accounted for both dispersion delay and scattering delay. In addition to the two observable quantities used in the NANOGrav model, we also included the slopes of those two observables. We then simulated data describing the motion of a pulsar through the interstellar medium over 11 years. We used a weighted linear least squares formalism to solve the system of four equations in two parameters at every measurement epoch, in order to remove the effects of dispersion and scattering from the data as fully as possible. This model was successful at removing these effects.

    Committee: D.R. Stinebring (Advisor) Subjects: Astronomy; Astrophysics; Physics
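
    The weighted linear least-squares step described above can be sketched generically. The design matrix, observables, and uncertainties below are entirely invented and only illustrate how four measurements at one epoch are combined into two delay parameters with per-observable weights.

```python
# Minimal sketch (hypothetical design matrix and numbers): a weighted linear
# least-squares solve for two delay parameters from four observables at one epoch,
# in the spirit of separating dispersion-like and scattering-like contributions.
import numpy as np

# Each row relates one observable to the two parameters; values are illustrative.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.3, 0.0],
              [0.0, 0.4]])
obs = np.array([2.1e-6, 0.8e-6, 0.6e-6, 0.35e-6])  # measured delays/slopes (s)
sigma = np.array([5e-8, 8e-8, 1e-7, 1e-7])         # per-observable uncertainties (s)

W = np.diag(1.0 / sigma**2)                        # inverse-variance weights
cov = np.linalg.inv(A.T @ W @ A)                   # parameter covariance
params = cov @ (A.T @ W @ obs)                     # weighted least-squares estimate
print("estimated delay parameters (s):", params)
print("1-sigma uncertainties (s):", np.sqrt(np.diag(cov)))
```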