Search Results


(Total results 10)


  • 1. Ratnasingam, Suthakaran Sequential Change-point Detection in Linear Regression and Linear Quantile Regression Models Under High Dimensionality

    Doctor of Philosophy (Ph.D.), Bowling Green State University, 2020, Statistics

    Sequential change point analysis aims to detect structural change as quickly as possible after the process state changes. A good sequential change point detection procedure is expected to minimize both the detection delay and the risk of raising a false alarm. Existing sequential change point detection methods are univariate in nature and therefore not applicable to high-dimensional data. In the first part of the dissertation, we develop a monitoring method to detect structural change in a smoothly clipped absolute deviation (SCAD) penalized regression model for high-dimensional data after a historical sample of size m. The unknown pre-change regression coefficients are replaced by the SCAD penalized estimator. The asymptotic properties of the proposed test statistics are derived, and a simulation study evaluates the performance of the proposed method. The method is applied to gene expression data from the mammalian eye to detect changes sequentially. In the second part of the dissertation, we develop a sequential change point detection method to monitor structural changes in the SCAD penalized quantile regression (SPQR) model for high-dimensional data. We derive the asymptotic distributions of the test statistic under the null and alternative hypotheses. Furthermore, to improve the performance of the SPQR method, we propose the post-SCAD penalized quantile regression estimator (P-SPQR) for high-dimensional data. Simulations are conducted under different scenarios to study the finite-sample properties of the SPQR and P-SPQR methods, and a real data application demonstrates the effectiveness of the method. In the third and fourth parts of the dissertation, we investigate the change point problem for the skew-normal distribution and the three-parameter Weibull distribution, respectively. 
Besides detecting and obtaining the point estimate of a change location, we propose an estimation procedure based (open full item for complete abstract)

    Committee: Wei Ning PhD (Advisor); Andy Garcia PhD (Other); Hanfeng Chen PhD (Committee Member); Junfeng Shang PhD (Committee Member) Subjects: Statistics
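The sequential monitoring idea in entry 1 can be illustrated with a much simpler classical procedure. The sketch below is a one-sided CUSUM on residuals standardized against a historical sample, not the dissertation's SCAD-penalized statistic; the threshold and drift values are arbitrary illustrative choices.

```python
def cusum_monitor(historical, stream, threshold=5.0, drift=0.5):
    """Return the 1-based index in `stream` at which an alarm is raised,
    or None if no change is detected.

    Pre-change mean and standard deviation are estimated from the
    historical sample of size m; the one-sided CUSUM statistic
    accumulates standardized residuals minus a drift allowance."""
    m = len(historical)
    mu = sum(historical) / m                         # pre-change mean estimate
    sd = (sum((x - mu) ** 2 for x in historical) / (m - 1)) ** 0.5
    s = 0.0
    for t, x in enumerate(stream, start=1):
        z = (x - mu) / sd                            # standardized residual
        s = max(0.0, s + z - drift)                  # one-sided CUSUM recursion
        if s > threshold:
            return t                                 # alarm time (detection delay = t)
    return None
```

The trade-off the abstract describes is visible in the two parameters: a lower threshold shortens the detection delay but raises the false-alarm risk.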
  • 2. Girbino, Michael Detecting Distribution-Level Voltage Anomalies by Monitoring State Transitions in Voltage Regulation Control Systems

    Master of Sciences, Case Western Reserve University, 2019, EECS - System and Control Engineering

    This thesis describes the design and implementation of a finite state machine representation of a voltage regulation control system for a load tap-changing transformer. It also introduces a method for observing physical operating conditions based on sequences of state transitions within the control system. Simulation of a daily load pattern applied to a distribution network yields the probabilities of specific sequences occurring, which can be used to detect irregular behavior. State transition data correlated with an expected load pattern can be used to verify whether behavior is consistent with the sensor data observed by a system operator. Possible adverse scenarios include when either the operator or the control system is receiving falsified measurements, as would occur during a replay attack. The effectiveness of this technique is quantified by how quickly it can detect voltage anomalies as they occur.

    Committee: Kenneth Loparo PhD (Advisor); Vira Chankong PhD (Committee Member); Marija Prica PhD (Committee Member) Subjects: Electrical Engineering; Energy; Systems Design
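The detection approach in entry 2, flagging state-transition sequences whose probability under normal operation is too low, can be sketched with a simple n-gram frequency model. The state names below are hypothetical, not taken from the thesis's finite state machine.

```python
from collections import Counter

def train_transition_model(state_sequence, n=2):
    """Estimate the probability of each length-n window of state
    transitions observed during normal (expected-load) operation."""
    windows = [tuple(state_sequence[i:i + n])
               for i in range(len(state_sequence) - n + 1)]
    counts = Counter(windows)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def is_anomalous(model, window, min_prob=0.01):
    """Flag a transition window that was never seen (or was too rare)
    under the expected daily load pattern."""
    return model.get(tuple(window), 0.0) < min_prob
```

In the thesis's setting the model would be trained on transitions of the voltage regulation controller under a simulated daily load; a replay attack would then surface as transition sequences inconsistent with that model.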
  • 3. Tong, Xin Interactive Visual Clutter Management in Scientific Visualization

    Doctor of Philosophy, The Ohio State University, 2016, Computer Science and Engineering

    Scientists visualize their data and interact with them on computers in order to understand them thoroughly. Nowadays, data have become so large and complex that it is impossible to display an entire dataset in a single image. Scientific visualization often suffers from visual clutter because of high spatial resolution/dimension and temporal resolution. Interacting with visualizations of large data, on the other hand, allows users to dynamically explore different parts of the data and gradually understand all the information they contain. Information congestion and visual clutter arise in visualizations of many kinds of data, such as flow fields, tensor fields, and time-varying data. Occlusion presents a major challenge in visualizing 3D flow and tensor fields using streamlines: displaying too many streamlines creates a dense visualization filled with occluded structures, but displaying too few risks losing important features. Glyphs are a powerful multivariate visualization technique that conveys data through visual channels, but placing a large number of glyphs over the entire 3D space results in occlusion and visual clutter that make the visualization ineffective. To avoid occlusion in streamline and glyph visualizations, we propose a view-dependent interactive 3D lens that removes occluding streamlines/glyphs by pulling them aside through animations. High-resolution simulations are capable of generating very large vector fields that are expensive to store and analyze. In addition, the noise and/or uncertainty contained in the data often degrades the visualization by producing visual clutter that interferes with both the interpretation and identification of important features. Instead, we can store the distributions of many vector orientations and visualize these distributions with 3D glyphs, which greatly reduces visual clutter. 
Empowered by rapid advance of high performance computer architectures and software, it is (open full item for complete abstract)

    Committee: Han-Wei Shen (Advisor); Huamin Wang (Committee Member); Arnab Nandi (Committee Member) Subjects: Computer Engineering; Computer Science
  • 4. Yajima, Ayako Assessment of Soil Corrosion in Underground Pipelines via Statistical Inference

    Doctor of Philosophy, University of Akron, 2015, Civil Engineering

    In the oil industry, underground pipelines are the preferred means of transporting large amounts of liquid product. However, a considerable number of unforeseen incidents due to corrosion failure are reported each year. Since corrosion in underground pipelines is caused by physicochemical interactions between the material (steel pipeline) and the environment (soil), assessing soil as a corrosive environment is indispensable. Because of the complex characteristics of soil as a corrosion precursor influencing the dissolution process, soil cannot be explained fully by conventional semi-empirical methodologies defined in controlled settings; the uncertainties inherent in the dynamic and heterogeneous underground environment must be considered. Therefore, this work unifies direct assessment of soil and in-line inspection (ILI) with a probabilistic model to categorize soil corrosion. To pursue this task, we employed model-based clustering via Gaussian mixture models, performed on data collected from southeastern Mexico. The clustering approach helps prioritize areas to be inspected in terms of underground conditions and can improve repair decision making beyond what current assessment methodologies offer. This study also addresses two important issues related to in-situ data: missing data and truncated data. The typical approaches for treating missing data in civil engineering are ad hoc, and these conventional approaches may cause several critical problems, such as biased estimates, artificially reduced variance, and loss of statistical power. Therefore, this study presents a variant of the EM algorithm, called Informative EM (IEM), that performs clustering without filling in missing values prior to the analysis. 
This model-based method introduces additional cluster-specific Bernoulli parameters to exploit the nonuniformity of the frequency of missing values across cl (open full item for complete abstract)

    Committee: Robert Liang Dr. (Advisor); Chien-Chun Chan Dr. (Committee Member); Junliang Tao Dr. (Committee Member); Guo-Xiang Wang Dr. (Committee Member); Lan Zhang Dr. (Committee Member) Subjects: Civil Engineering
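Model-based clustering via Gaussian mixtures, as used in entry 4, rests on the EM algorithm. Below is a minimal one-dimensional, two-component sketch with crude quartile-based initialization; it is the standard EM, not the IEM variant with cluster-specific missingness parameters that the dissertation develops.

```python
import math

def em_gmm_1d(data, iters=200):
    """Fit a two-component 1-D Gaussian mixture by EM.
    Returns (means, variances, mixing weights)."""
    xs = sorted(data)
    n = len(xs)
    mu = [xs[n // 4], xs[3 * n // 4]]     # crude quartile initialization
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in data:
            dens = [pi[k] / math.sqrt(2 * math.pi * var[k])
                    * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                    for k in (0, 1)]
            s = dens[0] + dens[1]
            resp.append([d / s for d in dens])
        # M-step: re-estimate weights, means, and variances
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)    # guard against variance collapse
    return mu, var, pi
```

In the corrosion application each mixture component plays the role of a soil-corrosivity category, and the fitted responsibilities assign inspection segments to categories.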
  • 5. Fan, Huihao Test of Treatment Effect with Zero-Inflated Over-Dispersed Count Data from Randomized Single Factor Experiments

    PhD, University of Cincinnati, 2014, Medicine: Biostatistics (Environmental Health)

    Real-life count data are frequently characterized by over-dispersion (variance greater than the mean) and excess zeros. Various methods exist in the literature to combat zero-inflation and over-dispersion in count data. Among them, zero-inflated count models provide a parsimonious yet powerful way to model excess zeros while also allowing for over-dispersion. Such models assume that the counts are a mixture of two separate data-generating processes: one generates only zeros, and the other is a Poisson-type process. The most commonly discussed models are the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), and zero-inflated generalized Poisson (ZIGP). However, the performance and conditions of application of these models have not been thoroughly studied. In this work, these common zero-inflation models are reviewed and compared under specified over-dispersion conditions, via simulated and real-life data, in terms of statistical power and type I error rate. The performance of each model is presented side by side to give a clear view of its pros and cons under specific over-dispersion and zero-inflation conditions. Further, the ZIGP model is extended to a more general situation in which a random effect is incorporated to account for within-subject correlation and between-subject heterogeneity. Likelihood-based estimation of the treatment effect is developed for the analysis of randomized experiments with a random effect. The effect of model misspecification on performance is investigated with respect to type I error rate, standard error, and empirical statistical power. Case studies illustrate the application of these models.

    Committee: Marepalli Rao Ph.D. (Committee Chair); Jianfei Guo Ph.D. (Committee Member); Rakesh Shukla Ph.D. (Committee Member); Paul Succop Ph.D. (Committee Member); Changchun Xie Ph.D. (Committee Member) Subjects: Biostatistics
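The zero-inflated Poisson, the simplest of the models compared in entry 5, can be written down in a few lines: a count is a structural zero with probability psi, and otherwise a Poisson(lam) draw. The grid-search fit below is only to illustrate the likelihood, not the estimation used in the dissertation, and the example data are made up.

```python
import math

def zip_pmf(y, lam, psi):
    """P(Y = y) under a zero-inflated Poisson: with probability psi the
    process emits a structural zero, otherwise a Poisson(lam) count."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return psi * (y == 0) + (1 - psi) * poisson

def zip_loglik(data, lam, psi):
    """Log-likelihood of the sample under ZIP(lam, psi)."""
    return sum(math.log(zip_pmf(y, lam, psi)) for y in data)

def fit_zip_grid(data, lams, psis):
    """Crude grid-search MLE, enough to illustrate the likelihood surface."""
    return max(((lam, psi) for lam in lams for psi in psis),
               key=lambda p: zip_loglik(data, *p))
```

Note that the marginal variance of a ZIP exceeds its mean whenever psi is greater than 0, which is how the mixture accommodates over-dispersion as well as excess zeros.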
  • 6. Huang, Tao An Internet Based GIS Database Distribution System

    MCP, University of Cincinnati, 2001, Design, Architecture, Art, and Planning : Community Planning

    A geographic information system (GIS) has the power to facilitate the spatial perception of plans. The visualization of information and planning scenarios by means of integrated text and maps gives a user the context and spatial perception of the information content. Although interactive mapping, or Internet GIS, has developed rapidly over the past few years, there is no planning-oriented GIS system that shows how spatial information can be used to inform a given interest group or the general public through the Internet. This project designs a user-friendly interface to distribute GIS-related data to general users or GIS technical users, who can directly access the GIS database through Web sites and view maps for their area of interest. It provides not only general map-viewing functions, such as "zoom in", "zoom out", and "identify", but also advanced functions such as thematic mapping, map querying, and map searching. The advanced mapping functions make this data distribution system quite different from other commercial or governmental mapping systems: it allows people to directly access the GIS database and have the related GIS data distributed upon request. With this system, a planner can convey more detailed information about planning issues to more people than the traditional way, a public meeting, can. Additionally, all information is presented not only in memos or letters but also in colorful, analyzable maps. The three-scale feature of this project also allows the general public to select any area and any information of interest, from the metropolitan transportation network of the OKI Region to the land use limitation of a single parcel. All data utilized by this project come from the Cincinnati Area Geographic Information Systems (CAGIS), which stipulated that the data be used for educational purposes only. 
Although the map service system has the ability to support more planning-related research and analysis, currently the map service area only co (open full item for complete abstract)

    Committee: Xinhao Wang (Advisor) Subjects: Urban and Regional Planning
  • 7. ZHONG, WEI STATISTICAL APPROACHES TO ANALYZE CENSORED DATA WITH MULTIPLE DETECTION LIMITS

    PhD, University of Cincinnati, 2005, Medicine : Biostatistics (Environmental Health)

    Censored data with multiple detection limits frequently arise in environmental health studies, where data are collected by different sampling methods and measured by different analytical procedures, or where data are combined from multiple laboratories. The substitution method is the most common approach to handling censored environmental data; however, it lacks a theoretical basis, and results can differ substantially depending on the substituted value. Maximum likelihood estimation (MLE) with the Expectation-Maximization (EM) algorithm and a meta-analysis method were studied to determine whether they can overcome these problems and incorporate the sample collection process into the estimation of summary statistics. A new likelihood-based Z-score test and a resampling-based permutation test were also introduced to compare the means of two censored data groups; they were expected to provide higher power and type I error rates closer to the nominal level than the usual two-sample t-test. The proposed methods were evaluated through a series of simulation studies, and their performance was compared to that of the conventional methods. Simulation results consistently showed that the proposed MLE with the EM algorithm and the meta-analysis method provided the most accurate and efficient estimation of summary statistics for censored data with multiple detection limits. The simulations also suggested that the amount of censoring, the magnitude of the variance, and disparities in sample size influenced the statistical estimation. The proposed Z-score test and permutation test were superior to the usual two-sample t-test: they provided better power and type I error rates in the simulation studies, and thus should be recommended for comparing means between two censored data groups. 
The proposed methods were successfully applied to two data sets collected by environmental health studies; the obtained summary statistics and significant test re (open full item for complete abstract)

    Committee: Dr. Rakesh Shukla (Advisor) Subjects:
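The likelihood that entry 7's MLE/EM approach maximizes has a simple form when each nondetect keeps its own detection limit: detects contribute a density term and each censored value contributes the CDF at its limit. A normal-model sketch (the data below are illustrative, not from the studies):

```python
import math

def norm_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - (x - mu) ** 2 / (2 * sigma ** 2))

def norm_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) at x, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def censored_loglik(observed, detection_limits, mu, sigma):
    """Log-likelihood for normal data left-censored at observation-specific
    detection limits: density terms for detects, CDF terms for nondetects.
    Each nondetect may carry a different limit (multiple labs/methods)."""
    ll = sum(norm_logpdf(x, mu, sigma) for x in observed)
    ll += sum(math.log(norm_cdf(dl, mu, sigma)) for dl in detection_limits)
    return ll
```

Maximizing this function (directly or via EM, treating the censored values as missing data) avoids the arbitrariness of substituting a fixed fraction of the detection limit.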
  • 8. Dong, Fanglong Bayesian Model Checking in Multivariate Discrete Regression Problems

    Doctor of Philosophy (Ph.D.), Bowling Green State University, 2008, Mathematics/Probability and Statistics

    Ordinal data are common in academic settings, such as student grades (A, B, C, D, or F), and in other areas, such as customer satisfaction surveys. It is straightforward to fit a regression model to reflect the relationship between the response and the predictors. Because the response in an ordinal data set is a vector, however, it is not clear how traditional statistics should define residuals and detect outliers. With the introduction of a latent variable, we can model the data through the latent variable, obtaining a new type of residual called the latent residual; this makes it easy to define residuals and detect outliers. In practice there is usually more than one predictor in the data set, and we need to decide which variables should be included in the model; we examine this from both a frequentist's and a Bayesian perspective. Likewise, when we fit a model to a data set, we care about how well the model fits, and again we take both a frequentist's and a Bayesian perspective. Methods from the frequentist's perspective usually rely on an asymptotic distribution to draw conclusions, which can become a problem when the sample size is small; methods from the Bayesian perspective, by contrast, use simulation and thus remove the reliance on the asymptotic distribution. Chapter 3 discusses methods for outlier detection, Chapter 4 discusses goodness-of-fit and model selection, and in Chapter 5 we apply the methods from Chapters 3 and 4 to the BGSU student data set. Chapter 6 summarizes the dissertation and outlines possible future research interests and applications.

    Committee: James Albert (Advisor); Madhu Rao (Committee Member); Hanfeng Chen (Committee Member); John Chen (Committee Member) Subjects: Statistics
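The latent residual in entry 8 can be made concrete: given a fitted linear predictor and cutpoints, sample the latent variable from a normal distribution truncated to the interval the observed category implies, and take its deviation from the linear predictor. A sketch with made-up cutpoints, assuming a probit link and unit latent variance (this is a generic illustration, not the dissertation's Bayesian checking procedure):

```python
import random
from statistics import NormalDist

def latent_residual(eta, category, cutpoints, rng):
    """Draw one latent residual for an ordinal observation. `eta` is the
    linear predictor; `category` is 1-based among K = len(cutpoints) + 1
    levels; the latent z ~ N(eta, 1) is sampled truncated to the interval
    (c_{k-1}, c_k) implied by the observed category, via inverse-CDF."""
    nd = NormalDist()
    k, K = category, len(cutpoints) + 1
    lo = nd.cdf(cutpoints[k - 2] - eta) if k > 1 else 0.0
    hi = nd.cdf(cutpoints[k - 1] - eta) if k < K else 1.0
    u = min(max(rng.uniform(lo, hi), 1e-12), 1 - 1e-12)  # keep inv_cdf finite
    return nd.inv_cdf(u)          # equals z - eta, since z = eta + inv_cdf(u)
```

Repeating the draw over posterior samples of the model parameters yields the distribution of latent residuals used to flag outlying observations.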
  • 9. Li, Hailong Analytical Model for Energy Management in Wireless Sensor Networks

    PhD, University of Cincinnati, 2013, Engineering and Applied Science: Computer Science and Engineering

    Wireless sensor networks (WSNs) are a type of ad hoc network with a data-collecting function. Because of their low-power, low-cost features, WSNs attract much attention from both academia and industry. However, since a WSN is battery-driven and its multi-hop transmission pattern introduces the energy hole problem, energy management has become one of the fundamental issues for WSNs. In this dissertation, we study energy management strategies for WSNs. First, we propose a packet propagation scheme for both deterministic and random deployments of WSNs to prolong their lifetime; its essence is to control transmission power so as to balance energy consumption across the entire WSN. Second, a characteristic-correlation-based data aggregation approach is presented: redundant information collected during data gathering is effectively mitigated so as to reduce packet transmissions in the WSN, increasing its lifetime with limited overhead. Third, we provide a two-tier lifetime optimization strategy for wireless visual sensor networks (VSNs): by deploying redundant, cheaper relay nodes into an existing VSN, its lifetime is maximized at minimal cost. Fourth, our two-tier visual sensor network deployment is further extended to consider multiple base stations and image compression. Last but not least, a description of the UC AirNet WSN project is presented. At the end, we also consider future research topics on energy management schemes for WSNs.

    Committee: Dharma Agrawal D.Sc. (Committee Chair); Kenneth Berman Ph.D. (Committee Member); Yizong Cheng Ph.D. (Committee Member); Chia Han Ph.D. (Committee Member); Wen Ben Jone Ph.D. (Committee Member) Subjects: Computer Engineering
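The energy hole problem mentioned in entry 9 is easy to quantify in a simple corona model: with equal node counts per ring and traffic funneling toward the sink, a node one hop from the sink relays packets for every outer ring and so drains first. A first-order radio-model sketch (all constants are illustrative, not taken from the dissertation):

```python
def ring_energy_per_round(num_rings, hop_d=10.0, e_elec=50e-9, eps=100e-12, alpha=2):
    """Per-node energy per data-collection round for ring i (index 0 is
    closest to the sink), assuming one unit-size packet per node per round,
    equal node counts per ring, and a first-order radio model: transmit
    cost e_elec + eps * d**alpha and receive cost e_elec per packet."""
    tx = e_elec + eps * hop_d ** alpha
    energies = []
    for i in range(num_rings):
        forwarded = num_rings - i      # own packet plus all outer rings'
        received = forwarded - 1       # everything except its own packet
        energies.append(forwarded * tx + received * e_elec)
    return energies
```

The monotonically increasing load toward the sink is exactly what the dissertation's transmission-power control and relay-node deployment strategies aim to rebalance.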
  • 10. Pailden, Junvie Applications of Empirical Likelihood to Zero-Inflated Data and Epidemic Change Point

    Doctor of Philosophy (Ph.D.), Bowling Green State University, 2013, Statistics

    Many studies in health care deal with zero-inflated data sets characterized by a significant proportion of zeros and highly skewed positive values. Although it is common practice to use the median instead of the mean as the measure of central location in skewed data, many applications require the mean; for instance, the mean can be used to recover the total medical cost, which reflects the entire expenditure on health care in a given patient population. For testing the value of a mean, the empirical likelihood method offers the benefit of making no distributional assumptions beyond some mild moment conditions, while retaining the same advantages that parametric likelihood-based tests enjoy. In this dissertation, we propose an empirical likelihood ratio test for the difference between the means of two zero-inflated samples. The proposed test is derived by jointly specifying the empirical likelihood for the mean parameter and the probability of a zero value in the data. The procedure has two unique features: the information contained in the zero observations is fully utilized, and the proposed test is insensitive to the skewness of the non-zero observations. We derive an asymptotic distribution used to calibrate the statistic for testing the null hypothesis of no mean difference, and we extend the procedure to testing the equality of means of several independent zero-inflated populations. As a benchmark against conventional tests, we investigate empirical type I error and power rates in finite-sample settings; both the two-sample test for the mean difference and the test for equality of means among three or more populations exhibit comparable, if not superior, finite-sample performance. Another application of the empirical likelihood approach that we consider is detecting an epidemic change point in a sequence of observations. 
Under some mild conditions, the asymptotic null distribution of the te (open full item for complete abstract)

    Committee: Hanfeng Chen (Advisor); Eric Worch (Committee Member); John Chen (Committee Member); Wei Ning (Committee Member) Subjects: Statistics
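Entry 10's building block, the empirical likelihood ratio for a univariate mean, fits in a short function. This is Owen's basic one-sample version, solved by bisection on the Lagrange multiplier; it is not the joint zero-inflated formulation developed in the dissertation.

```python
import math

def el_log_ratio(data, mu0, tol=1e-10):
    """Empirical-likelihood statistic -2*log R(mu0) for a univariate mean,
    asymptotically chi-square(1) under H0: E[X] = mu0. The weights are
    w_i = 1 / (n * (1 + lam * (x_i - mu0))), with lam solving the
    estimating equation sum (x_i - mu0) / (1 + lam * (x_i - mu0)) = 0."""
    d = [x - mu0 for x in data]
    if min(d) >= 0 or max(d) <= 0:
        return float("inf")            # mu0 outside the convex hull: R(mu0) = 0
    lo = -1.0 / max(d) + 1e-12         # bracket keeping every 1 + lam*d_i > 0
    hi = -1.0 / min(d) - 1e-12

    def g(lam):
        return sum(x / (1 + lam * x) for x in d)

    while hi - lo > tol:               # g is strictly decreasing on (lo, hi)
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    return 2 * sum(math.log(1 + lam * x) for x in d)
```

At the sample mean the multiplier is zero and the statistic vanishes; it grows as mu0 moves away from the sample mean, which is what makes it usable as a test statistic against the chi-square calibration.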