Search Results (1 - 10 of 10 Results)


Chen, Jitong. On Generalization of Supervised Speech Separation
Doctor of Philosophy, The Ohio State University, 2017, Computer Science and Engineering
Speech is essential for human communication as it not only delivers messages but also expresses emotions. In reality, speech is often corrupted by background noise and room reverberation. Perceiving speech in low signal-to-noise ratio (SNR) conditions is challenging, especially for hearing-impaired listeners. Therefore, we are motivated to develop speech separation algorithms to improve intelligibility of noisy speech. Given its many applications, such as hearing aids and robust automatic speech recognition (ASR), speech separation has been an important problem in speech processing for decades. Speech separation can be achieved by estimating the ideal binary mask (IBM) or ideal ratio mask (IRM). In a time-frequency (T-F) representation of noisy speech, the IBM preserves speech-dominant T-F units and discards noise-dominant ones. Similarly, the IRM adjusts the gain of each T-F unit to suppress noise. As such, speech separation can be treated as a supervised learning problem where one estimates the ideal mask from noisy speech. Three key components of supervised speech separation are learning machines, acoustic features and training targets. This supervised framework has enabled the treatment of speech separation with powerful learning machines such as deep neural networks (DNNs). For any supervised learning problem, generalization to unseen conditions is critical. This dissertation addresses generalization of supervised speech separation. We first explore acoustic features for supervised speech separation in low SNR conditions. An extensive list of acoustic features is evaluated for IBM estimation. The list includes ASR features, speaker recognition features and speech separation features. In addition, we propose the Multi-Resolution Cochleagram (MRCG) feature to incorporate both local information and broader spectrotemporal contexts. 
We find that gammatone-domain features, especially the proposed MRCG features, perform well for supervised speech separation at low SNRs. Noise segment generalization is desired for noise-dependent speech separation. When tested on the same noise type, a learning machine needs to generalize to unseen noise segments. For nonstationary noises, there exists a considerable mismatch between training and testing segments, which leads to poor performance during testing. We explore noise perturbation techniques to expand training noise for better generalization. Experiments show that frequency perturbation effectively reduces false-alarm errors in mask estimation and leads to improved objective metrics of speech intelligibility. Speech separation in unseen environments requires generalization to unseen noise types, not just noise segments. By exploring large-scale training, we find that a DNN based IRM estimator trained on a large variety of noises generalizes well to unseen noises. Even for highly nonstationary noises, the noise-independent model achieves similar performance as noise-dependent models in terms of objective speech intelligibility measures. Further experiments with human subjects lead to the first demonstration that supervised speech separation improves speech intelligibility for hearing-impaired listeners in novel noises. Besides noise generalization, speaker generalization is critical for many applications where target speech may be produced by an unseen speaker. We observe that training a DNN with many speakers leads to poor speaker generalization. The performance on seen speakers degrades as additional speakers are added for training. Such a DNN suffers from the confusion of target speech and interfering speech fragments embedded in noise. We propose a model based on recurrent neural network (RNN) with long short-term memory (LSTM) to incorporate the temporal dynamics of speech. 
We find that the trained LSTM keeps track of a target speaker and substantially improves speaker generalization over DNNs. Experiments show that the proposed model generalizes to unseen noises, unseen SNRs, and unseen speakers.
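The two training targets named in the abstract are simple to state concretely. The sketch below computes an IBM and an IRM per time-frequency unit; the toy energy matrices are illustrative stand-ins, not the dissertation's features, and the 0 dB local criterion is an assumption.

```python
import numpy as np

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """IBM: 1 where the local SNR exceeds the criterion, else 0."""
    snr_db = 10 * np.log10(speech_energy / noise_energy)
    return (snr_db > lc_db).astype(float)

def ideal_ratio_mask(speech_energy, noise_energy):
    """IRM: a soft gain in [0, 1] derived from per-unit energies."""
    return np.sqrt(speech_energy / (speech_energy + noise_energy))

# Toy T-F energies (frequency x time), purely illustrative
S = np.array([[4.0, 1.0], [0.25, 9.0]])
N = np.array([[1.0, 1.0], [1.0, 1.0]])
ibm = ideal_binary_mask(S, N)
irm = ideal_ratio_mask(S, N)
```

A supervised separator then learns to predict these masks from noisy-speech features, which is the framing the dissertation builds on.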

Committee:

DeLiang Wang (Advisor); Eric Fosler-Lussier (Committee Member); Eric Healy (Committee Member)

Subjects:

Computer Science; Engineering

Keywords:

Speech separation; speech intelligibility; computational auditory scene analysis; mask estimation; supervised learning; deep neural networks; acoustic features; noise generalization; SNR generalization; speaker generalization

Chen, Zhiang. Deep-learning Approaches to Object Recognition from 3D Data
Master of Sciences, Case Western Reserve University, 2017, EMC - Mechanical Engineering
This thesis focuses on deep-learning approaches to recognition and pose estimation of graspable objects using depth information. Recognition and orientation detection from depth-only data is enabled by a carefully designed 2D descriptor computed from 3D point clouds. Deep-learning approaches are explored from two main directions: supervised learning and semi-supervised learning. The disadvantages of supervised learning approaches motivate the exploration of unsupervised pretraining: by learning good representations in early layers, subsequent layers can be trained faster and with better performance. The learning process is then examined from a probabilistic perspective, which paves the way for developing networks based on Bayesian models, including Variational Auto-Encoders. Exploitation of knowledge transfer (re-using parameters learned from alternative training data) is shown to be effective in the present application.
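Encoding a point cloud as a 2D array can be illustrated with a toy orthographic projection. The grid size and minimum-depth aggregation below are assumptions for illustration, not the thesis's carefully designed descriptor.

```python
import numpy as np

def depth_descriptor(points, grid=8):
    """Project a 3D point cloud onto the x-y plane and record, per cell,
    the minimum depth (z), a crude stand-in for a learned 2D descriptor."""
    desc = np.full((grid, grid), np.nan)
    xy = points[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    idx = np.clip(((xy - lo) / (hi - lo + 1e-9) * grid).astype(int), 0, grid - 1)
    for (i, j), z in zip(idx, points[:, 2]):
        desc[i, j] = z if np.isnan(desc[i, j]) else min(desc[i, j], z)
    return np.nan_to_num(desc, nan=0.0)   # empty cells become 0

cloud = np.random.default_rng(0).uniform(size=(500, 3))
D = depth_descriptor(cloud)
```

Such a fixed-size 2D encoding is what allows standard convolutional or fully connected networks to consume depth-only data.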

Committee:

Wyatt Newman, PhD (Advisor); M. Cenk Çavusoglu, PhD (Committee Member); Roger Quinn, PhD (Committee Member)

Subjects:

Computer Science; Medical Imaging; Nanoscience; Robotics

Keywords:

deep learning; 3D object recognition; semi-supervised learning; knowledge transfer

Que, Qichao. Integral Equations For Machine Learning Problems
Doctor of Philosophy, The Ohio State University, 2016, Computer Science and Engineering
Supervised learning algorithms have achieved significant success in the last decade. To further improve learning performance, we still need a better understanding of semi-supervised learning algorithms for leveraging a large amount of unlabeled data. In this dissertation, a new approach for semi-supervised learning is discussed, which takes advantage of unlabeled data through an integral operator associated with a kernel function. More specifically, several problems in machine learning are formulated as a regularized Fredholm integral equation, which has been well studied in the literature on inverse problems. Under this framework, we propose several simple and easily implementable algorithms with sound theoretical guarantees. First, a new framework for supervised learning is proposed, referred to as Fredholm learning. It allows a natural way to incorporate unlabeled data and is flexible in the choice of regularization. In particular, we connect this new learning framework to the classical algorithm of radial basis function (RBF) networks, and analyze two common forms of regularization for RBF networks: one based on the squared norm of the coefficients in a network, and another using centers obtained by k-means clustering. We provide a theoretical analysis of these methods as well as a number of experimental results, pointing out very competitive empirical performance as well as certain advantages over standard kernel methods in terms of both flexibility (incorporating unlabeled data) and computational complexity. Moreover, the Fredholm learning algorithm can be interpreted as a special form of kernel method using a data-dependent kernel. Our analysis shows that Fredholm kernels achieve a noise-suppressing effect under a new assumption for semi-supervised learning, termed the "noise assumption".
We also address the problem of estimating the probability density ratio function q/p, given the marginal distribution p for training data and q for testing data, which can be used to solve the covariate shift problem in transfer learning. Our approach is based on reformulating the estimation of q/p as an inverse problem in terms of a Fredholm integral equation. This formulation, combined with the techniques of regularization and kernel methods, leads to a principled kernel-based framework for constructing algorithms and for analyzing them theoretically. The resulting family of algorithms, termed the FIRE algorithm for Fredholm Inverse Regularized Estimator, is flexible, simple, and easy to implement. Several encouraging experimental results are presented, especially applications to classification and semi-supervised learning within the covariate shift framework. We also show how the hyper-parameters in the FIRE algorithm can be chosen in a completely unsupervised manner.
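The density-ratio task can be made concrete with a short kernel least-squares sketch. This is a uLSIF-style estimator, not the FIRE algorithm itself; the bandwidth `s`, the regularization `lam`, and the Gaussian toy data are all illustrative choices.

```python
import numpy as np

def rbf(a, b, s=0.5):
    """Gaussian kernel matrix between row-sample arrays a and b."""
    d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d / (2 * s * s))

def ratio_estimate(xp, xq, lam=1e-1, s=0.5):
    """Fit q/p as a kernel expansion over the q-samples by regularized
    least squares (a uLSIF-style sketch, not the FIRE algorithm)."""
    Phi_p = rbf(xp, xq, s)             # basis functions evaluated under p
    H = Phi_p.T @ Phi_p / len(xp)      # approximates E_p[phi phi^T]
    h = rbf(xq, xq, s).mean(axis=0)    # approximates E_q[phi]
    alpha = np.linalg.solve(H + lam * np.eye(len(xq)), h)
    return lambda x: rbf(x, xq, s) @ alpha

rng = np.random.default_rng(1)
xp = rng.normal(0.0, 1.0, size=(400, 1))   # "training" distribution p
xq = rng.normal(0.5, 1.0, size=(400, 1))   # shifted "test" distribution q
r = ratio_estimate(xp, xq)
```

Reweighting training losses by such an estimate of q/p is the standard correction for covariate shift that the abstract refers to.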

Committee:

Mikhail Belkin (Advisor); Yusu Wang (Committee Member); DeLiang Wang (Committee Member); Yoonkyung Lee (Committee Member)

Subjects:

Computer Science

Keywords:

Machine Learning; RBF Networks; Supervised Learning; Kernel Methods; Fredholm Equations; Covariate Shift

Jin, Zhaozhang. Monaural Speech Segregation in Reverberant Environments
Doctor of Philosophy, The Ohio State University, 2010, Computer Science and Engineering

Room reverberation is a major source of signal degradation in real environments. While listeners excel in "hearing out" a target source from sound mixtures in noisy and reverberant conditions, simulating this perceptual ability remains a fundamental challenge. The goal of this dissertation is to build a computational auditory scene analysis (CASA) system that separates target voiced speech from its acoustic background in reverberant environments. A supervised learning approach to pitch-based grouping of reverberant speech is proposed, followed by a robust multipitch tracking algorithm based on a hidden Markov model (HMM) framework. Finally, a monaural CASA system for reverberant speech segregation is designed by combining the supervised learning approach and the multipitch tracker.

Monaural speech segregation in reverberant environments is a particularly challenging problem. Although inverse filtering has been proposed to partially restore the harmonicity of reverberant speech before segregation, this approach is sensitive to specific source/receiver and room configurations. Assuming that the true target pitch is known, our first study leads to a novel supervised learning approach to monaural segregation of reverberant voiced speech, which learns to map a set of pitch-based auditory features to a grouping cue encoding the posterior probability of a time-frequency (T-F) unit being target dominant given observed features. We devise a novel objective function for the learning process, which directly relates to the goal of maximizing signal-to-noise ratio. The model trained using this objective function yields significantly better T-F unit labeling. A segmentation and grouping framework is utilized to form reliable segments under reverberant conditions and organize them into streams. Systematic evaluations show that our approach produces very promising results under various reverberant conditions and generalizes well to new utterances and new speakers.

Multipitch tracking in real environments is critical for speech signal processing. Determining pitch in both reverberant and noisy conditions is another difficult task. In the second study, we propose a robust algorithm for multipitch tracking in the presence of background noise and room reverberation. A new channel selection method is utilized to extract periodicity features. We derive pitch scores for each pitch state, which estimate the likelihoods of the observed periodicity features given pitch candidates. An HMM integrates these pitch scores and searches for the best pitch state sequence. Our algorithm can reliably detect single and double pitch contours in noisy and reverberant conditions.
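The HMM search step described above can be illustrated with a generic Viterbi decoder. The two pitch states, the observation scores, and the transition matrix below are toy values, not the dissertation's trained models.

```python
import numpy as np

def viterbi(log_obs, log_trans, log_init):
    """Return the most likely state sequence given per-frame log
    observation scores (T x S), a log transition matrix, and log priors."""
    T, S = log_obs.shape
    delta = log_init + log_obs[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + log_trans   # cand[i, j]: best path ending i -> j
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + log_obs[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):           # trace back the best sequence
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Two toy pitch states; the sticky transition matrix favours smooth contours
obs = np.log(np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]))
trans = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
path = viterbi(obs, trans, np.log(np.array([0.5, 0.5])))
```

In the dissertation's setting, the observation scores come from the periodicity-based pitch scores and the states encode pitch candidates (including single- and double-pitch hypotheses).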

Building on the first two studies, we propose a CASA approach to monaural segregation of reverberant voiced speech, which performs multipitch tracking of reverberant mixtures and supervised classification. Speech and nonspeech models are separately trained, and each learns to map pitch-based features to the posterior probability of a T-F unit being dominated by the source with the given pitch estimate. Because interference can be either speech or nonspeech, a likelihood ratio test is introduced to select the correct model for labeling corresponding T-F units. Experimental results show that the proposed system performs robustly in different types of interference and various reverberant conditions, and has a significant advantage over existing systems.

Committee:

DeLiang Wang, PhD (Advisor); Eric Fosler-Lussier, PhD (Committee Member); Mikhail Belkin, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

computational auditory scene analysis; monaural segregation; multipitch tracking; pitch determination algorithm; room reverberation; speech separation; supervised learning

Cao, Baoqiang. On Applications of Statistical Learning to Biophysics
PhD, University of Cincinnati, 2007, Arts and Sciences : Physics
In this dissertation, we develop statistical and machine learning methods for problems in biological systems and processes. In particular, we are interested in two problems: predicting structural properties of membrane proteins, and clustering genes based on microarray experiments. In the membrane protein problem, we introduce a compact representation for amino acids and build a neural network predictor based on it to identify transmembrane domains of membrane proteins. Membrane proteins are divided into two classes based on the secondary structure of the parts spanning the lipid bilayer: alpha-helical and beta-barrel membrane proteins. We further build a support vector regression model to predict the lipid exposure levels of the amino acids within the transmembrane domains of alpha-helical membrane proteins. We also develop methods to predict pore-forming residues in beta-barrel membrane proteins. In the other problem, we apply a context-specific Bayesian clustering model to cluster genes based on their expression levels and cDNA copy numbers. This dissertation is organized as follows. Chapter 1 introduces the most relevant biology and the statistical and machine learning methods. Chapters 2 and 3 focus on prediction of transmembrane domains for the alpha-helix and beta-barrel classes, respectively. Chapter 4 discusses the prediction of relative lipid accessibility, a different structural property of membrane proteins. The final chapter addresses the gene clustering approach.
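Window-based transmembrane prediction can be sketched with a toy detector that flags residues whose windowed average hydrophobicity exceeds a cutoff. The Kyte-Doolittle scale values are standard, but this rule-based detector, the window size, and the cutoff are illustrative stand-ins for the dissertation's neural-network predictor.

```python
# Kyte-Doolittle hydrophobicity values for a subset of amino acids
KD = {'I': 4.5, 'V': 4.2, 'L': 3.8, 'F': 2.8, 'A': 1.8, 'G': -0.4,
      'S': -0.8, 'T': -0.7, 'N': -3.5, 'D': -3.5, 'K': -3.9, 'R': -4.5}

def tm_flags(seq, win=5, cutoff=1.5):
    """Flag residues whose windowed mean hydrophobicity exceeds the cutoff,
    a crude proxy for membrane-spanning (transmembrane) segments."""
    half = win // 2
    flags = []
    for i in range(len(seq)):
        window = seq[max(0, i - half): i + half + 1]
        avg = sum(KD.get(c, 0.0) for c in window) / len(window)
        flags.append(avg > cutoff)
    return flags

# Toy sequence: charged flanks around a hydrophobic (membrane-like) core
flags = tm_flags("KKSNILLVVIAGDKRS")
```

A learned predictor replaces the fixed scale and cutoff with weights trained on labeled transmembrane domains, which is the approach the dissertation takes.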

Committee:

Dr. Mark Jarrell (Advisor)

Subjects:

Physics, Molecular

Keywords:

membrane protein; transmembrane domains; supervised learning; unsupervised learning; neural networks; support vector machine; linear regression; classification; gene clustering; relative lipid accessibility; Bayesian inference; hierarchical clustering

Sinha, Kaushik. New Directions in Gaussian Mixture Learning and Semi-supervised Learning
Doctor of Philosophy, The Ohio State University, 2010, Computer Science and Engineering

High dimensional data analysis involves, among many other tasks, modeling the unknown underlying data generating process and predicting one of the few possible sources that might have generated the data. If data have some underlying pattern or structure, it is conceivable that such a hidden pattern or structure might provide important cues toward predicting the possible data generating source. On the other hand, for modeling purposes, it is important to understand the computational aspects of the problem of learning model parameters, especially the dependence on dimension. This thesis addresses the computational aspects of such a modeling problem especially in high dimension and also describes a novel predictive framework that exploits the underlying structure/pattern present in data.

The first part of this thesis addresses the problem of Gaussian mixture learning. The Gaussian mixture model is a fundamental model in applied statistics and a very popular choice for many scientific and engineering modeling tasks. However, the computational aspects of learning the parameters of this model, especially in high dimension, are not well understood. The question of polynomial learnability of probability distributions, particularly Gaussian mixture distributions, has recently received significant attention in the theoretical computer science and learning theory communities. Despite major progress, the general question of polynomial learnability of Gaussian mixture distributions remained open. The result presented in this thesis resolves the question of polynomial learnability for Gaussian mixtures in high dimension with an arbitrary but fixed number of components.
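The estimation task whose complexity the thesis analyzes is, in practice, usually solved with expectation-maximization. The 1-D sketch below (quantile initialization and synthetic data are illustrative choices) shows that task; it is not the thesis's algorithm or its learnability analysis.

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """Plain EM for a 1-D Gaussian mixture: alternate responsibilities
    (E-step) and weighted parameter updates (M-step)."""
    mu = np.quantile(x, np.linspace(0.25, 0.75, k))   # spread-out init
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
               / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and variances
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi

rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])
mu, var, pi = em_gmm_1d(x)
```

EM carries no polynomial-time guarantee in general, which is precisely why the learnability question the thesis resolves is nontrivial.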

The second part of this thesis describes studies on semi-supervised learning for multi-modal data, especially when data have natural clusters. Many real world datasets in scientific/engineering applications have clusters, e.g., hand written digit dataset or breast cancer dataset. To deal with such multi-modal data, this thesis proposes a novel framework which establishes a natural connection between “cluster assumption” in semi-supervised learning and sparse approximation. Specifically, when the number of labeled data is limited, the proposed method can learn a classifier which has a sparse representation in an appropriate set of bases and whose performance is comparable to the state of the art semi-supervised learning algorithms on a number of real world datasets.

Committee:

Mikhail Belkin, PhD (Advisor); DeLiang Wang, PhD (Committee Member); Gagan Agrawal, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

Gaussian Mixture Learning; Semi-supervised Learning

Mirzaei, Golrokh. Data Fusion of Infrared, Radar, and Acoustics Based Monitoring System
Doctor of Philosophy, University of Toledo, 2014, Engineering
Many bird and bat fatalities have been reported in the vicinity of wind farms. A system based on acoustics, an infrared camera, and marine radar is developed to monitor the nocturnal migration of birds and bats. The system is deployed and tested in an area of potential wind farm development, which is also a stopover for migrating birds and bats. Multi-sensory data fusion is developed based on acoustics, an infrared (IR) camera, and radar. The diversity of the sensor technologies complicated its development, and different signal processing techniques were developed for the various types of data. Data fusion is then implemented across the three diverse sensors in order to make inferences about the targets. This approach reduces uncertainty and provides a desired level of confidence and detailed information about the patterns. This work is a unique, multifidelity, and multidisciplinary approach based on pattern recognition, machine learning, signal processing, bio-inspired computing, probabilistic methods, and fuzzy reasoning. Sensors were located in the western basin of Lake Erie in Ohio and were used to collect data over the migration periods of 2011 and 2012. Acoustic data were collected using acoustic detectors (SM2 and SM2BAT) and preprocessed to convert the recorded files to standard wave format. Acoustic processing was performed in two steps: feature extraction and classification. Acoustic features of bat echolocation calls were extracted using three different techniques: Short Time Fourier Transform (STFT), Mel Frequency Cepstrum Coefficients (MFCC), and Discrete Wavelet Transform (DWT). These features were fed into an Evolutionary Neural Network (ENN) for classification at the species level. Results from the different feature extraction techniques were compared based on classification accuracy. The technique can identify bats and will contribute toward developing mitigation procedures for reducing bat fatalities.
Infrared videos were collected using a thermal IR camera (FLIR SR 19) and preprocessed to convert the videos to frames. Three different background subtraction techniques were applied to detect moving objects in the IR data. Thresholding was performed for image binarization using the extended Otsu threshold, and morphological operations were applied for noise suppression and filtering. The results of the three techniques were compared, and the selected technique (running average), followed by thresholding and filtering, was used for tracking and information extraction. An Ant-based Clustering Algorithm (ACA) following Lumer and Faieta, in three variations (standard ACA, different-speed ACA, and short-memory ACA), was implemented over the extracted features, and the variations were compared in terms of the groups created for the detected avian data. Fuzzy C-Means (FCM) was also implemented to group the targets. Radar data were collected using a Furuno marine radar (XANK250) with a T-bar antenna and a parabolic dish. Target detection was performed using radR, an open-source platform for recording and processing radar data, which was used to remove clutter and noise, detect possible targets as blips, and save the blip information. The tracking algorithm, developed independently of radR, is based on estimation and data association: estimation is performed using a Sequential Importance Sampling-based Particle Filter (SIS-PF), and data association uses the Nearest Neighbors (NN) method. The data fusion was performed in a heterogeneous, dissimilar sensory environment, which is challenging and required substantial effort in both experimental setup and algorithmic development. Setting up the experiments included purchasing and installing the equipment, configuration, and control parameter setting.
The algorithmic development included developing algorithms and selecting the best available technique for this specific application, considering trade-offs of time, accuracy, and cost. Data fusion of the acoustics/IR/radar is a hierarchical model with two levels. Level 1 is a homogeneous dissimilar fusion based on feature-level fusion: it is applied to the IR and radar data and combines the features of detected/tracked targets into a composite feature vector, an end-to-end concatenation of the individual sensors' feature vectors, which serves as input to the next level. Level 2 is a heterogeneous, decision-level fusion, which takes the feature vector from Level 1 and fuses it with the acoustic data. The fusion was developed based on a number of fusion functions. Data alignment, including temporal and spatial alignment, and target association were implemented. A fuzzy Bayesian fusion technique was developed for decision-level fusion: the fuzzy inference system provides the prior probability, and Bayesian inference provides the posterior probability of the avian targets. The resulting data fusion was used to process the spring and fall 2011 migration periods in the western basin of Lake Erie in Ohio. This area lies in the prevailing wind and is a putative site for wind turbine construction; it is also a stopover for migrant birds and bats, whose habitats and lives may be threatened by the presence of wind turbines. The aim of this project is to provide an understanding of the activity and behavior of the biological targets by combining three different sensors and to provide detailed and reliable information. This work can be extended to other applications in the military, industry, medicine, traffic control, etc.
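One of the IR processing steps named above, running-average background subtraction, can be sketched briefly. The learning rate `alpha`, the difference threshold, and the toy frames are illustrative values, not those used in the dissertation.

```python
import numpy as np

def running_average_detect(frames, alpha=0.05, thresh=25):
    """Running-average background subtraction: maintain an exponentially
    updated background model and flag pixels that differ from it."""
    bg = frames[0].astype(float)
    masks = []
    for f in frames[1:]:
        diff = np.abs(f.astype(float) - bg)
        masks.append(diff > thresh)           # moving-object mask
        bg = (1 - alpha) * bg + alpha * f     # update the background model
    return masks

# Toy sequence: static dark background, one bright "target" appears in frame 2
frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(3)]
frames[2][1, 1] = 200
masks = running_average_detect(frames)
```

In the full pipeline, such masks would then be binarized (e.g., with an Otsu-style threshold) and cleaned with morphological filtering before tracking, as the abstract describes.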

Committee:

Mohsin Jamali, Dr. (Committee Chair); Jackson Carvalho, Dr. (Committee Member); Mohammed Niamat, Dr. (Committee Member); Richard Molyet, Dr. (Committee Member); Mehdi Pourazady, Dr. (Committee Member)

Subjects:

Biology; Computer Engineering; Computer Science; Ecology; Electrical Engineering; Energy; Engineering

Keywords:

Acoustics; Evolutionary Neural Network; Infrared Camera; Radar; Data Fusion; Clustering; Classification; Supervised Learning; Unsupervised Learning; Feature Extraction; Bat Echolocation Call; Wind Turbine; Bird Mortality; Fuzzy; Bayesian; Detection; Identification

Han, Kun. Supervised Speech Separation And Processing
Doctor of Philosophy, The Ohio State University, 2014, Computer Science and Engineering
In real-world environments, speech often occurs simultaneously with acoustic interference, such as background noise or reverberation. The interference usually leads to adverse effects on speech perception, and results in performance degradation in many speech applications, including automatic speech recognition and speaker identification. Monaural speech separation and processing aim to separate or analyze speech from interference based on only one recording. Although significant progress has been made on this problem, it remains a widely recognized challenge. Unlike traditional signal processing, this dissertation addresses the speech separation and processing problems using machine learning techniques. We first propose a classification approach to estimate the ideal binary mask (IBM), which is considered a main goal of sound separation in computational auditory scene analysis (CASA). We employ support vector machines (SVMs) to classify time-frequency (T-F) units as either target-dominant or interference-dominant. A rethresholding method is incorporated to improve classification results and maximize the hit minus false-alarm rate. Systematic evaluations show that the proposed approach produces accurate estimated IBMs. In a supervised learning framework, the issue of generalization to conditions different from those in training is very important. We then present methods that require only a small training corpus and can generalize to unseen conditions. The system utilizes SVMs to learn classification cues and then employs a rethresholding technique to estimate the IBM. A distribution fitting method is introduced to generalize to unseen signal-to-noise ratio conditions, and voice activity detection based adaptation is used to generalize to unseen noise conditions. In addition, we propose to use a novel metric learning method to learn invariant speech features in the kernel space.
The learned features encode speech-related information and can generalize to unseen noise conditions. Experiments show that the proposed approaches produce high-quality IBM estimates under unseen conditions. Besides background noise, room reverberation is another major source of signal degradation in real environments. Reverberation, when combined with background noise, is particularly disruptive for speech perception and many applications. We perform dereverberation and denoising using supervised learning. A deep neural network (DNN) is trained to directly learn a spectral mapping from the spectrogram of corrupted speech to that of clean speech. The spectral mapping approach substantially attenuates the distortion caused by reverberation and background noise, leading to improvements in predicted speech intelligibility and quality scores, as well as in speech recognition rates. Pitch is one of the most important characteristics of speech signals. Although pitch tracking has been studied for decades, it is still challenging to estimate pitch from speech in the presence of strong noise. We estimate pitch using supervised learning, where probabilistic pitch states are directly learned from noisy speech data. We investigate two alternative neural networks for modeling the pitch state distribution given observations: a feedforward DNN and a recurrent deep neural network (RNN). Both produce accurate probabilistic outputs of pitch states, which are then connected into pitch contours by Viterbi decoding. Experiments show that the proposed algorithms are robust to different noise conditions.
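The rethresholding idea mentioned above can be sketched on raw classifier scores: sweep candidate thresholds and keep the one that maximizes hit minus false-alarm rate on labeled development data. The toy scores and labels below are illustrative, not SVM outputs on real T-F units.

```python
import numpy as np

def rethreshold(scores, labels):
    """Pick the score threshold maximizing HIT - FA, where HIT is the
    fraction of true target units kept and FA the fraction of
    non-target units wrongly kept."""
    best_t, best_obj = 0.0, -1.0
    for t in np.unique(scores):
        pred = scores >= t
        hit = (pred & labels).sum() / max(labels.sum(), 1)
        fa = (pred & ~labels).sum() / max((~labels).sum(), 1)
        if hit - fa > best_obj:
            best_obj, best_t = hit - fa, t
    return best_t

scores = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 0.9])   # toy classifier scores
labels = np.array([False, False, False, True, True, True])
t = rethreshold(scores, labels)
```

HIT - FA is favoured here because it correlates with intelligibility gains better than plain accuracy, which is why a default decision threshold is not simply reused.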

Committee:

DeLiang Wang (Advisor); Eric Fosler-Lussier (Committee Member); Mikhail Belkin (Committee Member)

Subjects:

Computer Science

Keywords:

Supervised learning; Speech separation; Speech processing; Machine learning; Deep Learning; Pitch estimation; Speech Dereverberation; Deep neural networks; Support vector machines

Vance, Danny W. An All-Attributes Approach to Supervised Learning
PhD, University of Cincinnati, 2006, Engineering : Computer Science and Engineering
The objective of supervised learning is to estimate unknowns based on labeled training samples. For example, one may have aerial spectrographic readings for a large field planted in corn. Based on spectrographic observation, one would like to determine whether the plants in part of the field are weeds or corn. Since the unknown to be estimated is categorical or discrete, the problem is one of classification. If the unknown to be estimated is continuous, the problem is one of regression or numerical estimation. For example, one may have samples of ozone levels from certain points in the atmosphere. Based on those samples, one would like to estimate the ozone level at other points in the atmosphere. Algorithms for supervised learning are useful tools in many areas of agriculture, medicine, and engineering, including estimation of proper levels of nutrients for cows, prediction of malignant cancer, document analysis, and speech recognition. A few general references on supervised learning include [1], [2], [3], and [4]. Two recent reviews of the supervised learning literature are [5] and [6]. In general, univariate decision tree algorithms have been particularly successful in classification problems, but they can suffer from several fundamental difficulties, e.g., "a representational limitation of univariate decision trees: the orthogonal splits to the feature's axis of the sample space that univariate tree rely on" [8], and overfitting [17]. In this thesis, we present a procedure for supervised classification that consists of a new univariate decision tree algorithm (the Margin Algorithm) and two other related algorithms (the Hyperplane and Box Algorithms). The full procedure, called the Paired Planes Classification Procedure, overcomes the usual limitations of univariate decision trees. The Paired Planes Classification Procedure is compared to Support Vector Machines, K-Nearest Neighbors, and decision trees.
The Hyperplane Algorithm allows direct user input as to the acceptable error for each class, in contrast to the indirect input (through a slack variable) of Support Vector Machines. Results on theoretical and real-life datasets are shown. Experiments on real-life datasets show that error rates are in some circumstances lower than those of these supervised learning algorithms, while the procedure is usually computationally less expensive by an order of magnitude (or more).
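The quoted orthogonal-split limitation can be made concrete: an exhaustive search over axis-parallel splits cannot separate XOR-labeled points. The brute-force search and misclassification scoring below are an illustrative sketch, not the thesis's algorithms.

```python
def best_univariate_split(points, labels):
    """Exhaustively search axis-parallel splits (the orthogonal splits
    univariate decision trees rely on) and return the one with the
    fewest misclassifications."""
    best = (None, None, len(labels) + 1)           # (axis, threshold, errors)
    for axis in range(len(points[0])):
        for t in sorted({p[axis] for p in points}):
            pred = [p[axis] >= t for p in points]
            err = sum(pl != l for pl, l in zip(pred, labels))
            err = min(err, len(labels) - err)       # either side may be positive
            if err < best[2]:
                best = (axis, t, err)
    return best

pts = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor = [False, True, True, False]                    # XOR labeling
axis, t, err = best_univariate_split(pts, xor)      # no single split is perfect
```

Any single axis-parallel split leaves at least two of the four XOR points misclassified, which is the representational gap that oblique or paired-plane methods aim to close.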

Committee:

Dr. Anca Ralescu (Advisor)

Keywords:

machine learning; supervised learning; support vector machine; k-nearest neighbors; decision tree; SVM; KNN; CART; C4.5

Samuel, Nikhil J. Identification of Uniform Class Regions using Perceptron Training
MS, University of Cincinnati, 2015, Engineering and Applied Science: Computer Engineering
Several classification algorithms exist and are used in different applications to separate and accurately identify various input objects. Most classifiers separate datasets by minimizing the misclassification on both sides of the classifier; in our approach, to maximize the purity of classification on one side of the classifier, we minimize the misclassification on that side only. These classifiers rely on learning algorithms, which are of two types: unsupervised and supervised. Supervised learning is further divided into algorithms that use reinforcement learning and algorithms that use error correction. Supervised learning with error correction more often than not reaches the optimal solution faster and in fewer iterations than the reinforcement learning approach. In this work, we propose a novel approach to identify uniform class regions using a linear classifier. We use supervised learning with error correction to accurately classify and identify uniform class regions in various datasets. The perceptron corrective learning algorithm, in conjunction with the pocket algorithm, is used to classify the selected datasets. We introduce a factor, the relative weight of error correction, applied while updating the weight vector for a misclassified instance, to identify uniform class regions in the dataset. This parameter is varied between 0.01 and 1.0 to identify the point at which the classifier is able to accurately separate a pure region of data points from the selected dataset. We are able to achieve 100% precision in classification and identification of a pure region of points within the selected dataset when such a region exists.
We provide a detailed analysis of our results on these non-linearly separable datasets and confirm that our proposed method is able to accurately identify uniform class regions on various datasets.
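The perceptron-plus-pocket combination can be sketched as follows. This is the standard pocket algorithm on a toy AND dataset; the thesis's relative-weight factor on the correction step is deliberately omitted here.

```python
def pocket_perceptron(X, y, epochs=100):
    """Perceptron error correction with a 'pocket': keep the weight
    vector that has made the fewest training errors seen so far."""
    w = [0.0] * (len(X[0]) + 1)                     # weights plus bias term

    def errors(wv):
        return sum(((sum(wi * xi for wi, xi in zip(wv, x + (1.0,))) >= 0) != t)
                   for x, t in zip(X, y))

    pocket, pocket_err = w[:], errors(w)
    for _ in range(epochs):
        for x, t in zip(X, y):
            if (sum(wi * xi for wi, xi in zip(w, x + (1.0,))) >= 0) != t:
                sign = 1.0 if t else -1.0
                w = [wi + sign * xi for wi, xi in zip(w, x + (1.0,))]
                e = errors(w)
                if e < pocket_err:                  # pocket the best weights
                    pocket, pocket_err = w[:], e
    return pocket, pocket_err

X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [False, False, False, True]                     # AND is linearly separable
w, err = pocket_perceptron(X, y)
```

On non-linearly separable data the plain perceptron never settles, so the pocket's best-so-far weights are what get reported; scaling the correction step, as the thesis proposes, biases that search toward purity on one side.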

Committee:

Raj Bhatnagar, Ph.D. (Committee Chair); Nan Niu, Ph.D. (Committee Member); Paul Talaga, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

Perceptron Training;Uniform Class Regions;Classification;Supervised Learning;Error Correction;Linearly non-separable datasets