AN ALL-ATTRIBUTES APPROACH TO SUPERVISED LEARNING
VANCE, DANNY W.

2006, PhD, University of Cincinnati, Engineering : Computer Science and Engineering.
The objective of supervised learning is to estimate unknowns based on labeled training samples. For example, one may have aerial spectrographic readings for a large field planted in corn. Based on spectrographic observation, one would like to determine whether the plants in part of the field are weeds or corn. Since the unknown to be estimated is categorical or discrete, the problem is one of classification. If the unknown to be estimated is continuous, the problem is one of regression or numerical estimation. For example, one may have samples of ozone levels from certain points in the atmosphere. Based on those samples, one would like to estimate the ozone level at other points in the atmosphere. Algorithms for supervised learning are useful tools in many areas of agriculture, medicine, and engineering, including estimation of proper levels of nutrients for cows, prediction of malignant cancer, document analysis, and speech recognition. A few general references on supervised learning include [1], [2], [3], and [4]. Two recent reviews of the supervised learning literature are [5] and [6]. In general, univariate learning tree algorithms have been particularly successful in classification problems, but they can suffer from several fundamental difficulties, e.g., "a representational limitation of univariate decision trees: the orthogonal splits to the feature's axis of the sample space that univariate tree rely on" [8] and overfitting [17]. In this thesis, we present a procedure for supervised classification that consists of a new univariate decision tree algorithm (Margin Algorithm) and two other related algorithms (Hyperplane and Box Algorithms). The full procedure, called the Paired Planes Classification Procedure, overcomes all of the usual limitations of univariate decision trees. The Paired Planes Classification Procedure is compared to Support Vector Machines, K-Nearest Neighbors, and decision trees.
The Hyperplane Algorithm allows direct user input of the acceptable error for each class, in contrast to the indirect input (through a slack variable) used with Support Vector Machines. Results on theoretical and real-life datasets are shown. Experiments on real-life datasets show that error rates are in some circumstances lower than those of these supervised learning algorithms, while the procedure is usually computationally less expensive by an order of magnitude (or more).
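The representational limitation quoted in the abstract (axis-orthogonal splits) can be illustrated with a small sketch. This is not the Margin, Hyperplane, or Box Algorithm from the thesis; it is a toy comparison, under assumed data, between the best single axis-parallel threshold and one oblique linear boundary. The grid, the labels, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Assumed toy data: a 5x5 integer grid whose true class boundary
# is the oblique line x + y = 4 (class 1 strictly above it).
points = list(product(range(5), repeat=2))
labels = [1 if x + y > 4 else 0 for x, y in points]


def error_count(predict):
    """Number of grid points that the given classifier mislabels."""
    return sum(predict(p) != label for p, label in zip(points, labels))


def best_stump_error():
    """Exhaustively search single axis-parallel splits (one attribute,
    one threshold, either polarity) and return the lowest error count."""
    best = len(points)
    for axis in (0, 1):
        for threshold in range(-1, 5):
            for above in (0, 1):
                def predict(p, axis=axis, threshold=threshold, above=above):
                    # Points beyond the threshold get class `above`,
                    # all other points get the opposite class.
                    return above if p[axis] > threshold else 1 - above
                best = min(best, error_count(predict))
    return best


def oblique_predict(p):
    """A single linear boundary that uses both attributes at once."""
    return 1 if p[0] + p[1] - 4 > 0 else 0


stump_err = best_stump_error()
plane_err = error_count(oblique_predict)
print("best axis-parallel split errors:", stump_err)
print("oblique boundary errors:", plane_err)
```

On this assumed data, no single axis-parallel split separates the classes, while the oblique boundary does; a univariate tree would need several stacked splits to approximate the same line.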
Dr. Anca Ralescu (Advisor)
201 p.

Recommended Citations

APA Citation

VANCE, D. (2006). AN ALL-ATTRIBUTES APPROACH TO SUPERVISED LEARNING. (Electronic Thesis or Dissertation). Retrieved from https://etd.ohiolink.edu/

MLA Citation

VANCE, DANNY. "AN ALL-ATTRIBUTES APPROACH TO SUPERVISED LEARNING." Electronic Thesis or Dissertation. University of Cincinnati, 2006. OhioLINK Electronic Theses and Dissertations Center. 01 Apr 2015.

Chicago Citation

VANCE, DANNY. "AN ALL-ATTRIBUTES APPROACH TO SUPERVISED LEARNING." Electronic Thesis or Dissertation. University of Cincinnati, 2006. https://etd.ohiolink.edu/

Files

ucin1162335608.pdf (3.7 MB)