Master of Science (MS), Ohio University, 2007, Electrical Engineering & Computer Science (Engineering and Technology)
The intent of this thesis is to improve on existing algorithms for determining classification rules by reducing the computational time to generate the reducts of an information system. Determining all reducts is an NP (Non-deterministic Polynomial time) complete problem and, therefore, as the data set grows in size, the time required for computation rapidly exceeds what is practical. This thesis has been able to significantly reduce the amount of time it takes to perform these computations. While the problem is still NP complete, the amount of time required by the methods introduced is less than other well-known methods provided by other software packages such as Rosetta [Ohr99] and RSES [RSES2]. Despite the reduct generation time improvements, larger databases still take far too long for effective reduct determination; therefore, heuristic non-exhaustive methods were also evaluated. In practical applications of rough sets, it is important that the obtained reducts retain most of the information about the original problem. In these applications, reducts of a dataset are used as classifiers to determine the “rules” for classification. The second half of this thesis proposes a method for rapidly producing effective classifiers of sufficient quality to get classification results of equal or better quality compared to exhaustive methods. The proposed method gives results that are at, or near, the same quality as those obtained from using the exhaustive method in only a fraction of the computational time.
Committee: Janusz Starzyk (Advisor)
Subjects: