Files
Thesis BR Final.pdf (1.44 MB)
Using Genetic Algorithms for Feature Set Selection in Text Mining
Author Info
Rogers, Benjamin Charles
Permalink: http://rave.ohiolink.edu/etdc/view?acc_num=miami1389811705
Year and Degree
2014, Master of Science, Miami University, Computer Science and Software Engineering.
Abstract
The rationale behind design decisions is often recorded across a project's documentation. One way to extract this rationale is text mining, that is, data mining over natural-language documents. The performance of a text mining system depends on many factors, including the feature sets used. Exhaustively searching for the optimal combination of feature sets is rarely feasible, so researchers often guess which combinations to use. Here, a genetic algorithm is used to find optimal combinations of feature sets for binary rationale, the argumentation subset, the arguments-all subset, decisions, and alternatives. The genetic algorithm uses GATE, WEKA, and a pipeline that automatically passes information from one to the other. This pipeline is also usable in other text mining contexts. Compared with random selection, the genetic algorithm produced medium-sized feature sets that tended to prefer unigrams and bigrams over 4-grams and 5-grams.
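The search described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over bitmasks, where each bit switches one feature set on or off. This is not the thesis implementation: the feature-set names, the weights in `toy_fitness`, and every GA parameter (population size, mutation rate, tournament selection, elitism) are assumptions made for illustration. In the thesis, fitness would come from running the GATE and WEKA pipeline; here a made-up scoring function stands in for it.

```python
import random

# Hypothetical feature-set names; the abstract mentions n-grams but not the full list.
FEATURE_SETS = ["unigrams", "bigrams", "trigrams", "4-grams", "5-grams", "pos_tags"]


def toy_fitness(mask):
    """Stand-in for a classifier score; the weights are made up, not measured."""
    weights = [0.30, 0.25, 0.10, -0.05, -0.10, 0.15]
    score = sum(w for w, bit in zip(weights, mask) if bit)
    return score - 0.02 * sum(mask)  # small penalty for larger subsets


def tournament(pop, fitness, rng):
    """Pick two distinct individuals at random and return the fitter one."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def evolve(fitness, n_bits, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Return the best bitmask found; a seeded RNG keeps runs reproducible."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the current best forward unchanged
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            cut = rng.randint(1, n_bits - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


if __name__ == "__main__":
    best = evolve(toy_fitness, len(FEATURE_SETS))
    selected = [name for name, bit in zip(FEATURE_SETS, best) if bit]
    print(selected, round(toy_fitness(best), 3))
```

In a real setting, `toy_fitness` would be replaced by a function that builds the selected feature sets, trains a classifier, and returns an evaluation score. That evaluation is typically the expensive step, which is why exhaustive search over all combinations is rarely feasible.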
Committee
Janet Burge, PhD (Advisor)
Dhananjai Rao, PhD (Committee Member)
Michael Zmuda, PhD (Committee Member)
Pages
69 p.
Subject Headings
Artificial Intelligence; Computer Engineering; Computer Science; Information Science
Keywords
genetic algorithms; feature set selection; text mining; design rationale; GATE; WEKA; pipeline
Citations
Rogers, B. C. (2014). Using Genetic Algorithms for Feature Set Selection in Text Mining [Master's thesis, Miami University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=miami1389811705
APA Style (7th edition)
Rogers, Benjamin. Using Genetic Algorithms for Feature Set Selection in Text Mining. 2014. Miami University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=miami1389811705.
MLA Style (8th edition)
Rogers, Benjamin. "Using Genetic Algorithms for Feature Set Selection in Text Mining." Master's thesis, Miami University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=miami1389811705
Chicago Manual of Style (17th edition)
Document number: miami1389811705
Download Count: 1,874
Copyright Info
© 2013, all rights reserved.
This open access ETD is published by Miami University and OhioLINK.