Files
21001.pdf (5.3 MB)
Multi-Domain Clustering using the A* Search
Author Info
Gurram, Abhinav
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470671379
Year and Degree
2016, MS, University of Cincinnati, Engineering and Applied Science: Computer Science.
Abstract
Identifying interesting bi-clusters in real-valued datasets is a computationally hard problem that does not scale easily as datasets grow, and emerging data mining problems involve datasets of increasing size and complexity. Finding interesting bi-clusters in a dataset requires, in principle, examining all possible subsets of rows and columns and selecting the bi-clusters that meet some interestingness criterion. When a property of bi-clusters is monotonic or anti-monotonic in the sizes of the row and column subsets that form them, Apriori-style pruning can speed up the search. Such monotonic properties, however, are not readily available for most mining tasks. Another useful avenue for pruning the search is to require that bi-clusters be consistent with the data in a second dataset: as a search algorithm examines bi-cluster hypotheses, their merit value is influenced by their consistency with the second dataset. We have developed and tested one such heuristic-search-based 3-clustering algorithm, and this thesis presents its details. We have demonstrated that a number of different heuristics can be used to identify clusters with different types of properties, especially when these properties are derived from two datasets that store different types of information about the same set of row objects. The performance of the algorithm and the quality of the discovered clusters have been studied in detail, and the results are reported. Our conclusion is that search-based algorithms are applicable to identifying interesting bi-clusters and 3-clusters when multiple related datasets are available.
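The idea of scoring bi-cluster hypotheses against a second dataset can be illustrated with a small sketch. The code below is not the algorithm from the thesis, whose details the abstract does not give. It is a plain best-first search (a simpler relative of A*, with no admissible heuristic) under several assumptions made here for illustration: a 3-cluster is taken to be a row set plus one column set from each dataset, "coherence" means all rows lie within `tol` of each other on a column, and the merit is the bi-cluster area in the first dataset weighted by the fraction of coherent columns in the second. The names `coherent_cols`, `score`, and `best_first_3cluster` are hypothetical.

```python
import heapq

def coherent_cols(data, rows, tol):
    """Columns on which all given rows lie within `tol` of each other."""
    return frozenset(
        c for c in range(len(data[0]))
        if max(data[r][c] for r in rows) - min(data[r][c] for r in rows) <= tol
    )

def score(rows, A, B, tol):
    """Assumed merit: area in A, weighted by consistency in B."""
    cols_a = coherent_cols(A, rows, tol)
    cols_b = coherent_cols(B, rows, tol)
    consistency = len(cols_b) / len(B[0])
    return len(rows) * len(cols_a) * consistency, cols_a, cols_b

def best_first_3cluster(A, B, tol=0.5, max_expansions=1000):
    """Best-first search over row subsets; returns (merit, rows, cols_a, cols_b)."""
    n = len(A)
    # Max-heap via negated merit; row tuples break ties deterministically.
    heap = [(-score((r,), A, B, tol)[0], (r,)) for r in range(n)]
    heapq.heapify(heap)
    best = (0.0, (), frozenset(), frozenset())
    expansions = 0
    while heap and expansions < max_expansions:
        _, rows = heapq.heappop(heap)
        expansions += 1
        merit, cols_a, cols_b = score(rows, A, B, tol)
        if len(rows) >= 2 and merit > best[0]:
            best = (merit, rows, cols_a, cols_b)
        # Grow the row set only with larger indices to avoid duplicate states.
        for r in range(rows[-1] + 1, n):
            child = rows + (r,)
            child_merit, child_a, child_b = score(child, A, B, tol)
            # Coherent column sets can only shrink as rows are added
            # (anti-monotone), so an empty set prunes the whole branch.
            if not child_a or not child_b:
                continue
            heapq.heappush(heap, (-child_merit, child))
    return best

# Toy data (invented for this example): rows 0-2 agree on columns 0 and 1
# of A and on column 0 of B; row 3 agrees with none of them.
A = [[1.0, 5.0, 9.0],
     [1.1, 5.2, 2.0],
     [0.9, 4.9, 7.0],
     [6.0, 0.0, 3.0]]
B = [[10.0, 3.0],
     [10.2, 8.0],
     [9.9, 5.0],
     [1.0, 4.0]]

result = best_first_3cluster(A, B, tol=0.5)
print(result)
```

On this toy input the search returns rows 0 to 2 with columns 0 and 1 of `A` and column 0 of `B`. The pruning step works only because the coherence test chosen here is anti-monotone in the row set; as the abstract notes, most interestingness measures lack such a property, which is where heuristic guidance becomes necessary.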
Committee
Raj Bhatnagar (Committee Chair)
Yizong Cheng (Committee Member)
Paul Talaga (Committee Member)
Pages
133 p.
Subject Headings
Engineering
Keywords
clustering; bi-clustering; 3-clusters; heuristic search; A star search; multi-domain clustering
Recommended Citations
Gurram, A. (2016). Multi-Domain Clustering using the A* Search [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470671379
APA Style (7th edition)
Gurram, Abhinav. Multi-Domain Clustering using the A* Search. 2016. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470671379.
MLA Style (8th edition)
Gurram, Abhinav. "Multi-Domain Clustering using the A* Search." Master's thesis, University of Cincinnati, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470671379
Chicago Manual of Style (17th edition)
Document number:
ucin1470671379
Download Count:
313
Copyright Info
© 2016, all rights reserved.
This open access ETD is published by University of Cincinnati and OhioLINK.