Multi-Domain Clustering using the A* Search

Gurram, Abhinav

2016, MS, University of Cincinnati, Engineering and Applied Science: Computer Science.
Identifying interesting bi-clusters in real-valued datasets is a computationally hard problem that does not scale easily as datasets grow, yet most emerging data mining problems involve datasets of increasing size and complexity. To find interesting bi-clusters in a dataset, we need to examine all possible subsets of rows and columns and determine which bi-clusters meet some interestingness criteria. When bi-clusters have monotonic or anti-monotonic properties, that is, properties that increase or decrease with the sizes of the subsets forming the bi-clusters, Apriori-style pruning can be used to speed up the search. But such monotonic properties are not readily available for most mining tasks. Another useful avenue for pruning the search is to require the bi-clusters to be consistent with the data residing in a second dataset. As a search algorithm examines hypotheses for bi-clusters, their merit value is influenced by their consistency with the data in the second dataset. We have developed and tested one such heuristic-search-based 3-clustering algorithm, and its details are presented in this thesis. We have successfully demonstrated that a number of different heuristics can be used to identify clusters having different types of properties, especially when these properties are derived from two datasets storing different types of information about the same sets of row objects. The performance of the algorithm and the quality of the clusters discovered have been studied in detail, and the results are reported. Our conclusion is that search-based algorithms are applicable to the identification of interesting bi-clusters and 3-clusters in settings with multiple related datasets.
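The Apriori-style pruning mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm; it only shows the general idea under assumed definitions. Here a bi-cluster is a set of columns together with the rows whose values across those columns differ by at most delta, and it counts as interesting when it has at least min_rows rows. That row count is anti-monotonic (adding a column can only remove rows), so any column set that fails the threshold rules out all of its supersets. The function names, the delta criterion, and the toy data are all hypothetical.

```python
from itertools import combinations


def supporting_rows(data, cols, delta):
    """Return indices of rows whose values across `cols` differ by at most `delta`."""
    rows = set()
    for i, row in enumerate(data):
        vals = [row[c] for c in cols]
        if max(vals) - min(vals) <= delta:
            rows.add(i)
    return frozenset(rows)


def apriori_biclusters(data, delta, min_rows):
    """Level-wise search over column subsets with Apriori-style pruning.

    Assumed (hypothetical) interestingness criterion: at least `min_rows`
    rows are coherent (range <= delta) across the chosen columns.
    Returns a dict mapping column tuples (size >= 2) to their row sets.
    """
    if not data or len(data) < min_rows:
        return {}
    n_cols = len(data[0])
    all_rows = frozenset(range(len(data)))
    # Level 1: every single column is trivially coherent for every row.
    level = {(c,): all_rows for c in range(n_cols)}
    results = {}
    while level:
        next_level = {}
        keys = sorted(level)
        for a, b in combinations(keys, 2):
            # Join two column sets that share all but their last column.
            if a[:-1] != b[:-1]:
                continue
            cand = a + (b[-1],)
            # Apriori check: every subset one column smaller must have survived.
            if any(sub not in level for sub in combinations(cand, len(cand) - 1)):
                continue
            rows = supporting_rows(data, cand, delta)
            if len(rows) >= min_rows:
                next_level[cand] = rows
        results.update(next_level)
        level = next_level
    return results


# Toy data: 4 row objects, 3 real-valued columns (made up for illustration).
toy = [
    [1.0, 1.1, 5.0],
    [2.0, 2.1, 9.0],
    [3.0, 3.2, 3.1],
    [4.0, 8.0, 4.1],
]
found = apriori_biclusters(toy, delta=0.25, min_rows=2)
for cols, rows in sorted(found.items()):
    print(cols, sorted(rows))
```

On this toy data the column pair (1, 2) is supported by only one row, so the three-column candidate (0, 1, 2) is discarded without ever being scored. As the abstract notes, most mining tasks lack such a property, which is the motivation for guiding the search with heuristics and a second dataset instead.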
Raj Bhatnagar (Committee Chair)
Yizong Cheng (Committee Member)
Paul Talaga (Committee Member)
133 p.

Recommended Citations

  • Gurram, A. (2016). Multi-Domain Clustering using the A* Search [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470671379

    APA Style (7th edition)

  • Gurram, Abhinav. Multi-Domain Clustering using the A* Search. 2016. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470671379.

    MLA Style (8th edition)

  • Gurram, Abhinav. "Multi-Domain Clustering using the A* Search." Master's thesis, University of Cincinnati, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470671379

    Chicago Manual of Style (17th edition)