A Framework for Sampling Pattern Occurrences in a Huge Graph

Degree
Master of Sciences, Case Western Reserve University, EECS - Computer and Information Sciences.
Abstract
In many applications, such as computational biology, software engineering, and social networks, large amounts of data can be represented as huge graphs. Discovering occurrences of small patterns in these graphs is an important task. The number of pattern occurrences can be very large, which leads to two potential problems: 1) the execution time required to find all occurrences may be very long; 2) it may be very time-consuming for end users to process the discovered occurrences. Moreover, many applications do not require the discovery of all occurrences; a random sample is sufficient. In this paper, we propose the SALTY framework, which finds random samples according to four different definitions of "randomness". It not only reduces the execution time significantly but also produces results that closely represent the distribution of all occurrences. Finally, real and synthetic data sets are used to demonstrate the effectiveness and efficiency of the SALTY framework.
Subject Headings
Computer science
Keywords
Graph; subgraph matching; occurrence estimation; occurrence sampling
Committee / Advisors
Jiong Yang (Committee Chair)
Andy Podgurski (Committee Member)
Soumya Ray (Committee Member)

Document number: case1269979693

This ETD has been downloaded 143 times (through March 2013)