PHD, Kent State University, 2012, College of Arts and Sciences / Department of Computer Science
This dissertation focuses on developing novel techniques to help understand, analyze, and query large graphs by utilizing backbone structures. Network backbone depicts a core substructure which not only offers concise and highlighted topological view of original graph, but also carries a large amount of essential information, such as information flow. In this dissertation, we introduce a novel backbone structure “gate graph” to provide a simplified topological view of original graph while preserving its distance measure. A graph simplification algorithm based on set cover framework is proposed to extract gate graph. We show theoretically and empirically that both approaches considerably reduce the complexity of original graph in terms of the number of vertices.
In addition, we also examine how to efficiently answer reachability query and shortest path (distance) query on large graphs, by utilizing tailored backbone structures: 1) first, to address scalability challenge in reachability query answering on massive graphs, we extract a “reachability backbone” and then leverage it to help scale the existing reachability indexing algorithms and speed up online query processing approaches; 2) secondly, to efficiently answer distance queries on large directed graphs, we present a novel labeling scheme, referred to as Highway-Centric Labeling, utilizing a directed spanning tree as highway structure. We build an interesting connection between our labeling problem and Bipartite Set Cover problem and propose an elegant algorithm with non-trivial logarithmic bound to solve it; 3) finally, we introduce a novel Hub2-Labeling approach to compute the exact shortest path between two vertices with distance no more than 6 in large social networks. Our approach significantly extends the existing landmark approach by utilizing a large number of hubs to help estimate the distance and to help reduce the search space. We demonstrate these approaches using synthetic datasets and real-world (open full item for complete abstract)
Committee: Ruoming Jin Dr. (Advisor); Feodor Dragan Dr. (Committee Member); Jonathan Maletic Dr. (Committee Member); Robin Selinger Dr. (Committee Member); Almut Schroeder Dr. (Committee Member)
Subjects: Computer Science