Doctor of Philosophy, The Ohio State University, 2015, Computer Science and Engineering
Making sense of, analyzing, and extracting useful information from large and complex data is a grand challenge. A user tasked with meeting this challenge is often befuddled with questions on where and how to begin to understand the relevant characteristics of such data. Recent advances in relational analytics, in particular network analytics, offer key tools for insight into connectivity structure and relationships at both local ("guilt by association") and global (clustering and pattern matching) levels. These tools form the basis of recommender systems, ranking, and learning algorithms of great importance to research and industry alike.
However, complex data rarely originate in a format suitable for network analytics, and the transformation of large and typically high-dimensional non-network data to a network is rife with parameterization challenges, as an under- or over-connected network will lead to poor subsequent analysis. Additionally, both network formation and subsequent network analytics become very computationally expensive as network size increases, especially if multiple networks with different connectivity levels are formed in the previous step; scalable approximate solutions are thus a necessity.
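The parameterization problem described above can be illustrated with a minimal sketch. It is not the method of this dissertation: the cosine similarity measure, the toy points, and the helper names `threshold_graph` and `count_components` are all assumptions chosen for illustration. The exhaustive all-pairs comparison is quadratic in the number of points, which is the kind of cost that motivates approximate solutions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def threshold_graph(points, threshold):
    """Hypothetical helper: edges (i, j) with cosine similarity >= threshold.

    Compares every pair, so the cost grows quadratically with len(points).
    """
    return [
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if cosine(points[i], points[j]) >= threshold
    ]


def count_components(n, edges):
    """Hypothetical helper: number of connected components via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})


# Toy data (an assumption, not from the dissertation).
points = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (0.7, 0.7)]

for t in (0.5, 0.9, 0.999):
    edges = threshold_graph(points, t)
    print(t, len(edges), count_components(len(points), edges))
```

On this toy input, the lowest threshold links all five points into a single component (over-connected), while the highest leaves every point isolated (under-connected); only the middle threshold exposes any grouping.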
I present an interactive system called PLASMA-HD to address these challenges. PLASMA-HD builds on recent progress in the fields of locality sensitive hashing, knowledge caching, and graph visualization to provide users with the capability to probe and interrogate the intrinsic structure of data. For an arbitrary dataset (vector, structural, or mixed), and given a similarity or distance measure of interest, PLASMA-HD enables an end user to interactively explore the intrinsic connectivity or clusterability of a dataset under different threshold criteria. PLASMA-HD employs and enhances the recently proposed Bayesian Locality Sensitive Hashing (BayesLSH) to efficiently estimate connectivity structure among entities. Unlike previous efforts which operate at ...
Committee: Srinivasan Parthasarathy (Advisor); Arnab Nandi (Committee Member); P Sadayappan (Committee Member); Michael Barton (Committee Member)
Subjects: Computer Science