Skip to Main Content
 

Global Search Box

 
 
 
 

Files

ETD Abstract Container

Abstract Header

Efficient Algorithms to Compute Topological Entities

Abstract Details

2021, Doctor of Philosophy, Ohio State University, Computer Science and Engineering.
Topological Data Analysis is a rapidly growing field to extract and analyze useful topological features from especially difficult datasets. Topological features, with their wide application, provide us a richer and more robust view of shapes compared to geometric features. During recent decades, the problem of computing the topological entities of a space has drawn much attention. In this thesis, I present my focused research on two topological entities: minimal homology basis of simplicial complexes and persistent homology of directed graphs. In the first part of this thesis, we focus on efficient computation of shortest cycles which form a homology basis under $\mathbb{Z}_2$-additions in a given simplicial complex $\mathcal{K}$. This problem has been researched actively in recent years. The complexity of these algorithms depends on the rank $g$ of the one-dimensional homology group of $\mathcal{K}$. This rank $g$ has a lower bound of $\Theta(n)$, where $n$ denotes the number of simplices in $\mathcal{K}$. When $g=\Theta(n)$, the current best algorithm, as far as we know, gives an $O(n^{\omega+1})$ worst-case time complexity where $\omega<2.373$ is the fast matrix multiplication exponent. In this work, we improved this time complexity. Combining the idea of annotations from \cite{annotation} and the divide and conquer technique from \cite{DivideConquer}, we present an algorithm that runs in $O(n^\omega+n^2g)$ time giving the first $O(n^3)$ worst-case algorithm for general complexes. If instead of minimal basis, we settle for approximate basis, we can improve the running time even further. We show that a $2$-approximation to the minimal basis can be computed in $O(n^{\omega}\sqrt{n\log n})$ worst-case time. We also study more general measures for defining the minimal basis and identify reasonable conditions on these measures that allow computing a minimal basis efficiently. We extend our approach to higher dimensional cycles where we improve the running time for computing a minimum homology basis from $O(\beta_dn^4)$ to $O(n^{\omega+1})$ under a size function defined in \cite{chen2010measuring} where $\beta_d$ is the rank of $d$-dimensional homology group. The second part of this thesis concerns persistent homology of directed networks, including two works. While the concepts of homology groups and their persistence are well suited for spaces that are not directed, path-homology, developed for directed graphs by Grigor’yan, Lin, Muranov and Yau, has been effectively adapted to its persistence by Chowdhury and M{\'e}moli recently. They also give an algorithm to compute the path-homology and the persistent path-homology. Our main contribution in the first work of this part is an algorithm that computes this path-homology and its persistence more efficiently for the $1$-dimensional ($H_1$) case. In developing such an algorithm, we discover various structures and their efficient computations that aid computing the $1$-dimensional path-homnology. We implement our algorithm and present some preliminary experimental results. In the second work of this part, we adopt persistent homology of directed graphs to network models which play an essential role in analyzing network data in many areas as they capture extensive dependencies among nodes (actors). Given a network model, a challenge that ensues involves measuring the fitness of the model to its observed network. For directed networks, existing approaches often ignore the global structure behind the network, partly due to the inability of encoding such information. To fill this gap, in this work, we introduce a fitness measure based on persistent homology which can summarize global topological features behind complex data. Specifically, given a network model and an observed network, we develop a new fitness score of the network model, leveraging any persistent homology for directed networks. For this we measure the improvement of the model over a null model w.r.t. a persistence-based distance between networks sampled from the model and the observed network. We provide empirical evidence to demonstrate the effectiveness of our new topological fitness measure.
Rephael Wenger (Advisor)
Yusu Wang (Committee Member)
Tamal Dey (Committee Member)
Pooya Hatami (Committee Member)
109 p.

Recommended Citations

Citations

  • Li, T. (2021). Efficient Algorithms to Compute Topological Entities [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1608256383277208

    APA Style (7th edition)

  • Li, Tianqi. Efficient Algorithms to Compute Topological Entities. 2021. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1608256383277208.

    MLA Style (8th edition)

  • Li, Tianqi. "Efficient Algorithms to Compute Topological Entities." Doctoral dissertation, Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1608256383277208

    Chicago Manual of Style (17th edition)