Search Results (Total results: 14)

  • 1. Aqeel, Aya Evidence Based Medical Question Answering System Using Knowledge Graph Paradigm

    Master of Science in Software Engineering, Cleveland State University, 2022, Washkewicz College of Engineering

    Evidence Based Medicine (EBM) is the process of systematically finding, judging, and using research findings as the basis for clinical decisions, and it has become the standard of medical practice. Countless new studies are published daily; keeping track of them all is impossible, let alone reading and comprehending them. While search engines can help healthcare professionals search a topic by suggesting relevant papers, the professionals still need to go through those papers and extract the relevant information themselves. This is a very time-consuming task: one study on the Information Retrieval (IR) practices of healthcare information professionals found that it takes them four hours on average to finish a search task. Moreover, a systematic review of the barriers to medical residents' practice of evidence-based medicine revealed that the most frequently mentioned barriers for residents were limitations in available time and in knowledge and skills. In this project, we address both problems by building a Medical Question Answering (QA) system that employs semi-supervised information extraction methods from Natural Language Processing (NLP) to construct a large-scale Knowledge Graph (KG) of facts extracted from a large repository of medical research publications. The system then translates a user's natural-language question into an efficient query against the KG and extracts relevant evidence-based answers, presented in a user-friendly manner: a compilation of one-sentence summaries, one per piece of evidence relevant to the question, each with a reference to the full publication. The system can help address the barriers of knowledge and skills by providing a comprehensive summary of the evidence for a question posed in natural language, eliminating the need to formulate complex structured queries. The system was evalu (open full item for complete abstract)
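
    A minimal, hypothetical Python sketch of the final answering step may make this concrete: grounding a question against facts in a small in-memory KG and returning one-sentence evidence summaries with references. The Fact class, the toy triples, and answer_question are all invented for illustration; the thesis's extraction pipeline and query translation are far more involved.

        # Hypothetical sketch only; names and data are invented, not the thesis's code.
        from dataclasses import dataclass

        @dataclass
        class Fact:
            subject: str
            relation: str
            obj: str
            summary: str    # one-sentence evidence summary
            reference: str  # pointer to the full publication

        KG = [
            Fact("metformin", "treats", "type 2 diabetes",
                 "Metformin lowered HbA1c over 12 months in a randomized trial.",
                 "Doe et al. 2019"),
            Fact("aspirin", "prevents", "myocardial infarction",
                 "Low-dose aspirin reduced MI incidence in a large cohort.",
                 "Roe et al. 2017"),
        ]

        def answer_question(question: str):
            """Naive keyword grounding of a natural-language question to KG facts."""
            q = question.lower()
            return [(f.summary, f.reference)
                    for f in KG if f.subject in q or f.obj in q]

        print(answer_question("What conditions does metformin treat?"))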

    Committee: Sunnie Chung (Committee Chair); Satish Kumar (Committee Member); Yongjian Fu (Committee Member); Sunnie Chung (Advisor) Subjects: Artificial Intelligence; Biomedical Research; Medicine
  • 2. Yaddanapudi, Suryanarayana Machine Learning Based Drug-Disease Relationship Prediction and Characterization

    PhD, University of Cincinnati, 2019, Engineering and Applied Science: Mechanical Engineering

    Drug repurposing is the process of finding novel uses for approved or failed drugs. Recently, several computational approaches coupled with big-data analytics have made it possible to systematically and rapidly evaluate repurposing opportunities in an automated fashion. While some approaches focus on matching drug and disease gene-expression profiles, others rely on interactions between protein targets and integrate mechanistic relationships, from the molecular to the system level, into computational platforms for discovering drug-repurposing candidates. My thesis work characterizes drug-disease associations in a computational framework through drug-related side effects, shared phenotypes, and gene feature annotations. In this regard, I proposed three approaches. In the first, I built a drug-drug interactome based on side-effect data and predicted drug-disease interactions using graph-based clustering algorithms (Chapter 3). In the second, called PhenoRx, I ranked drug-disease associations by shared phenotypes using cosine similarity with term frequency-inverse document frequency (TF-IDF) weighting and identified novel drug-disease relationships (Chapter 4). Finally, in the third, called FeatuRx, I matched drugs and diseases based on the incidence of shared features such as pathways, biological processes, and phenotypes, and implemented machine learning classifiers to discover potentially novel drug-disease associations (Chapter 5). I validated the three machine-learning-based approaches using different performance metrics, and the results suggest that these approaches can be useful in drug discovery, drug repurposing, and pharmacovigilance studies.
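
    As a rough illustration of the PhenoRx-style ranking (the toy phenotype lists and entity names here are invented; the thesis's corpus and weighting details differ), one can weight shared phenotypes by TF-IDF and rank drug-disease pairs by cosine similarity:

        # Toy TF-IDF + cosine similarity over phenotype annotations.
        import math
        from collections import Counter

        corpus = {   # entity -> list of annotated phenotypes (invented data)
            "drug:levodopa":     ["tremor", "rigidity", "bradykinesia"],
            "disease:parkinson": ["tremor", "rigidity", "bradykinesia",
                                  "postural instability"],
            "disease:asthma":    ["wheezing", "dyspnea"],
        }

        def tfidf(entity):
            tf, n = Counter(corpus[entity]), len(corpus)
            # idf: log of corpus size over the number of entities with the term
            return {t: c * math.log(n / sum(t in p for p in corpus.values()))
                    for t, c in tf.items()}

        def cosine(a, b):
            dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
            na = math.sqrt(sum(v * v for v in a.values()))
            nb = math.sqrt(sum(v * v for v in b.values()))
            return dot / (na * nb) if na and nb else 0.0

        drug = tfidf("drug:levodopa")
        for d in ("disease:parkinson", "disease:asthma"):
            print(d, round(cosine(drug, tfidf(d)), 3))  # parkinson ranks higher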

    Committee: Anil Jegga D.V.M. (Committee Chair); Sam Anand Ph.D. (Committee Member); Samuel Huang Ph.D. (Committee Member); Mayur Sarangdhar Ph.D. (Committee Member); David Thompson Ph.D. (Committee Member) Subjects: Bioinformatics
  • 3. Robbeloth, Michael Recognition of Incomplete Objects based on Synthesis of Views Using a Geometric Based Local-Global Graphs

    Doctor of Philosophy (PhD), Wright State University, 2019, Computer Science and Engineering PhD

    The recognition of single objects is an old research field with many techniques and robust results. The probabilistic recognition of incomplete objects, however, remains an active field with challenging issues associated with shadows, illumination, and other visual characteristics. By object incompleteness we mean missing parts of a known object, not low-resolution images of that object. Employing individual machine-learning methodologies for accurate classification of incomplete objects has not provided a robust answer to this challenging problem. In this dissertation, we present a suite of high-level, model-based computer vision techniques, encompassing both geometric and machine learning approaches, to generate probabilistic matches of objects with varying degrees and forms of non-deformed incompleteness. Recognizing incomplete objects requires building a database of six-sided views (i.e., a model) of each object from which an identification can be made. The images are preprocessed (K-means segmentation and region growing to generate fully defined region and segment information), from which local and global geometric and characteristic properties are derived in a process known as the Local-Global (L-G) Graph method. These properties are stored in a database for matching against sample images featuring various types of missing features; the sample images are characterized in the same manner. A suite of methodologies is then employed to match a sample against an exemplar image in a multithreaded manner. The approaches, which work with the multi-view model database characteristics in parallel (i.e., multithreaded), probabilistically weight the outcomes of the various matching routines. These routines include treating segment border regions as chain codes which are then processed using various string matching algorithms, the matching by center (open full item for complete abstract)
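
    The chain-code matching step can be pictured with a small sketch (illustrative only; the dissertation weighs several string matching routines, and the chain codes and scoring here are assumptions): segment borders become 8-direction Freeman chain-code strings, and a normalized edit distance scores how well a partial border matches an exemplar.

        # Score border similarity via Levenshtein distance over chain codes.
        def edit_distance(a: str, b: str) -> int:
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, 1):
                cur = [i]
                for j, cb in enumerate(b, 1):
                    cur.append(min(prev[j] + 1,          # deletion
                                   cur[j - 1] + 1,       # insertion
                                   prev[j - 1] + (ca != cb)))  # substitution
                prev = cur
            return prev[-1]

        complete  = "0011223344556677"   # full border of the exemplar segment
        partial   = "001122334455"       # same border with a missing corner
        unrelated = "0246024602460246"

        for cand in (partial, unrelated):
            d = edit_distance(complete, cand)
            print(cand, "score:", 1 - d / max(len(complete), len(cand)))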

    Committee: Nikolaos G. Bourbakis Ph.D. (Advisor); Soon M. Chung Ph.D. (Committee Member); Yong Pei Ph.D. (Committee Member); Arnab K. Shaw Ph.D. (Committee Member) Subjects: Computer Science
  • 4. Hong, Changwan Code Optimization on GPUs

    Doctor of Philosophy, The Ohio State University, 2019, Computer Science and Engineering

    Graphics Processing Units (GPUs) have become popular in the last decade due to their high memory bandwidth and powerful computing capacity. Nevertheless, achieving high performance on GPUs is not trivial: it generally requires significant programming expertise and an understanding of the low-level execution mechanisms in GPUs. This dissertation introduces approaches for optimizing regular and irregular applications. To optimize regular applications, it introduces a novel approach to GPU kernel optimization based on identifying and alleviating bottleneck resources. Because of their data-dependent branches and memory accesses, this approach is not effective for irregular applications, so tailored approaches are developed for two popular irregular domains: graph algorithms and sparse matrix primitives. Performance modeling for GPUs is carried out by abstract kernel emulation along with latency/gap modeling of resources. Sensitivity analysis with respect to the resource latency/gap parameters is used to predict the bottleneck resource for a given kernel's execution. The utility of the bottleneck analysis is demonstrated in two contexts: i) enhancing the OpenTuner auto-tuner with the new bottleneck-driven optimization strategy, with effectiveness demonstrated experimentally on all kernels from the Rodinia suite and on GPU tensor contraction kernels from the NWChem computational chemistry suite; and ii) manual code optimization, where two case studies illustrate the use of the bottleneck analysis to iteratively improve the performance of code from state-of-the-art DSL code generators. As noted, the bottleneck approach does not carry over to irregular applications such as graph algorithms and sparse linear systems. Graph algorithms are used in a variety of applications, and high-level GPU graph processing frameworks are an attractive alternative for achieving both high productivity and high performance. This dissertation develops an approach to graph processing on GPUs (open full item for complete abstract)
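
    A toy version of the sensitivity analysis may clarify the idea (the resource names, counts, and gap values are invented, and the dissertation's abstract kernel emulation is far richer): predict execution time from per-resource gap parameters, perturb each parameter, and report the resource whose perturbation moves the prediction most.

        # Illustrative latency/gap bottleneck model, not the thesis's emulator.
        counts = {"dram": 2.0e6, "shared_mem": 8.0e6, "alu": 3.0e7}  # ops issued
        gaps   = {"dram": 4.0,   "shared_mem": 1.0,   "alu": 0.5}    # cycles/op

        def predicted_cycles(gaps):
            # Simplistic gap model: time is limited by the busiest resource.
            return max(counts[r] * gaps[r] for r in counts)

        base = predicted_cycles(gaps)
        sensitivity = {}
        for r in gaps:
            perturbed = dict(gaps, **{r: gaps[r] * 1.1})  # +10% gap on one resource
            sensitivity[r] = predicted_cycles(perturbed) / base - 1.0

        # The resource whose slowdown hurts most is the predicted bottleneck.
        print(max(sensitivity, key=sensitivity.get), sensitivity)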

    Committee: Ponnuswamy Sadayappan (Advisor); Atanas Rountev (Committee Member); Radu Teodorescu (Committee Member) Subjects: Computer Science
  • 5. Kumar, Lalit Scalable Map-Reduce Algorithms for Mining Formal Concepts and Graph Substructures

    PhD, University of Cincinnati, 2018, Engineering and Applied Science: Computer Science and Engineering

    With the evolution of distributed processing environments, many new algorithms have been proposed to overcome efficiency and scalability issues. Most of them distribute data across multiple nodes of a cluster and process it using existing sequential approaches. Although distributing and processing data on a multi-node cluster eliminates the system resource constraints of sequential approaches, it fails to address other major issues: load balancing, duplicate data processing, and effective partitioning of large data into independent smaller chunks for maximum scalability and efficiency. In the proposed algorithms for Formal Concept Analysis, graph processing, and real-valued bicluster generation, we leverage the power of a multi-node cluster to eliminate resource constraints, and we also address load balancing, duplicate data processing, and data partitioning very efficiently. Because of efficient load balancing and data partitioning, along with the elimination of duplicate data processing, we are able to process the same volume of data as existing algorithms in the same or less time using around one tenth of the resources. To solve the Formal Concept Analysis problem in a distributed environment, unlike existing iterative approaches, we first generate a Sufficient Set using a single iteration of map-reduce. We demonstrate that the generated Sufficient Set is a 2-hop projection of the data and contains all the information needed to enumerate the entire lattice; this 2-hop projection enables effective load balancing and data partitioning in our approach. Since the Sufficient Set is much smaller than the original input data, it is processed on a standalone machine to selectively enumerate parts of the lattice, or the entire lattice, as required. In the second and third problem of processing large gr (open full item for complete abstract)
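
    The shape of a single map-reduce pass over a formal context can be sketched as follows (plain Python standing in for Hadoop, with invented toy data; the actual Sufficient Set construction is more involved): the mapper emits (attribute, object) pairs and the reducer groups, per attribute, the objects possessing it.

        # One simulated map-reduce pass over a toy object-attribute relation.
        from collections import defaultdict

        relation = {            # formal context: object -> set of attributes
            "o1": {"a", "b"},
            "o2": {"a", "c"},
            "o3": {"a", "b", "c"},
        }

        def mapper(obj, attrs):
            for a in attrs:
                yield a, obj    # key: attribute, value: object

        def reducer(pairs):
            grouped = defaultdict(set)
            for a, obj in pairs:
                grouped[a].add(obj)
            return dict(grouped)

        extents = reducer(p for obj, attrs in relation.items()
                          for p in mapper(obj, attrs))
        print(extents)  # per-attribute object sets, a building block for concepts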

    Committee: Raj Bhatnagar Ph.D. (Committee Chair); Gowtham Atluri Ph.D. (Committee Member); Yizong Cheng Ph.D. (Committee Member); Anil Jegga B.V.Sc (Committee Member); Ali Minai Ph.D. (Committee Member) Subjects: Computer Science
  • 6. Manglani, Heena A neural network analysis of sedentary behavior and information processing speed in multiple sclerosis

    Master of Arts, The Ohio State University, 2018, Psychology

    People with multiple sclerosis (PwMS) experience deficits in information processing speed, which underlie higher-level cognitive difficulties and negatively impact activities of daily living. While considerable research indicates the benefits of physical activity for cognitive health, there is growing evidence that sedentary behavior, or sitting, may be detrimental to health independent of engagement in moderate-to-vigorous physical activity (MVPA). As greater sitting time is linked to increased risk for several adverse clinical conditions, it may also be associated with poorer cognitive function. One mechanism by which sitting time may influence speed of information processing is through its influence on neural network functioning. The current study elucidated the relationship between sedentary behavior, processing speed, and global information transfer in neural networks in a sample with relapsing-remitting multiple sclerosis. We found a negative association between sedentary behavior and processing speed while controlling for MVPA and covariates. We did not find global efficiency to be associated with sedentary behavior or processing speed, nor to mediate the relationship between the two, while holding constant MVPA, disease severity, and additional covariates. This research offers support for sedentary behavior as an important and viable target for intervention, and establishes the groundwork for further probing of neural network function in PwMS.
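
    For reference, global efficiency is the standard graph metric invoked here: the mean inverse shortest-path length over all node pairs. A minimal illustration with networkx, using a random graph purely as a stand-in for a functional brain network:

        import networkx as nx

        # Stand-in network; the study derives graphs from neuroimaging connectivity.
        G = nx.erdos_renyi_graph(20, 0.2, seed=1)
        print(nx.global_efficiency(G))  # mean of 1/d(u, v) over all node pairs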

    Committee: Ruchika Prakash Ph.D. (Advisor); Charles Emery Ph.D. (Committee Member); John Corrigan Ph.D. (Committee Member) Subjects: Clinical Psychology; Neurosciences; Psychology
  • 7. He, Xin On efficient parallel algorithms for solving graph problems

    Doctor of Philosophy, The Ohio State University, 1987, Graduate School

    Committee: Not Provided (Other) Subjects: Computer Science
  • 8. Abu Doleh, Anas High Performance and Scalable Matching and Assembly of Biological Sequences

    Doctor of Philosophy, The Ohio State University, 2016, Electrical and Computer Engineering

    Next Generation Sequencing (NGS), a massively parallel and low-cost sequencing technology, generates enormous volumes of sequencing data. This facilitates the discovery of new genomic sequences and expands biological and medical research. These big advances in the technology, however, also bring big computational challenges. In almost all NGS analysis pipelines, the most crucial and computationally intensive tasks are sequence similarity search and de novo genome assembly. In this work, we therefore introduce novel and efficient techniques that exploit advances in High Performance Computing hardware and data computing platforms to accelerate these tasks while producing high-quality results. For sequence similarity search, we study the use of massively multithreaded architectures, such as the Graphics Processing Unit (GPU), to accelerate two important problems: read mapping and maximal exact matching. First, we introduce a new mapping tool, Masher, which processes long (and short) reads efficiently and accurately. Masher employs a novel indexing technique that produces an index for a huge genome, such as the human genome, with a memory footprint small enough that it can be stored and efficiently accessed on a restricted-memory device such as a GPU. The results show that Masher is faster than state-of-the-art tools and obtains good accuracy and sensitivity on sequencing data with various characteristics. Second, we study the maximal exact matching problem because of its importance in detecting and evaluating similarity between sequences. We introduce a novel tool, GPUMEM, which efficiently utilizes the GPU to build a lightweight index and find maximal exact matches inside two genome sequences. The index construction is so fast that, even including its time, GPUMEM is faster in practice than state-of-the-art tools that use a pre-built index (open full item for complete abstract)
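
    The maximal exact matching task itself can be sketched on a CPU with a k-mer seed index plus extension (GPUMEM's lightweight GPU index is quite different; this only illustrates the problem being solved, on toy sequences):

        # Find maximal exact matches (MEMs) via k-mer seeding and extension.
        def maximal_exact_matches(ref, qry, min_len=4):
            k = min_len
            index = {}
            for i in range(len(ref) - k + 1):
                index.setdefault(ref[i:i+k], []).append(i)
            seen = set()
            for j in range(len(qry) - k + 1):
                for i in index.get(qry[j:j+k], []):
                    s, t = i, j
                    while s > 0 and t > 0 and ref[s-1] == qry[t-1]:  # extend left
                        s, t = s - 1, t - 1
                    e, f = i + k, j + k
                    while e < len(ref) and f < len(qry) and ref[e] == qry[f]:
                        e, f = e + 1, f + 1                          # extend right
                    m = (s, t, e - s)       # (ref_pos, qry_pos, length)
                    if m not in seen:
                        seen.add(m)
                        yield m

        print(list(maximal_exact_matches("ACGTACGTGG", "TTACGTACGA")))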

    Committee: Umit Catalyurek (Advisor); Kun Huang (Committee Member); Fusun Ozguner (Committee Member) Subjects: Bioinformatics; Computer Engineering
  • 9. Sedaghati Mokhtari, Naseraddin Performance Optimization of Memory-Bound Programs on Data Parallel Accelerators

    Doctor of Philosophy, The Ohio State University, 2016, Computer Science and Engineering

    High-performance applications depend on high utilization of memory bandwidth and computing resources, and data parallel accelerators have proven very effective at providing both when needed. However, memory-bound programs push the limits of system bandwidth, causing under-utilization of computing resources and thus energy-inefficient executions. The objective of this research is to investigate opportunities on data parallel accelerators (i.e., SIMD units and GPUs) and design solutions for improving the performance of three classes of memory-bound applications: stencil computation, sparse matrix-vector multiplication (SpMV), and graph analytics. This research first focuses on the performance bottlenecks of stencil computations on short-vector SIMD ISAs and presents StVEC, a hardware-based solution that extends the vector ISA and improves data movement and bandwidth utilization. StVEC extends the standard addressing mode of vector floating-point instructions in contemporary vector ISAs (e.g., SSE, AVX, VMX). A code generation approach is designed and implemented to help a vectorizing compiler generate code for processors with the StVEC extensions. Using optimistic as well as pessimistic emulation of the proposed StVEC instructions, it is shown that the proposed solution can be effective on top of SSE- and AVX-capable processors. To analyze hardware overhead, parts of the proposed design are synthesized using a 45 nm CMOS library and shown to have minimal impact on processor cycle time. As the second class of memory-bound programs, this research focuses on sparse matrix-vector multiplication (SpMV) on GPUs and shows that no sparse matrix representation is consistently superior; the best representation depends on the matrix's sparsity pattern. This part focuses on four standard sparse representations (CSR, ELL, COO, and a hybrid ELL-COO) and studies the correlations between SpMV performance and sparsity features. The res (open full item for complete abstract)
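
    Of the four representations studied, CSR is the most common; a minimal CPU reference SpMV over toy data shows the format's row-pointer/column-index layout (GPU kernels parallelize this loop nest, typically one or more threads per row):

        # Reference CSR sparse matrix-vector multiply, y = A @ x.
        def spmv_csr(row_ptr, col_idx, vals, x):
            y = [0.0] * (len(row_ptr) - 1)
            for r in range(len(y)):
                for k in range(row_ptr[r], row_ptr[r + 1]):
                    y[r] += vals[k] * x[col_idx[k]]
            return y

        # 3x3 matrix [[4,0,1],[0,2,0],[3,0,5]] in CSR form:
        row_ptr = [0, 2, 3, 5]
        col_idx = [0, 2, 1, 0, 2]
        vals    = [4.0, 1.0, 2.0, 3.0, 5.0]
        print(spmv_csr(row_ptr, col_idx, vals, [1.0, 1.0, 1.0]))  # [5.0, 2.0, 8.0]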

    Committee: Ponnuswamy Sadayappan (Advisor); Louis-Noel Pouchet (Committee Member); Mircea-Radu Teodorescu (Committee Member); Atanas Ivanov Rountev (Committee Member) Subjects: Computer Engineering; Computer Science; Engineering
  • 10. Faisal, S M Towards Energy Efficient Data Mining & Graph Processing

    Doctor of Philosophy, The Ohio State University, 2015, Computer Science and Engineering

    Ever-increasing energy costs are one of the most critical concerns for large-scale deployments of data centers. As the demand for large-scale data processing grows, it is paramount that energy efficiency be taken into account when designing architectures as well as algorithms. While cost is a critical issue, it is not the only point of interest; increased energy consumption also has a severe impact on the environment. Hence, it is important to pay close attention to energy-efficient data mining and graph processing algorithms that leverage architectural as well as algorithmic features to reduce energy consumption, serving their purposes with a reduced carbon footprint. In this work, we take a close look at energy efficiency in the broad area of data mining and graph processing and approach the problem from multiple fronts. First, we take a purely software-centric approach, focusing on frameworks that provide faster solutions to otherwise expensive problems and thereby save energy, following the race-to-halt principle. Our proposed framework allows space-efficient representation, scalable distributed processing, and ease of programming for large power-law graphs. We also develop parallel, distributed implementations of a popular graph clustering algorithm, Regularized Markov Clustering (RMCL), on various distributed-memory programming frameworks. Next, we analyze commonly used data mining, multimedia, and graph clustering algorithms to explore their energy profiles and their tolerance to random bit errors induced by low-voltage computation. At the core of any research on energy-efficient, low-voltage computing are reliable error models for functional units at low voltage. We find that existing models lack sufficient detail and fail to capture behavior realistically. Driven by this necessity, we propose a set of accurate, robust, and realistic models for functional units' behavior at low voltage. Fin (open full item for complete abstract)
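
    The kernel that RMCL regularizes can be pictured with one expansion-plus-inflation step of plain Markov clustering on a toy column-stochastic matrix (pure Python for illustration; the thesis's contribution is the distributed implementation, not this sequential core):

        # One MCL step: expansion (matrix squaring) then inflation (elementwise
        # power followed by column renormalization).
        def mat_mul(A, B):
            n = len(A)
            return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]

        def inflate(M, r=2.0):
            cols = [[v ** r for v in c] for c in zip(*M)]
            cols = [[v / sum(c) for v in c] for c in cols]
            return [list(row) for row in zip(*cols)]

        M = [[0.5, 0.5,  0.0],   # column-stochastic walk matrix of a tiny graph
             [0.5, 0.25, 0.5],
             [0.0, 0.25, 0.5]]
        M = inflate(mat_mul(M, M))   # repeated until convergence in practice
        print(M)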

    Committee: Srinivasan Parthasarathy (Advisor); P. Sadayappan (Committee Member); Radu Teodorescu (Committee Member) Subjects: Computer Science
  • 11. Kurt, Mehmet Fault-tolerant Programming Models and Computing Frameworks

    Doctor of Philosophy, The Ohio State University, 2015, Computer Science and Engineering

    Fault tolerance on parallel systems has always been a big challenge for High Performance Computing (HPC), and hence it has drawn much attention from the community. This pursuit of fault-tolerant systems is now more important than ever due to recent advances in hardware. As the emergence of first multi-core and, more recently, many-core machines shows, computing power is constantly being increased through more processing cores and thus more parallelism. To satisfy this demand and to increase the power of individual components, chips are manufactured with ever smaller feature sizes. Another trend is power optimization, since running all system resources at their peak levels all the time may not be feasible due to factors such as heat dissipation and maintaining a total power budget. These hardware trends also change the way scientific applications are implemented: the community designs new and diverse parallel programming models to harvest the computing power available in new architectures, and these models help programmers achieve scalable performance by tuning applications through additional APIs, specifications, or annotations. Unfortunately, these changes in hardware and software also bring new challenges. The increasing number of components in HPC systems raises the probability of failure; trends such as shrinking feature sizes and low-voltage computing cause more frequent bit flips; and programmer specifications for performance tuning, when incorporated incorrectly or inaccurately, can cause errors during execution. Considering these problems, the community foresees that Mean Time Between Failures (MTBF) rates will decrease so significantly that current fault-tolerance solutions will become completely inapplicable. In this dissertation, we introduce fault-tolerance solutions in t (open full item for complete abstract)

    Committee: Gagan Agrawal (Advisor); Saday Sadayappan (Committee Member); Radu Teodorescu (Committee Member) Subjects: Computer Science
  • 12. Althuru, Dharan Kumar Reddy Distributed Local Trust Propagation Model and its Cloud-based Implementation.

    Master of Science (MS), Wright State University, 2014, Computer Science

    The World Wide Web has grown rapidly in the last two decades with user-generated content and interactions. Trust plays an important role in providing personalized content recommendations and in improving our confidence in various online interactions. We review trust propagation models in the context of social networks, the semantic web, and recommender systems. With the objective of making trust propagation models more flexible, we propose several extensions that can be implemented as configurable parameters in the system. We implement the Local Partial Order Trust (LPOT) model, which considers trust as well as distrust ratings, and evaluate it on the Epinions.com dataset to demonstrate the improvement in recommendations obtained by incorporating trust models. We also evaluate the runtime performance of trust propagation models and motivate the need for a scalable solution. In addition to variety, real-world applications must deal with the volume and velocity of data, so scalability and performance are extremely important. We review techniques for large-scale graph processing and propose distributed trust-aware recommender architectures that can be selected based on application needs. We develop a distributed local partial order trust model compatible with Pregel (a system for large-scale graph processing) and implement it with Apache Giraph on a Hadoop cluster. This model computes, in parallel, trust inference ratings from every user to all users reachable within a configured depth. We provide experimental results illustrating the scalability of this model with the number of nodes in the cluster as well as with the network size, enabling applications operating at large scale to integrate trust propagation models.
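
    A sequential sketch of the superstep structure may help (the edge weights, decay factor, and aggregation rule here are illustrative assumptions; the thesis implements this per-vertex on Apache Giraph with configurable parameters):

        # Pregel-style supersteps for trust/distrust propagation, run sequentially.
        trust_edges = {          # truster -> {trustee: direct rating in [-1, 1]}
            "alice": {"bob": 0.9, "carol": 0.4},
            "bob":   {"dave": 0.8},
            "carol": {"dave": -0.5},   # distrust propagates with its sign
        }

        def propagate(source, depth=2, decay=0.7):
            scores = {source: 1.0}
            frontier = {source: 1.0}
            for _ in range(depth):                     # one "superstep" per hop
                messages = {}
                for u, t_u in frontier.items():        # each vertex sends along edges
                    for v, w in trust_edges.get(u, {}).items():
                        messages[v] = messages.get(v, 0.0) + t_u * w * decay
                frontier = {v: t for v, t in messages.items() if v not in scores}
                scores.update(frontier)
            return scores

        print(propagate("alice"))   # inferred trust within 2 hops of alice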

    Committee: Krishnaprasad Thirunarayan Ph.D. (Advisor); Keke Chen Ph.D. (Committee Member); Meilin Liu Ph.D. (Committee Member) Subjects: Computer Science
  • 13. Liu, Yufan A Survey Of Persistent Graph Databases

    MS, Kent State University, 2014, College of Arts and Sciences / Department of Computer Science

    Graph databases have attracted increasing attention from both the database and the data mining/machine learning communities. Many kinds of data with complex and dynamic relationships can be efficiently expressed as graphs, and techniques such as scoring, shortest paths, and clustering can provide information and services by leveraging those data. They are widely used in areas like Web graph mining (Google), social network analysis (Facebook), user/product recommendation (Netflix/Amazon), and chemical and biological analysis. Graph databases provide a faster and more efficient way to store, access, and analyze such data than other database systems. This thesis goes over graph data and its representations, then categorizes some of the most commonly used graph databases by their storage behavior. We then introduce some of the state-of-the-art techniques that empower graph databases internally. After that, we study how to access graph databases through query languages or APIs and compare them across different aspects. Another contribution of this work is a performance comparison of persistent graph databases on different kinds of data: beyond batch loading, we stress-test single transactional insertions and queries and compare in-memory graph algorithms. Finally, we offer recommendations for choosing among the databases based on the experimental results.

    Committee: Ruoming Jin (Advisor) Subjects: Computer Engineering; Computer Science
  • 14. Gadde, Srimanth Graph Partitioning Algorithms for Minimizing Inter-node Communication on a Distributed System

    Master of Science in Electrical Engineering, University of Toledo, 2013, College of Engineering

    Processing large graph datasets represents an increasingly important area in computing research and applications. The size of many graph datasets has grown well beyond the processing capacity of a single computing node, necessitating distributed approaches. As these datasets are processed over a distributed system of nodes, an inter-node communication cost problem arises (also known as inter-partition communication), negatively affecting system performance. This research proposes new graph partitioning algorithms that minimize inter-node communication while achieving a sufficiently balanced partition. First, an intuitive graph partitioning algorithm is developed using a Random Selection (RS) method coupled with Breadth First Search (BFS). Second, another graph partitioning algorithm is developed using Particle Swarm Optimization (PSO) with BFS to reduce inter-node communication further. Simulation results demonstrate that PSO with BFS gives better results (a reduction of approximately 6% to 10% more) than the RS method with BFS; both algorithms, however, minimize inter-node communication efficiently and thereby improve the performance of a distributed system.
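
    The RS-with-BFS idea can be sketched as follows (a simplification under assumed details: seeds are drawn at random, each partition is grown by BFS up to a balanced size cap, and cut edges are then counted; the thesis's algorithm and tuning differ):

        # Grow k balanced partitions by BFS from random seeds, then count cut edges.
        import random
        from collections import deque

        graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1, 4, 5],
                 4: [3, 5], 5: [3, 4]}   # toy undirected adjacency lists

        def bfs_partition(graph, k=2, seed=42):
            rng = random.Random(seed)
            cap = -(-len(graph) // k)            # ceiling: balanced size cap
            unassigned, part = set(graph), {}
            for p in range(k):
                if not unassigned:
                    break
                q = deque([rng.choice(sorted(unassigned))])   # random seed vertex
                while q and sum(v == p for v in part.values()) < cap:
                    u = q.popleft()
                    if u in unassigned:
                        unassigned.discard(u)
                        part[u] = p
                        q.extend(n for n in graph[u] if n in unassigned)
            for u in unassigned:                 # any leftovers join the last part
                part[u] = k - 1
            return part

        part = bfs_partition(graph)
        cut = sum(part[u] != part[v] for u in graph for v in graph[u]) // 2
        print(part, "cut (inter-node) edges:", cut)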

    Committee: Robert Green (Committee Chair); Vijay Devabhaktuni (Committee Co-Chair); William Acosta (Committee Member); Mansoor Alam (Committee Member) Subjects: Computer Engineering; Computer Science