Search Results

(Total results 10)

  • 1. Nguyen, Vinh Thi Kim Semantic Web Foundations for Representing, Reasoning, and Traversing Contextualized Knowledge Graphs

    Doctor of Philosophy (PhD), Wright State University, 2017, Computer Science and Engineering PhD

    Semantic Web technologies such as RDF and OWL have become World Wide Web Consortium (W3C) standards for knowledge representation and reasoning. RDF triples about triples, or meta triples, form the basis for a contextualized knowledge graph. They represent the contextual information about individual triples such as the source, the occurring time or place, or the certainty. However, the lack of an efficient RDF representation for such meta-knowledge of triples remains a major limitation of the RDF data model. The existing reification approach allows such meta-knowledge of RDF triples to be expressed in RDF by using four triples per reified triple. While reification is simple and intuitive, this approach does not have a formal foundation and is not commonly used in practice, as described in the RDF Primer. This dissertation presents the foundations for representing, querying, reasoning over, and traversing contextualized knowledge graphs (CKGs) using Semantic Web technologies. A triple-based compact representation for CKGs: we propose a principled approach and construct RDF triples about triples by extending the current RDF data model with a new concept, called singleton property (SP), as a triple identifier. The SP representation adds two triples to the RDF dataset and can be queried with SPARQL. A formal model-theoretic semantics for CKGs: we formalize the semantics of the singleton property and its relationship with the triple it represents. We extend the current RDF model-theoretic semantics to capture the semantics of singleton properties and provide the interpretation at three levels: simple, RDF, and RDFS. This provides a single interpretation of the singleton property semantics across applications and systems. A sound and complete inference mechanism for CKGs: based on the semantics we propose, we develop a set of inference rules for validating and inferring new triples based on the SP syntax. 
We also derive different sets of context-based inference rules using latti (open full item for complete abstract)

    Committee: Amit Sheth Ph.D. (Advisor); Krishnaprasad Thirunarayan Ph.D. (Committee Member); Olivier Bodenreider Ph.D. (Committee Member); Kemafor Anyanwu Ph.D. (Committee Member); Ramanathan Guha Ph.D. (Committee Member) Subjects: Computer Science
  • 2. Dave, Brandon Understanding Impact of Graph Structure on Knowledge Graph Embedding

    Master of Science (MS), Wright State University, 2024, Computer Science

    The effectiveness of a deployed knowledge graph is commonly evaluated with defined use-cases from domain experts. This poses challenges during the development cycle in determining how to represent data. Developers of a knowledge graph can optionally include semantics in a knowledge graph by abstracting the data representation in a way that mirrors information as it exists in the real world. Consequently, the abstraction is represented by additional layers, resulting in performance differences in knowledge graph embedding, such as the embedded model's ability to infer facts between entities through link prediction. This thesis presents a comprehensive analysis of the performance impact observed across a range of knowledge graph embedding models trained on FB15k-237, a widely recognized benchmark dataset for knowledge graph completion. Additionally, the experiment is performed with augmented versions of FB15k-237, which introduce semantics into the knowledge graph.

    Committee: Cogan Shimizu Ph.D. (Advisor); Wen Zhang Ph.D. (Committee Member); Lingwei Chen Ph.D. (Committee Member) Subjects: Computer Science
  • 3. Christou, Antrea Improving Knowledge Graph Understanding with Contextual Views

    Master of Science (MS), Wright State University, 2024, Computer Science

    Knowledge Graphs (KGs) leverage structured data (entities and their relationships) to create a richly interconnected world. However, to fully explore these intricate connections, sophisticated exploration tools are essential, as manual exploration can become overwhelming. Applications for KG exploration span many use-cases: social network analysis, corporate intelligence, and medical research. This research improves the InK Browser (Interactive Knowledge Browser), a modular, web-based tool for KG exploration, facilitated by flexible views. The goal is to enhance user understanding, which is tested through a user study. Flexible views are made possible by applying complex constraint definitions against data instances. When data points (and their relations) match a data shape, the flexible view provides an adaptive perspective of that data. The InK Browser already provides a flexible view for geospatial data (a map) and metadata (semantic and type information), as well as search functionality. This research adds a new Flexible Views functionality, a KG summarization used within the InK Browser through the dynamic creation of SPARQL queries built from shortcuts of the schema in use. This functionality eases the challenge of navigating and comprehending KGs.

    Committee: Cogan Shimizu Ph.D. (Advisor); Hugh P. Salehi Ph.D. (Committee Member); Krishnaprasad Thirunarayan Ph.D. (Committee Member) Subjects: Computer Science
  • 4. Ngwobia, Sunday Capturing Knowledge of Emerging Entities from the Extended Search Snippets

    Master of Computer Science (M.C.S.), University of Dayton, 2019, Computer Science

    Google and other search engines feature entity search by presenting a knowledge card summarizing related facts about the user-supplied entity. However, the knowledge card is limited to certain entities which have a Wiki page or an entry in encyclopedias such as Freebase. The current encyclopedias are limited to highly popular entities, which are far fewer than the emerging entities. Despite the availability of knowledge about emerging entities in search results, there are no approaches to capture, abstract, summarize, fuse, and validate fragmented pieces of knowledge about them. Thus, in this paper, we develop approaches to capture two types of knowledge about emerging entities from a corpus extended from the top-n search snippets of a given emerging entity. The first kind of knowledge identifies the role(s) of the emerging entity, e.g., who is s/he? The second kind captures the entities closely associated with the emerging entity. As the testbed, we considered a collection of 20 emerging entities and 20 popular entities as the ground truth. Our approach is an unsupervised approach based on text analysis and entity embeddings. Our experimental studies show promising results, with an accuracy of more than 87% for recognizing entities and 75% for ranking them. In addition, 87% of the entailed types were recognizable. Our testbed and source code are available on GitHub (https://github.com/sunnyUD/research_source_code).

    Committee: Saeedeh Shekarpour Ph.D. (Committee Chair); Ju Shen Ph.D. (Committee Member); Zhongmei Yao Ph.D. (Committee Member); Tam Nguyen Ph.D. (Committee Member); James Buckley Ph.D. (Advisor) Subjects: Computer Science; Information Systems
  • 5. Lalithsena, Sarasi Domain-specific Knowledge Extraction from the Web of Data

    Doctor of Philosophy (PhD), Wright State University, 2018, Computer Science and Engineering PhD

    Domain knowledge plays a significant role in powering a number of intelligent applications such as entity recommendation, question answering, data analytics, and knowledge discovery. Recent advances in the Artificial Intelligence and Semantic Web communities have contributed to the representation and creation of this domain knowledge in a machine-readable form. This has resulted in a large collection of structured datasets on the Web, commonly referred to as the Web of data. The Web of data has continued to grow rapidly since its inception, which poses a number of challenges in developing intelligent applications that can benefit from its use. The majority of these applications focus on a particular domain. Hence they can benefit from a relevant portion of the Web of data. For example, a movie recommendation application predominantly requires knowledge of the movie domain, and a biomedical knowledge discovery application predominantly requires relevant knowledge of genes, proteins, chemicals, disorders and their interactions. Using the entire Web of data is both unnecessary and computationally intensive, and the irrelevant portion can add noise that may negatively impact the performance of the application. This motivates the need to identify and extract relevant data for domain-specific applications from the Web of data. Therefore, this dissertation studies the problem of domain-specific knowledge extraction from the Web of data. The rapid growth of the Web of data takes place in three dimensions: 1) the number of knowledge graphs, 2) the size of the individual knowledge graph, and 3) the domain coverage. For example, the Linked Open Data (LOD), which is a collection of interlinked knowledge graphs on the Web, started with 12 datasets in 2007, and has evolved to more than 1100 datasets in 2017. 
DBpedia, which is a knowledge graph in the LOD, started with 3 million entities and 400 million relationships in 2012, and has now grown to 38.3 million en (open full item for complete abstract)

    Committee: Amit Sheth Ph.D. (Advisor); Krishnaprasad Thirunarayan Ph.D. (Committee Member); Derek Doran Ph.D. (Committee Member); Cory Henson Ph.D. (Committee Member); Saeedeh Shekarpour Ph.D. (Committee Member) Subjects: Computer Science
  • 6. Yadav, Govind Enhancing the Accuracy of Large Language Models in Biomedical Research through Knowledge Graph Integration: GenoQueryAI

    MS, University of Cincinnati, 2024, Engineering and Applied Science: Computer Science

    This thesis addresses the critical challenges of Large Language Models (LLMs) in biomedical research, focusing on reducing hallucinations and ensuring the secure use of private data. Incorrect or fabricated responses generated by LLMs can directly impact patient care and clinical outcomes. This study aims to reduce these concerns by integrating LLMs with a structured knowledge graph that combines public data from the Unified Medical Language System (UMLS) and private data from the Pediatric Cardiac Genomics Consortium (PCGC). This integration allows LLMs to answer queries using both public and sensitive private datasets without exposing individual records. Advanced techniques for graph traversal and vector embeddings fetch the most relevant context, significantly reducing the risk of hallucination and ensuring context verifiability. Our architecture leverages Neo4j for graph database management and LangChain for orchestrating LLM workflows. Neo4j's capabilities in vector indexing and full-text search enable efficient semantic searches, while LangChain facilitates the integration and management of LLMs and knowledge bases. Advanced ranking techniques such as Reciprocal Rank Fusion (RRF) and FlashRank enhance the relevance and accuracy of retrieved information. Additionally, the system uses the Ragas library to evaluate the quality of generated content and LangSmith for debugging, testing, and monitoring. Benchmarking results demonstrate that this approach significantly reduces hallucination rates compared to general-purpose LLMs like OpenAI's GPT-4 models, providing more reliable and contextually accurate outputs. Furthermore, the system addresses data privacy concerns by securely leveraging sensitive datasets, ensuring that private data is used appropriately without being exposed or used directly to train models. 
In summary, integrating LLMs with a fused knowledge base of public and private data through Retrieval-Augmented Generation (RAG) models rep (open full item for complete abstract)

    Committee: Jaroslaw Meller Ph.D. (Committee Chair); Michal Kouril Ph.D. (Committee Member); Michael Wagner Ph.D. (Committee Member); Raj Bhatnagar Ph.D. (Committee Member) Subjects: Computer Science
  • 7. Wang, Bao Computational Approaches to Construct and Assess Knowledge Maps for Student Learning

    Master of Science, Miami University, 2022, Computer Science and Software Engineering

    Knowledge maps have been widely used in knowledge elicitation and representation to evaluate and guide students' learning. To improve upon current computational approaches to construct and assess knowledge maps, this thesis adopts a hybrid methodology that combines machine learning techniques and network science. By providing methods to extract features to evaluate knowledge maps and expanding the assessment scope by accounting for group interaction and multiple expert maps, this thesis addresses the overall gap of current approaches for map construction and assessment. Specifically, this thesis offers three major contributions: 1) identifying necessary and sufficient graph features for knowledge map evaluation, 2) assessing the role of group interaction during knowledge map construction and how group size affects the quality of map construction, and 3) providing an algorithmic framework to capture differences between student maps and multiple expert maps. Finally, this thesis examines the implications for the fields of network science and educational technology of applying knowledge maps in student learning.

    Committee: Philippe Giabbanelli Dr. (Advisor) Subjects: Computer Science; Education
  • 8. Aqeel, Aya EVIDENCE BASED MEDICAL QUESTION ANSWERING SYSTEM USING KNOWLEDGE GRAPH PARADIGM

    Master of Science in Software Engineering, Cleveland State University, 2022, Washkewicz College of Engineering

    Evidence Based Medicine (EBM) is the process of systematically finding, judging, and using research findings as the basis for clinical decisions and has become the standard of medical practice. Countless new studies are published daily. Keeping track of each of them is impossible, not to mention reading and comprehending them. While search engines can help healthcare professionals search for a topic by suggesting relevant papers, healthcare professionals still need to go through the papers and extract relevant information themselves. This is a very time-consuming task: one study on the Information Retrieval (IR) practices of healthcare information professionals found that it takes on average 4 hours for them to finish a search task. Moreover, a systematic review study on the barriers to medical residents' practicing of evidence-based medicine revealed that two of the most frequently mentioned barriers for residents were limitations in available time, knowledge, and skills. In this project, we address both problems by building a Medical Question Answering (QA) system that employs semi-supervised information extraction methods in Natural Language Processing (NLP) to construct a large-scale Knowledge Graph (KG) from facts extracted from a large repository of medical research publications. Then, the system efficiently translates a user's natural-language question into a query against the KG to extract relevant, evidence-based answers and present them in a user-friendly manner. The system returns a compilation of summaries of the related evidence, with a one-sentence summary for each piece of evidence relevant to the user's question and a reference to the full publication. The system can help address the barriers of knowledge and skills by providing a comprehensive summary of the evidence for a given natural-language question, which eliminates the need to formulate complex structured queries. 
The system was evalu (open full item for complete abstract)

    Committee: Sunnie Chung (Committee Chair); Satish Kumar (Committee Member); Yongjian Fu (Committee Member); Sunnie Chung (Advisor) Subjects: Artificial Intelligence; Biomedical Research; Medicine
  • 9. Bandyopadhyay, Bortik Querying Structured Data via Informative Representations

    Doctor of Philosophy, The Ohio State University, 2020, Computer Science and Engineering

    Users seek more information today than ever before for data-driven decision-making tasks, which has resulted in a many-fold increase in a wide variety of information retrieval applications. Such applications often require extracting and leveraging information from diverse data types. Of interest is the large-scale structured data type, which contains structural information and may occasionally include semantic information. For example, an undirected unweighted graph contains only structural information, whereas the knowledge base, web table, and relational database almost always contain both structural and semantic information. Thus, the end applications must effectively extract and leverage the requisite information from such data, while responding to user queries in a time-bound manner. Approximate queries on low dimensional task-specific representations of such large-scale and often high dimensional data can greatly speed up the response time of the framework, with minimal quality impact. However, the key aspect of constructing such representations is to effectively capture the requisite task-specific information from the large-scale structured data. To this end, we have designed probabilistic and neural model-based low dimensional informative representations of various high dimensional structured data, such that the low dimensional projections effectively capture the structural and/or semantic information required for resolving the querying task. We have demonstrated the effectiveness of such representations through diverse real-world end-user queries on the data. First, we propose a novel probability-based compressed representation of undirected unweighted streaming graphs using min-wise hashing-based neighborhood sketching to preserve the structural property of the graphs. 
The sketch can be constructed efficiently, stored in user-constrained memory space, and can be easily queried to retrieve useful graph properties like clustering c (open full item for complete abstract)

    Committee: Srinivasan Parthasarathy (Advisor); Huan Sun (Committee Member); Ping Zhang (Committee Member); Harvey Jay Miller (Committee Member) Subjects: Computer Science
  • 10. Albin, Aaron Building an online UMLS knowledge discovery platform using graph indexing

    Master of Science, The Ohio State University, 2014, Computer Science and Engineering

    The UMLS is a rich collection of biomedical concepts which are connected by semantic relations. Using transitively associated information for knowledge discovery has been shown to be effective for many applications in the biomedical field. Although there are a few tools and methods available for extracting transitive knowledge from the UMLS, they usually have major restrictions on the length of transitive relations or on the number of data sources. To overcome these restrictions, the web platform onGrid was developed to support efficient path queries and knowledge discovery on the UMLS. This platform provides several features such as converting natural language queries into UMLS concepts, performing efficient queries, and visualizing the result paths. It also builds relationship and distance matrices for two sets of biomedical terms, making it possible to perform effective knowledge discovery on these concepts. onGrid can be applied to study biomedical concept relations between any two sets or within one set of biomedical concepts. In this work, onGrid is used to study gene-gene relationships in HUGO as well as disease-disease relationships in OMIM. Cross-validating the results with external datasets demonstrates that onGrid is very efficient for concept-based knowledge discovery on the UMLS. onGrid is a very efficient tool for querying the UMLS for transitive relations, studying relationships between biomedical terms, and generating hypotheses. The online UMLS knowledge discovery platform has been tested on the BMI Netlab server (URL: https://netlab.bmi.osumc.edu/ongrid).

    Committee: Yang Xiang (Advisor); Rajiv Ramnath (Committee Member) Subjects: Computer Science