Search Results (1 - 7 of 7 Results)

Joshi, Amit Krishna. Exploiting Alignments in Linked Data for Compression and Query Answering
Doctor of Philosophy (PhD), Wright State University, 2017, Computer Science and Engineering PhD
Linked data has experienced accelerated growth in recent years due to its interlinking ability across disparate sources, made possible via machine-processable RDF data. Today, a large number of organizations, including governments and news providers, publish data in RDF format, inviting developers to build useful applications through reuse and integration of structured data. This has led to a tremendous increase in the amount of RDF data on the web. Although the growth of RDF data can be viewed as a positive sign for semantic web initiatives, it causes performance bottlenecks for RDF data management systems that store and provide access to data. In addition, a growing number of ontologies and vocabularies make retrieving data a challenging task. The aim of this research is to show how alignments in Linked Data can be exploited to compress and query linked datasets. First, we introduce two compression techniques that compress RDF datasets through identification and removal of semantic and contextual redundancies in linked data. Logical Linked Data Compression is a lossless compression technique which compresses a dataset by generating a set of new logical rules from the dataset and removing triples that can be inferred from these rules. Contextual Linked Data Compression is a lossy compression technique which compresses datasets by performing schema alignment and instance matching, followed by pruning of alignments based on confidence values and subsequent grouping of equivalent terms. Depending on the structure of the dataset, the first technique was able to prune more than 50% of the triples. Second, we propose an Alignment-based Linked Open Data Querying System (ALOQUS) that allows users to write query statements using concepts and properties not present in linked datasets, and show that querying does not require a thorough understanding of the individual datasets and interconnecting relationships. Finally, we present LinkGen, a multipurpose synthetic Linked Data generator that generates large amounts of repeatable and reproducible RDF data using statistical distributions and interlinks it with real-world entities using alignments.
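
A minimal Python/rdflib sketch of the rule-plus-pruning idea behind Logical Linked Data Compression described above: a rule is kept alongside the compressed graph, and triples the rule can re-derive are dropped. The namespace, the single hand-written rule, and the toy triples are invented for illustration; the dissertation mines such rules automatically over much more general patterns.

```python
# Sketch of logical (rule-based) RDF compression: keep one rule alongside the
# graph and drop every triple the rule can re-derive. Illustrative only; the
# namespace, rule, and data are invented.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")  # hypothetical namespace

g = Graph()
for i in range(3):
    s = EX[f"sensor{i}"]
    g.add((s, RDF.type, EX.Sensor))
    g.add((s, EX.status, Literal("active")))  # redundant given the rule below

# Rule stored with the compressed dataset: ?s a ex:Sensor  =>  ?s ex:status "active"
RULE_BODY = (RDF.type, EX.Sensor)
RULE_HEAD = (EX.status, Literal("active"))

compressed = Graph()
pruned = 0
for s, p, o in g:
    if (p, o) == RULE_HEAD and (s, *RULE_BODY) in g:
        pruned += 1  # inferable from the rule, so it need not be stored
    else:
        compressed.add((s, p, o))

print(f"kept {len(compressed)} triples, pruned {pruned} re-inferable triples")
```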

Committee:

Pascal Hitzler, Ph.D. (Advisor); Guozhu Dong, Ph.D. (Committee Member); Krishnaprasad Thirunarayan, Ph.D. (Committee Member); Michelle Cheatham, Ph.D. (Committee Member); Subhashini Ganapathy, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

Linked Data; RDF Compression; Ontology Alignment; Linked Data Querying; Synthetic RDF Generator; SPARQL

Pschorr, Joshua Kenneth. SemSOS: An Architecture for Query, Insertion, and Discovery for Semantic Sensor Networks
Master of Science (MS), Wright State University, 2013, Computer Science
With sensors, storage, and bandwidth becoming ever cheaper, there has been a drive recently to make sensor data accessible on the Web. However, because of the vast number of sensors collecting data about our environment, finding relevant sensors on the Web and then interpreting their observations is a non-trivial challenge. The Open Geospatial Consortium (OGC) defines a web service specification known as the Sensor Observation Service (SOS) that is designed to standardize the way sensors and sensor data are discovered and accessed on the Web. Though this standard goes a long way in providing interoperability between sensor data producers and consumers, it is predicated on the idea that the consuming application is equipped to handle raw sensor data. Sensor data consuming end-points are generally interested not just in the raw data itself, but rather in actionable information regarding their environment. The approaches for dealing with this are either to make each individual consuming application smarter or to make the data served to them smarter. This thesis presents an application of the latter approach, which is accomplished by providing a more meaningful representation of sensor data by leveraging semantic web technologies. Specifically, this thesis describes an approach to sensor data modeling, reasoning, discovery, and query over richer semantic data derived from raw sensor descriptions and observations. The artifacts resulting from this research include an implementation of an SOS service that hews to both Sensor Web and Semantic Web standards in order to bridge the gap between syntactic and semantic sensor data consumers, and that has been proven by use in a number of research applications storing large amounts of data. This implementation also serves as an example of an approach for designing applications which integrate syntactic services over semantic models and allow for interactions with external reasoning systems. As more sensors and observations move online and as the Internet of Things becomes a reality, issues of integration of sensor data into our everyday lives will become important for all of us. The research represented by this thesis explores this problem space and presents an approach to dealing with many of these issues. Going forward, this research may prove a useful elucidation of the design considerations and affordances which can allow low-level sensor and observation data to become the basis for machine-processable knowledge of our environment.
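
A sketch of the kind of semantic query such a service enables, assuming Python with rdflib and the W3C SOSA/SSN vocabulary as a stand-in for the sensor ontology SemSOS actually serves; the data and query are invented for illustration and do not reflect the SemSOS implementation itself.

```python
# Illustrative only: querying semantic sensor observations with SPARQL via rdflib.
# SemSOS is a service implementation, not a Python script; the SOSA terms below
# are a stand-in for its sensor ontology.
from rdflib import Graph

data = """
@prefix sosa: <http://www.w3.org/ns/sosa/> .
@prefix ex:   <http://example.org/> .

ex:obs1 a sosa:Observation ;
    sosa:madeBySensor ex:thermometer1 ;
    sosa:hasSimpleResult 21.5 ;
    sosa:observedProperty ex:AirTemperature .
"""

g = Graph()
g.parse(data=data, format="turtle")

q = """
PREFIX sosa: <http://www.w3.org/ns/sosa/>
SELECT ?sensor ?value WHERE {
  ?obs a sosa:Observation ;
       sosa:madeBySensor ?sensor ;
       sosa:hasSimpleResult ?value ;
       sosa:observedProperty <http://example.org/AirTemperature> .
}
"""
# Ask for every sensor that reported an air-temperature value
for sensor, value in g.query(q):
    print(sensor, value)
```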

Committee:

Krishnaprasad Thirunarayan, Ph.D. (Advisor); Amit Sheth, Ph.D. (Committee Member); Bin Wang, Ph.D. (Committee Member)

Subjects:

Computer Science; Geographic Information Science; Information Systems; Remote Sensing; Systems Design; Web Studies

Keywords:

Semantic Web; Sensor Web; Linked Data; Semantic Sensor Web; Sensor Data; Sensor Web Enablement; Sensor Observation Service;

Mixter, Jeffrey. Linked Data in VRA Core 4.0: Converting VRA XML Records into RDF/XML
MLIS, Kent State University, 2013, College of Communication and Information / School of Library and Information Science
Linked Data has become an increasingly important and valuable way of sharing data across the Internet. It is the basis for the Semantic Web and allows organizations not only to easily share data, but also to connect that data with other related data. Visual Resources Association (VRA) Core 4 is an XML schema-based data model for cataloging cultural objects and visual resources. Using the existing VRA Core 4 restricted XML schema, a new data model was developed that took advantage of popular domain-specific vocabularies. Using popular vocabularies such as Schema.org helps ensure that data will be interoperable with other data and can potentially help improve visibility on the Internet. Using the data model as a reference, an ontology was developed with the Protégé ontology editor. It illustrated how popular domain-specific vocabularies can be combined with the existing VRA data model to create a new semantically rich model that still retains the specificity and detail of the original restricted XML schema. In addition to developing a new VRA data model, an XSLT stylesheet was created that demonstrated how existing XML-based records could be converted into RDF data. The stylesheet was used to successfully convert a 4,150-record collection from the University of Notre Dame into RDF triples. The XSLT templates used in the stylesheet were able not only to convert the existing XML elements/attributes into RDF classes/properties but also to convert the existing controlled vocabulary terms into functioning HTTP URIs representing concepts. The study successfully demonstrated that existing data models can be enhanced to incorporate Linked Data and that existing datasets of implementation-specific XML records can be converted into RDF triples with properties defined by popular RDF vocabularies using an XSLT stylesheet.
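
A small Python sketch of the same element-to-property mapping idea, assuming rdflib and a simplified, made-up record; the thesis performs this conversion with an XSLT stylesheet over real VRA Core 4 XML, and the element names and the mapping to Schema.org terms below are illustrative, not the thesis's actual crosswalk.

```python
# Sketch of mapping an XML record's elements to RDF properties. The thesis does
# this with XSLT over real VRA Core 4 records; the record and mapping here are
# simplified placeholders.
import xml.etree.ElementTree as ET

from rdflib import Graph, Literal, Namespace, RDF

record_xml = """
<work id="w123">
  <title>Starry Night</title>
  <agent>Vincent van Gogh</agent>
</work>
"""

SCHEMA = Namespace("https://schema.org/")
EX = Namespace("http://example.org/works/")  # hypothetical URI base for works

root = ET.fromstring(record_xml)
work = EX[root.attrib["id"]]

g = Graph()
g.add((work, RDF.type, SCHEMA.VisualArtwork))
g.add((work, SCHEMA.name, Literal(root.findtext("title"))))
g.add((work, SCHEMA.creator, Literal(root.findtext("agent"))))

print(g.serialize(format="xml"))  # RDF/XML, the thesis's target serialization
```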

Committee:

Marcia Zeng, Ph.D. (Advisor); Yin Zhang, Ph.D. (Committee Member); Athena Salaba, Ph.D. (Committee Member)

Subjects:

Information Science; Library Science

Keywords:

Linked Data; RDF; VRA; visual resource cataloging; data conversion; data modeling; Ontology; XSLT stylesheet;

Gunaratna, Kalpa. Semantics-based Summarization of Entities in Knowledge Graphs
Doctor of Philosophy (PhD), Wright State University, 2017, Computer Science and Engineering PhD
The processing of structured and semi-structured content on the Web has been gaining attention with the rapid progress in the Linking Open Data project and the development of commercial knowledge graphs. Knowledge graphs capture domain-specific or encyclopedic knowledge in the form of a data layer and add rich and explicit semantics on top of the data layer to infer additional knowledge. The data layer of a knowledge graph represents entities and their descriptions. The semantic layer on top of the data layer is called the schema (ontology), where relationships of the entity descriptions, their classes, and the hierarchy of the relationships and classes are defined. Today, there exist large knowledge graphs in the research community (e.g., encyclopedic datasets like DBpedia and YAGO) and the corporate world (e.g., the Google Knowledge Graph) that encapsulate a large amount of knowledge for human and machine consumption. Typically, they consist of millions of entities and billions of facts describing these entities. While it is good to have this much knowledge available on the Web for consumption, it leads to information overload, and hence proper summarization (and presentation) techniques need to be explored. In this dissertation, we focus on creating both comprehensive and concise entity summaries at (i) the single entity level and (ii) the multiple entity level. To summarize a single entity, we propose a novel approach called FACeted Entity Summarization (FACES) that considers importance, which is computed by combining popularity and uniqueness, and diversity of the facts selected for the summary. We first conceptually group facts using semantic expansion and hierarchical incremental clustering techniques and form facets (i.e., groupings) that go beyond syntactic similarity. Then we rank both the facts and the facets using Information Retrieval (IR) ranking techniques to pick the highest-ranked facts from these facets for the summary. The important and unique contribution of this approach is that, because it generates facets, it adds diversity to entity summaries, making them comprehensive. For creating multiple entity summaries, we propose the RElatedness-based Multi-Entity Summarization (REMES) approach, which simultaneously processes facts belonging to the given entities using combinatorial optimization techniques. In this process, we maximize the diversity and importance of facts within each entity summary and the relatedness of facts between the entity summaries. The proposed approach uniquely combines semantic expansion, graph-based relatedness, and combinatorial optimization techniques to generate relatedness-based multi-entity summaries. Complementing the entity summarization approaches, we introduce a novel approach that uses light Natural Language Processing (NLP) techniques to enrich knowledge graphs by adding type semantics to literals. This makes datatype properties semantically rich compared to having only implementation types. As a result of the enrichment process, we can use both object and datatype properties in the entity summaries, which improves coverage. Moreover, the added type semantics can be useful in other applications like dataset profiling and data integration. We evaluate the proposed approaches against state-of-the-art methods and highlight their capabilities for single and multiple entity summarization.
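
A crude Python sketch of the rank-within-facets idea in FACES: group facts into facets, score each fact, and take the top fact per facet. Here the facet is simply the predicate and the score is a toy popularity-times-uniqueness product over an in-memory list, whereas FACES forms facets via semantic expansion and hierarchical incremental clustering and applies IR-style ranking over the knowledge graph.

```python
# Crude sketch of "rank facts inside facets, pick the best per facet".
# Facets here are just the predicate, and the score is a toy
# popularity * uniqueness product; both are simplifications of FACES.
from collections import Counter, defaultdict

# (entity, predicate, object) facts; purely illustrative data
facts = [
    ("Dayton", "country", "USA"),
    ("Dayton", "state", "Ohio"),
    ("Dayton", "knownFor", "Wright brothers"),
    ("Dayton", "knownFor", "aviation heritage"),
]

obj_popularity = Counter(o for _, _, o in facts)   # how common a value is
pred_frequency = Counter(p for _, p, _ in facts)   # how common a predicate is

def score(fact):
    _, p, o = fact
    popularity = obj_popularity[o]
    uniqueness = 1.0 / pred_frequency[p]           # rarer predicate => more unique
    return popularity * uniqueness

facets = defaultdict(list)                         # facet := predicate (simplification)
for f in facts:
    facets[f[1]].append(f)

# One top-ranked fact per facet keeps the summary both concise and diverse
summary = [max(group, key=score) for group in facets.values()]
print(summary)
```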

Committee:

Amit Sheth, Ph.D. (Committee Co-Chair); Krishnaprasad Thirunarayan, Ph.D. (Committee Co-Chair); Keke Chen, Ph.D. (Committee Member); Gong Cheng, Ph.D. (Committee Member); Edward Curry, Ph.D. (Committee Member); Hamid Motahari Nezhad, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

Entity Summarization; Clustering; Semantic Web; Multiple Constraint Optimization; Typing; RDF; RDFS; OWL; Linked Data; Natural Language Processing; Artificial Intelligence

Petiya, Sean. Building a Semantic Web of Comics: Publishing Linked Data in HTML/RDFa Using a Comic Book Ontology and Metadata Application Profiles
MLIS, Kent State University, 2014, College of Communication and Information / School of Library and Information Science
Information about the various resources, concepts, and entities in the world of comics can be found in a wide range of systems, including those of libraries, archives, and museums, as well as the records of independent research projects. Semantic Web technologies and standards represent an opportunity to connect these resources using Linked Data. In an attempt to realize this opportunity, this thesis presents a case study in the development of a domain ontology for comic books and comic book collections. In the initial phase, reference resources and example materials were collected and consulted to develop a representative domain model and core schema. A workflow was then developed to convert common CSV data to XML and RDF/XML, replacing common values with LOD URIs using XSLT. The second phase of the study focused on publishing Linked Data using HTML/RDFa. A review of existing information systems and an analysis of their content were conducted in order to address the usability of the vocabulary and inform the design of a series of modularized metadata application profiles using the core schema as a base. Examples were tested for their ability to produce valid, meaningful RDF data from HTML content that was consistent with the ontology. The final result is an RDFS/OWL Web vocabulary for comics, titled the Comic Book Ontology (CBO). It is an open and extensible semantic model that identifies comics using two components: (a) the form and (b) the container. This approach allows the ontology's conceptualization of comics to include comic books, comic strips, web comics, graphic novels, manga, or original artwork, with the potential for further describing other aspects of comics culture and scholarship, or connecting community-created data to Semantic Web applications, such as next-generation library catalogs.
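
A Python/rdflib sketch of the CSV-to-RDF step of such a workflow, shown here without the intermediate XML/XSLT stage the thesis uses; the CBO-style namespace and property names are placeholders, not the ontology's actual terms.

```python
# Sketch of converting a CSV row of comic metadata into RDF. The thesis converts
# CSV to XML and then to RDF/XML with XSLT and publishes HTML/RDFa; the CBO
# namespace and property names below are invented placeholders.
import csv
import io

from rdflib import Graph, Literal, Namespace, RDF

CBO = Namespace("http://example.org/cbo#")    # placeholder for the Comic Book Ontology
EX = Namespace("http://example.org/issues/")  # hypothetical URI base for issues

rows = io.StringIO("id,title,issueNumber\nasm300,The Amazing Spider-Man,300\n")

g = Graph()
for row in csv.DictReader(rows):
    issue = EX[row["id"]]
    g.add((issue, RDF.type, CBO.ComicIssue))
    g.add((issue, CBO.title, Literal(row["title"])))
    g.add((issue, CBO.issueNumber, Literal(int(row["issueNumber"]))))

print(g.serialize(format="xml"))  # RDF/XML output
```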

Committee:

Marcia Lei Zeng, Ph.D. (Advisor); Karen F. Gracy, Ph.D. (Committee Member); David B. Robins, Ph.D. (Committee Member)

Subjects:

Information Science; Information Systems; Library Science; Web Studies

Keywords:

comic books; graphic novels; semantic web; linked data; ontology; metadata; application profile; usability

Krisnadhi, Adila Alfa. Ontology Pattern-Based Data Integration
Doctor of Philosophy (PhD), Wright State University, 2015, Computer Science and Engineering PhD
Data integration is concerned with providing unified access to data residing at multiple sources. Such unified access is realized by having a global schema and a set of mappings between the global schema and the local schemas of each data source, which specify how user queries at the global schema can be translated into queries at the local schemas. Data sources are typically developed and maintained independently, and are thus highly heterogeneous. This causes difficulties in integration because of the lack of interoperability in architecture, data format, and the syntax and semantics of the data. This dissertation is a study of how small, self-contained ontologies, called ontology design patterns, can be employed to provide semantic interoperability in a cross-repository data integration system. The idea of this so-called ontology pattern-based data integration is that a collection of ontology design patterns can act as the global schema while still containing sufficient semantics, yet remaining flexible and simple enough to be used by linked data providers. On the one hand, this differs from existing ontology-based solutions, which are based on large, monolithic ontologies that provide very rich semantics but enforce overly restrictive ontological choices, and hence are shunned by many data providers. On the other hand, it also differs from purely linked data based solutions, which do offer simplicity and flexibility in data publishing, but too little in terms of semantic interoperability. We demonstrate the feasibility of this idea through the actual development of a large-scale data integration project involving seven ocean science data repositories from five institutions in the U.S. In addition, we make two contributions as part of this dissertation work, which also play crucial roles in the aforementioned data integration project. First, we develop a collection of more than a dozen ontology design patterns that capture the key notions of ocean science occurring in the participating data repositories. These patterns contain axiomatizations of the key notions and were developed with intensive involvement from domain experts. Modeling of the patterns was done in a systematic workflow to ensure modularity, reusability, and flexibility of the whole pattern collection. Second, we propose so-called pattern views that allow data providers to publish their data in a very simple intermediate schema, and show that they can greatly assist data providers in publishing their data without requiring a thorough understanding of the axiomatization of the patterns.
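
A Python/rdflib sketch of the pattern-view idea: a provider supplies a flat record against a simple intermediate schema, and a small lifting step expands it into the richer pattern structure. The pattern namespace, classes, and field names are invented for illustration and are not the project's actual ocean-science patterns.

```python
# Sketch of lifting a flat "pattern view" record into an ontology design
# pattern's structure. All names here are invented for illustration.
from rdflib import Graph, Literal, Namespace, RDF

PAT = Namespace("http://example.org/pattern/cruise#")  # hypothetical pattern namespace
EX = Namespace("http://example.org/data/")

# A flat view record, as a data provider might supply it
view_record = {"cruiseId": "AT26-10", "vesselName": "Atlantis"}

g = Graph()
cruise = EX[view_record["cruiseId"]]
vessel = EX["vessel/" + view_record["vesselName"]]

# Lift the flat record into the pattern's richer structure
g.add((cruise, RDF.type, PAT.Cruise))
g.add((vessel, RDF.type, PAT.Vessel))
g.add((cruise, PAT.undertakenBy, vessel))
g.add((vessel, PAT.name, Literal(view_record["vesselName"])))

print(g.serialize(format="turtle"))
```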

Committee:

Pascal Hitzler, Ph.D. (Advisor); Krzysztof Janowicz, Ph.D. (Committee Member); Krishnaprasad Thirunarayan, Ph.D. (Committee Member); Michelle Cheatham, Ph.D. (Committee Member)

Subjects:

Computer Science; Information Systems; Information Technology; Logic

Keywords:

ontology design pattern; data integration; ontology; semantic web; pattern view; ocean science; OWL; linked data; ontology modeling; axiomatization; collaborative modeling;

Clunis, Julaine Sashanie. Designing an Ontology for Managing the Diets of Hypertensive Individuals
MLIS, Kent State University, 2016, College of Communication and Information / School of Library and Information Science
Making use of semantic technologies to combine various resources into one integrated environment, this study developed an ontology to help hypertensive individuals gain a better understanding of the nutrients in foods and recipes and of the effects these have on the disease, their prescribed drugs, and their general health. In particular, 10% of a sample of 500 recipes obtained from the web and 10% of the food items from the USDA nutrient database were used as data in the ontology, which had 75 classes, 22 object properties, and 33 data properties. The study established proto-personas to aid in the development of competency questions, which were used with the Pellet reasoner to test whether the ontology could provide information about nutrition goals for hypertensive individuals. The testing results provide evidence supporting the idea that an ontology may be used to provide guidance to individuals with chronic disease, highlighting which foods may be safely consumed and which may cause problems. The conclusion is that an ontology can successfully be used to support medical personnel and advance the cause of patient engagement as patients seek to manage chronic illnesses such as hypertension.
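
A toy Python/rdflib sketch of a competency-question-style check against a small fragment of such an ontology; the classes, properties, and SPARQL query are invented placeholders, and the thesis itself works in Protégé with the Pellet reasoner rather than via SPARQL.

```python
# Toy diet-ontology fragment plus a competency-question-style SPARQL check.
# Classes and properties are invented placeholders, not the thesis's ontology.
from rdflib import Graph

data = """
@prefix ex:  <http://example.org/diet#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:CannedSoup a ex:Food ;
    ex:sodiumMgPerServing "890"^^xsd:integer .

ex:Oatmeal a ex:Food ;
    ex:sodiumMgPerServing "5"^^xsd:integer .
"""

g = Graph()
g.parse(data=data, format="turtle")

# Competency question (toy form): which foods exceed a low-sodium threshold?
q = """
PREFIX ex: <http://example.org/diet#>
SELECT ?food ?mg WHERE {
  ?food a ex:Food ;
        ex:sodiumMgPerServing ?mg .
  FILTER(?mg > 140)
}
"""
for food, mg in g.query(q):
    print(f"{food} has {mg} mg sodium per serving (above a 140 mg low-sodium threshold)")
```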

Committee:

Marcia Zeng, Ph.D. (Advisor); Rebecca Meehan, Ph.D. (Advisor); Karen Gracy, Ph.D. (Committee Member)

Subjects:

Health; Information Science; Information Systems; Library Science; Web Studies

Keywords:

ontology; ontologies; semantic technologies; integrated environments; linked data; semantic web; hypertension; chronic disease management; knowledge organization systems; knowledge management; recipes; food;