Search Results

(Total results 3)


  • 1. Janga, Prudhvi. Integration of Heterogeneous Web-based Information into a Uniform Web-based Presentation

    PhD, University of Cincinnati, 2014, Engineering and Applied Science: Computer Science and Engineering

    With the continuing explosive growth of the World Wide Web, a wealth of information has become available online. The web has become one of the major sources of information for both individual users and large organizations. To find information, individual users can either use search engines or navigate to a particular website by following links. The former method returns links to vast amounts of data in seconds, while the latter can be tedious and time consuming. The former method usually presents its results as a web page with links to the actual web data sources (or websites). The latter method takes the user to the actual web data source itself. With these two most popular forms of web data presentation and retrieval, web data can hardly be queried, manipulated, or analyzed easily, even though it is publicly and readily available. Many companies also use the web as a source of information, and their challenge is to build web-based analytical and decision support systems, often referred to as web data warehouses. However, the information present on the web is extremely complex and heterogeneous, which makes it challenging to integrate and present retrieved web data in a uniform format. Hence, there is a need for web data integration frameworks that can integrate and present web data in a uniform format. To achieve a homogeneous representation of web data, we need a framework that extracts relevant structured and semi-structured web data from different web data sources, generates schemas from both kinds of data, integrates the schemas generated from the different sources into a merged schema, populates it with data, and presents it to the end user in a uniform format. We propose a modular framework for homogeneous presentation of web data. This framework consists of different standalone modules that can also be used to create independent systems that solve other schema unification problem (open full item for complete abstract)

    Committee: Karen Davis Ph.D. (Committee Chair); Raj Bhatnagar Ph.D. (Committee Member); Hsiang-Li Chiang Ph.D. (Committee Member); Ali Minai Ph.D. (Committee Member); Carla Purdy Ph.D. (Committee Member)
    Subjects: Computer Science
  • 2. Dutko, Adam. THE RELATIONAL DATABASE: A NEW STATIC ANALYSIS TOOL?

    Master of Science in Software Engineering, Cleveland State University, 2011, Fenn College of Engineering

    Code comprehension is pivotal to reducing errors in software. Reading source code improves code comprehension and enables effective fixes, but as a code base grows, meta-data become increasingly important. Static Analysis techniques provide an avenue for software developers to learn more about their code through meta-data while also helping them safely detect potential errors in their source. Unfortunately, many Static Analysis tools have a steep learning curve and are limited in scope. This thesis seeks to make Static Analysis accessible and extensible by asking what ubiquitous tools like SQL and relational databases can offer and what they cannot. We begin to answer these questions by exploring the source code of three C++ projects (libodbc++, log4cxx, C++ Sockets Library) using a new Static Analysis tool called Trike. Initial results indicate that Trike is a promising and accessible tool for analyzing the structure of a code base. With further improvements, Trike should equal more established Static Analysis tools in scope and surpass them in usability.

    Committee: Nigamanth Sridhar PhD (Committee Chair); Yongjian Fu PhD (Committee Member); Wenbing Zhao PhD (Committee Member)
    Subjects: Computer Science
  • 3. Lam, Wilma Samhita Samuel. A MapReduce Performance Study of XML Shredding

    MS, University of Cincinnati, 2016, Engineering and Applied Science: Computer Science

    XML is an extensible markup language that came into popularity for its ease of use and readability. It has emerged as one of the leading media used for data storage and transfer over the World Wide Web, as it is platform independent, readable, and can be used to share data between programs. There are tools available for extracting data directly from XML documents, but many organizations use relational databases as repositories to store, manipulate, and analyze XML data. The data can be extracted into a database to reduce the redundancy present in XML documents by eliminating the repetition of tags while preserving the values. Several algorithms have been devised to provide efficient shredding (mapping of XML data to relational tables) of XML documents. The shredding of an XML document is performed through a set of sequential steps that traverse the tree structure from the root node to the leaf nodes. Sequential processing of large XML documents is time consuming; therefore, we devise a method to implement parallelization by splitting a large XML document into a set of smaller XML documents. We extend a shredding algorithm to process the XML documents in parallel. We conduct experiments with parallel and sequential implementations on a single machine and with a parallel MapReduce implementation in the cloud. We compare the performance of the three implementations for several real-world datasets and different parameters such as partition sizes. Our experiments indicate that the performance of the algorithms can be predicted through parameters such as the number of elements at depth 1 of an XML dataset. These parameters help identify a suitable implementation for shredding. Our experiments also indicate that MapReduce is a scalable environment that performs better for larger partition sizes.

    Committee: Karen Davis Ph.D. (Committee Chair); Raj Bhatnagar Ph.D. (Committee Member); Carla Purdy Ph.D. (Committee Member)
    Subjects: Computer Science
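
The third abstract describes shredding (mapping XML data to relational tables) and parallelizing it by splitting a document at its depth-1 elements. The following is a minimal Python sketch of that general idea, not the algorithm extended in the thesis: it uses a simple path-based mapping into a single table, a thread pool as a stand-in for MapReduce, and a made-up sample document. The names `SAMPLE`, `partition`, `shred`, `shred_parallel`, `load`, and the `node` table are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample document; each <book> is an element at depth 1.
SAMPLE = """<library>
  <book id="b1"><title>Dune</title><year>1965</year></book>
  <book id="b2"><title>Emma</title><year>1815</year></book>
  <book id="b3"><title>Ulysses</title><year>1922</year></book>
</library>"""


def partition(root, size):
    """Group the depth-1 children of the root into chunks of `size` elements."""
    children = list(root)
    return [children[i:i + size] for i in range(0, len(children), size)]


def shred(elements, parent_path):
    """Walk each subtree from its root to its leaves, emitting (path, attr, value) rows."""
    rows = []
    stack = [(el, f"{parent_path}/{el.tag}") for el in reversed(elements)]
    while stack:
        el, path = stack.pop()
        # One row per XML attribute on the element.
        for name, value in el.attrib.items():
            rows.append((path, name, value))
        # One row for the element's own text, if it has any.
        text = (el.text or "").strip()
        if text:
            rows.append((path, None, text))
        stack.extend((child, f"{path}/{child.tag}") for child in reversed(list(el)))
    return rows


def shred_parallel(xml_text, partition_size=2, workers=2):
    """Shred each partition independently, then concatenate the rows in order."""
    root = ET.fromstring(xml_text)
    parts = partition(root, partition_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda part: shred(part, "/" + root.tag), parts)
    return [row for part_rows in results for row in part_rows]


def load(rows):
    """Store the shredded rows in an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node (path TEXT, attr TEXT, value TEXT)")
    conn.executemany("INSERT INTO node VALUES (?, ?, ?)", rows)
    return conn
```

Calling `load(shred_parallel(SAMPLE))` yields a connection that can be queried with ordinary SQL, for example `SELECT value FROM node WHERE path = '/library/book/title'`. The sketch keeps no node identifiers, so rows from the same `<book>` are not linked to each other; a real shredding scheme would add keys for that. The `partition_size` argument plays the role of the partition-size parameter mentioned in the abstract.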