
Search Results


(Total results 6)


  • 1. Das, Manirupa Neural Methods Towards Concept Discovery from Text via Knowledge Transfer

    Doctor of Philosophy, The Ohio State University, 2019, Computer Science and Engineering

    Novel contexts, consisting of a set of terms referring to one or more concepts, often arise in real-world querying scenarios such as a complex search query into a document retrieval system or a nuanced, subjective natural-language question. The concepts in these queries may not directly refer to entities or canonical concept forms occurring in any fact-based or rule-based knowledge source such as a knowledge base or ontology; thus, in addressing the complex information needs expressed by such novel contexts, systems using only such sources can fall short. Moreover, hidden associations meaningful in the current context may not exist in a single document but in a collection, between matching candidate concepts having different surface realizations via alternate lexical forms. These may refer to underlying latent concepts, i.e., existing or conceived concepts or semantic classes that are accessible only via their surface forms. Inferring these latent concept associations implicitly, by transferring knowledge from the same domain (within a collection) or from across domains (different collections), can potentially better address such novel contexts; latent concept associations may thus act as a proxy for a novel context. This research hypothesizes that leveraging hidden associations between latent concepts may help to address novel contexts in a downstream recommendation task, and that knowledge transfer methods may aid and augment this process. With novel contexts and latent concept associations as the foundation, I define the process of concept discovery from text in two steps: first, "matching" the novel context to an appropriate hidden relation between latent concepts, and second, "retrieving" the surface forms of the matched related concept as the discovered terms or concept. 
Our prior study provides insight into how the transfer of knowledge within and across domains can help to learn associations between concepts, informing downstream prediction (open full item for complete abstract)

    Committee: Rajiv Ramnath (Advisor); Eric Fosler-Lussier (Advisor); Huan Sun (Committee Member) Subjects: Computer Engineering; Computer Science; Information Science; Library Science; Linguistics
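    The two-step "matching" then "retrieving" process the abstract defines can be loosely illustrated as follows. This is a toy sketch, not the dissertation's neural model: the concept inventory, surface forms, and query context are invented, and matching is done by naive substring overlap rather than learned latent associations.

```python
# Toy sketch of concept discovery: "match" a novel context to a latent
# concept via its known surface forms, then "retrieve" the concept's other
# surface forms as the discovered terms. All data here is invented.
latent_concepts = {
    "myocardial_infarction": {"heart attack", "mi", "myocardial infarction"},
    "hypertension": {"high blood pressure", "htn", "hypertension"},
}

def discover(context):
    ctx = context.lower()
    best, hits = None, 0
    for concept, forms in latent_concepts.items():
        n = sum(1 for form in forms if form in ctx)  # step 1: matching
        if n > hits:
            best, hits = concept, n
    if best is None:
        return set()
    matched = {form for form in latent_concepts[best] if form in ctx}
    return latent_concepts[best] - matched           # step 2: retrieving

print(sorted(discover("patient history of heart attack and chest pain")))
# → ['mi', 'myocardial infarction']
```

    The query never mentions "mi" or "myocardial infarction" directly; those alternate surface forms are recovered via the matched latent concept.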
  • 2. Krisnadhi, Adila Ontology Pattern-Based Data Integration

    Doctor of Philosophy (PhD), Wright State University, 2015, Computer Science and Engineering PhD

    Data integration is concerned with providing unified access to data residing at multiple sources. Such unified access is realized by having a global schema and a set of mappings between the global schema and the local schemas of each data source, which specify how user queries at the global schema can be translated into queries at the local schemas. Data sources are typically developed and maintained independently, and are thus highly heterogeneous. This causes difficulties in integration because of the lack of interoperability in architecture and data format, as well as in the syntax and semantics of the data. This dissertation represents a study on how small, self-contained ontologies, called ontology design patterns, can be employed to provide semantic interoperability in a cross-repository data integration system. The idea of this so-called ontology pattern-based data integration is that a collection of ontology design patterns can act as the global schema that still contains sufficient semantics, but is also flexible and simple enough to be used by linked data providers. On the one hand, this differs from existing ontology-based solutions, which are based on large, monolithic ontologies that provide very rich semantics but enforce overly restrictive ontological choices, and hence are shunned by many data providers. On the other hand, it also differs from purely linked-data-based solutions, which offer simplicity and flexibility in data publishing, but too little in terms of semantic interoperability. We demonstrate the feasibility of this idea through the actual development of a large-scale data integration project involving seven ocean science data repositories from five institutions in the U.S. In addition, we make two contributions as part of this dissertation work, which also play crucial roles in the aforementioned data integration project. 
First, we develop a collection of more than a dozen ontology design patterns that capture the key noti (open full item for complete abstract)

    Committee: Pascal Hitzler Ph.D. (Advisor); Krzysztof Janowicz Ph.D. (Committee Member); Krishnaprasad Thirunarayan Ph.D. (Committee Member); Michelle Cheatham Ph.D. (Committee Member) Subjects: Computer Science; Information Systems; Information Technology; Logic
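    The global-schema-plus-mappings setup the abstract describes can be sketched in miniature. This is an assumed, simplified GAV-style illustration, not the dissertation's actual ontology-pattern system: the source names, fields, and records are invented, and the "global schema" here is a plain record shape standing in for a collection of ontology design patterns.

```python
# Minimal GAV-style data integration sketch: each local repository keeps its
# own schema, and per-source mappings lift records into one shared "global"
# schema that queries are written against. All data here is invented.

# Two heterogeneous local sources describing ocean-science cruises.
source_a = [{"cruise_id": "AT26", "vessel": "Atlantis", "yr": 2014}]
source_b = [{"id": "TN123", "ship_name": "Thompson", "year": 2015}]

# Global schema: Cruise(identifier, vessel, year).
def map_a(rec):
    return {"identifier": rec["cruise_id"], "vessel": rec["vessel"], "year": rec["yr"]}

def map_b(rec):
    return {"identifier": rec["id"], "vessel": rec["ship_name"], "year": rec["year"]}

def global_view():
    # Unified access: the union of all sources lifted into the global schema.
    for rec in source_a:
        yield map_a(rec)
    for rec in source_b:
        yield map_b(rec)

# A query posed once against the global schema spans every repository.
after_2014 = [c["identifier"] for c in global_view() if c["year"] >= 2015]
print(after_2014)  # → ['TN123']
```

    The user never sees that one source calls the field `yr` and another `year`; the mappings absorb that heterogeneity.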
  • 3. Kapanipathi, Pavan Personalized and Adaptive Semantic Information Filtering for Social Media

    Doctor of Philosophy (PhD), Wright State University, 2016, Computer Science and Engineering PhD

    Social media has experienced immense growth in recent times. These platforms are becoming increasingly common for information seeking and consumption, and with this growing popularity, information overload poses a significant challenge to users. For instance, Twitter alone generates around 500 million tweets per day, and it is impractical for users to parse through such an enormous stream to find information that is interesting to them. This situation necessitates efficient personalized filtering mechanisms for users to consume relevant, interesting information from social media. Building a personalized filtering system involves understanding users' interests and utilizing these interests to deliver relevant information to users. These tasks primarily involve analyzing and processing social media text, which is challenging due to its short length and the real-time nature of the medium. The challenges include: (1) Lack of semantic context: social media posts are, on average, short in length, which provides limited semantic context for textual analysis. This is particularly detrimental for topic identification, a necessary task for mining users' interests. (2) Dynamically changing vocabulary: most social media websites, such as Twitter and Facebook, generate posts that are of current (timely) interest to users. Due to this real-time nature, information relevant to topics evolves dynamically, reflecting changes in the real world. This in turn changes the vocabulary associated with these dynamic topics of interest, making it harder to filter relevant information. (3) Scalability: the number of users on social media platforms is significantly large, making it difficult for centralized systems to scale to deliver relevant information to users. 
This dissertation is devoted to exploring semantics and Semantic Web technologies to address the above mentioned challenges in building a personalized information filtering system for social me (open full item for complete abstract)

    Committee: Amit Sheth Ph.D. (Advisor); Krishnaprasad Thirunarayan Ph.D. (Committee Member); Derek Doran Ph.D. (Committee Member); Prateek Jain Ph.D. (Committee Member) Subjects: Computer Science; Technology
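    The core of a personalized filter, matching posts against a user's interests drawn from some knowledge source, can be sketched as below. This is a deliberately naive illustration over invented entities and posts: real semantic filtering, as the abstract stresses, must cope with short text, entity disambiguation, and drifting vocabulary, none of which plain substring spotting handles.

```python
# Naive interest-based filter: user interests are topics; a post passes when
# it mentions an entity mapped to one of those topics. Entities, topics, and
# posts below are invented; matching is simple substring spotting.
interests = {"fc barcelona": "Soccer", "la liga": "Soccer", "nasa": "Space"}

def relevant(post, user_topics):
    text = post.lower()
    return any(topic in user_topics and entity in text
               for entity, topic in interests.items())

stream = [
    "Great comeback by FC Barcelona tonight!",
    "New phone launch rumors everywhere",
    "NASA releases new images from Mars",
]
print([p for p in stream if relevant(p, {"Soccer"})])
# → ['Great comeback by FC Barcelona tonight!']
```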
  • 4. Alhindawi, Nouh Supporting Source Code Comprehension During Software Evolution and Maintenance

    PHD, Kent State University, 2013, College of Arts and Sciences / Department of Computer Science

    This dissertation addresses the problems of program comprehension to support the evolution of large-scale software systems. The research concerns how software engineers locate features and concepts, and categorize changes, within very large bodies of source code and their versioned histories. More specifically, advanced Information Retrieval (IR) and Natural Language Processing (NLP) techniques are utilized and enhanced to support various software engineering tasks. This research is not aimed at directly improving IR or NLP approaches; rather, it is aimed at understanding how additional information can be leveraged to improve the final results. The work advances the field by investigating approaches to augment and re-document source code with different types of abstract behavioral information. The hypothesis is that enriching the source code corpus with meaningful descriptive information, and integrating this orthogonal information (semantic and structural) extracted from source code, will improve the results of IR methods for indexing and querying. Moreover, adding this new information to a corpus is a form of supervision: a priori knowledge is often used to direct and supervise machine-learning and IR approaches. The main contributions of this dissertation involve improving on the results of previous work in feature location and source code querying. The dissertation demonstrates that the addition of statically derived information from source code (e.g., method stereotypes) can improve the results of IR methods applied to the problem of feature location. Further contributions include showing the effects of excluding certain textual information (comments and function calls) when indexing source code for feature/concept location. 
Moreover, the dissertation demonstrates an IR-based method of natural language topic extraction that assists developers in gaining an overview of past maintenance (open full item for complete abstract)

    Committee: Jonathan Maletic Professor (Advisor) Subjects: Computer Science
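    The corpus-enrichment idea, adding a statically derived method stereotype to the indexed text so that stereotype terms in a query can match, might be sketched as follows. The methods, stereotype labels, and query are invented, and plain term-count cosine similarity stands in for the dissertation's full IR pipeline.

```python
# Rank methods against a feature-location query using term-count cosine
# similarity, with a statically derived stereotype label prepended to each
# method's indexed text. Methods, stereotypes, and the query are invented.
import math
from collections import Counter

methods = {
    "getZoomLevel": ("accessor", "return current zoom level of canvas view"),
    "setZoomLevel": ("mutator", "update zoom level and repaint canvas"),
    "parseSvgFile": ("factory", "read svg file and build document tree"),
}

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(count * b[term] for term, count in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, enrich=True):
    qv = vectorize(query)
    scored = []
    for name, (stereotype, text) in methods.items():
        doc = f"{stereotype} {text}" if enrich else text
        scored.append((cosine(qv, vectorize(doc)), name))
    return [name for _, name in sorted(scored, reverse=True)]

print(rank("mutator that changes zoom")[0])
# → setZoomLevel
```

    With enrichment, the query term "mutator" matches `setZoomLevel`'s stereotype label and pulls it above the merely lexically similar `getZoomLevel`.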
  • 5. Meqdadi, Omar UNDERSTANDING AND IDENTIFYING LARGE-SCALE ADAPTIVE CHANGES FROM VERSION HISTORIES

    PHD, Kent State University, 2013, College of Arts and Sciences / Department of Computer Science

    A systematic study of the adaptive maintenance process is undertaken. The research aims to better understand how developers adapt and migrate systems in response to such things as large API changes. The ultimate goal is to support the construction of automated methods and tools for the adaptive maintenance process. The main case study involves an exhaustive manual investigation of a number of open source systems (e.g., KOffice, Extragear/graphics, and OpenGL) during a time when a large adaptive maintenance task was taking place. In each case the adaptive maintenance task involved a substantial API migration (e.g., Qt3 to Qt4) that took place over multiple years. Additionally, the systems were also undergoing other modifications (perfective and corrective), such as bug fixing and the addition of new features. The main goal of the study was to identify and distinguish the adaptive maintenance changes from the other types of changes. These adaptive maintenance commits are then analyzed to identify common characteristics and trends. The analysis examines the amount of change taking place in each commit, the vocabulary of the commit messages, the authorship of the changes, and the stereotype of modified methods. The data provides a point of reference for the study of these types of changes. This is also the first published in-depth and systematic examination of large adaptive maintenance tasks. The results show that adaptive maintenance tasks involve a relatively small number of large changes. Few developers are involved in this task, and they use a somewhat standard vocabulary in describing the associated commits. This information is then used as a means to automatically identify adaptive changes. An information retrieval technique, namely Latent Semantic Analysis, is used to retrieve relevant adaptive commits when querying the commits available in the version control system. The approach was found to have good accuracy. 
Our results show tha (open full item for complete abstract)

    Committee: Jonathan Maletic Professor (Advisor) Subjects: Computer Science
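    The Latent Semantic Analysis step, retrieving adaptive commits by querying commit messages in a reduced latent space, can be sketched as below. The commit messages and query are invented; a real study would use a far larger corpus, term weighting (e.g., tf-idf), and a tuned number of latent dimensions.

```python
# Query commit messages through a truncated SVD (Latent Semantic Analysis):
# build a term-by-commit count matrix, keep k latent dimensions, fold the
# query into that space, and rank commits by cosine similarity.
import numpy as np

commits = [
    "port widget rendering from qt3 to qt4 api",   # adaptive
    "fix crash when saving document",              # corrective
    "migrate qt3 signal slot syntax to qt4",       # adaptive
    "add new export feature for png images",       # perfective
]
query = "qt4 api migration"

vocab = sorted({w for msg in commits for w in msg.split()})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(commits)))
for j, msg in enumerate(commits):
    for w in msg.split():
        A[index[w], j] += 1.0

k = 2                                   # number of latent dimensions
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# Fold the query into the latent space: q_hat = S_k^-1 U_k^T q
q = np.zeros(len(vocab))
for w in query.split():
    if w in index:                      # "migration" is out-of-vocabulary
        q[index[w]] += 1.0
q_hat = (Uk.T @ q) / sk

docs_hat = Vtk.T                        # one k-dim row per commit
sims = docs_hat @ q_hat / (
    np.linalg.norm(docs_hat, axis=1) * np.linalg.norm(q_hat) + 1e-12
)
ranking = np.argsort(-sims)
print(sorted(int(i) for i in ranking[:2]))
# → [0, 2], the two Qt-migration commits
```

    Note that the corrective and perfective commits share no latent structure with the query, so they fall to the bottom of the ranking even though the query word "migration" never literally occurs in any message.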
  • 6. Kumar, Vijay Specification, Configuration and Execution of Data-intensive Scientific Applications

    Doctor of Philosophy, The Ohio State University, 2010, Computer Science and Engineering

    Recent advances in digital sensor technology and numerical simulations of real-world phenomena are resulting in the acquisition of unprecedented amounts of raw digital data. Terms like "data explosion" and "data tsunami" have come to describe the uncontrolled rate at which scientific datasets are generated by automated sources ranging from digital microscopes and telescopes to in-silico models simulating the complex dynamics of physical and biological processes. Scientists in various domains now have secure, affordable access to petabyte-scale observational data gathered over time, the analysis of which is crucial to scientific discovery. The availability of commodity components has fostered the development of large distributed systems with high-performance computing resources to support the execution requirements of scientific data analysis applications. Increased levels of middleware support over the years have aimed to provide high scalability of application execution on these systems. However, the high-resolution, multi-dimensional nature of scientific datasets and the complexity of analysis requirements present challenges to efficient application execution on such systems. Traditional brute-force analysis techniques to extract useful information from scientific datasets may no longer meet desired performance levels at extreme data scales. This thesis builds on a comprehensive study involving multi-dimensional data analysis applications at large data scales, and identifies a set of advanced factors or parameters for this class of applications that can be customized in domain-specific ways to obtain substantial improvements in performance. A useful property of these applications is their ability to operate at multiple performance levels based on a set of trade-off parameters, while providing different levels of quality-of-service (QoS) specific to the application instance. 
To avail the performance benefits brought about by such facto (open full item for complete abstract)

    Committee: P Sadayappan PhD (Advisor); Joel Saltz MD, PhD (Committee Member); Gagan Agrawal PhD (Committee Member); Umit Catalyurek PhD (Committee Member) Subjects: Computer Science
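    The trade-off idea, an application that can run at several performance levels and a scheduler that picks the best quality meeting a QoS constraint, might look like this in miniature. The resolution levels, cost estimates, and quality scores are invented placeholders, not figures from the thesis.

```python
# Pick the highest-quality analysis level whose estimated cost fits a QoS
# time budget. Levels, costs, and quality scores are invented placeholders.
levels = [
    {"resolution": "full",    "est_minutes": 240, "quality": 1.00},
    {"resolution": "half",    "est_minutes": 70,  "quality": 0.92},
    {"resolution": "quarter", "est_minutes": 20,  "quality": 0.78},
]

def pick_level(budget_minutes):
    feasible = [lv for lv in levels if lv["est_minutes"] <= budget_minutes]
    return max(feasible, key=lambda lv: lv["quality"]) if feasible else None

print(pick_level(90)["resolution"])  # → half
print(pick_level(10))                # → None
```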