Search Results

(Total results 9)

  • 1. Mendes, Pablo Adaptive Semantic Annotation of Entity and Concept Mentions in Text

    Doctor of Philosophy (PhD), Wright State University, 2014, Computer Science and Engineering PhD

    Recent years have seen an increase in interest in knowledge repositories that are useful across applications, in contrast to the creation of ad hoc or application-specific databases. These knowledge repositories figure as a central provider of unambiguous identifiers and semantic relationships between entities. As such, these shared entity descriptions serve as a common vocabulary to exchange and organize information in different formats and for different purposes. Therefore, there has been remarkable interest in systems that are able to automatically tag textual documents with identifiers from shared knowledge repositories so that the content in those documents is described in a vocabulary that is unambiguously understood across applications. Tagging textual documents according to these knowledge bases is a challenging task. It involves recognizing the entities and concepts that have been mentioned in a particular passage and attempting to resolve any ambiguity of language in order to choose one of many possible meanings for a phrase. There has been substantial work on recognizing and disambiguating entities for specialized applications, or constrained to limited entity types and particular types of text. In the context of shared knowledge bases, since each application has potentially very different needs, systems must have unprecedented breadth and flexibility to ensure their usefulness across applications. Documents may exhibit different language and discourse characteristics, discuss very diverse topics, or require a focus on parts of the knowledge repository that are inherently harder to disambiguate. In practice, for developers looking for a system to support their use case, it is often unclear whether an existing solution is applicable, leading those developers to trial and error and ad hoc usage of multiple systems in an attempt to achieve their objective. 
In this dissertation, I propose a conceptual model that unifies related techniques in this (open full item for complete abstract)

    Committee: Amit P. Sheth Ph.D. (Advisor); Krishnaprasad Thirunarayan Ph.D. (Committee Member); Shajoun Wang Ph.D. (Committee Member); Sören Auer Ph.D. (Committee Member) Subjects: Computer Science
  • 2. Henning, Kyle THE IMPACT OF METACOGNITIVE REPRESENTATIONS AND FEEDBACK ON CHILDREN'S DISAMBIGUATION PREDICTION

    PHD, Kent State University, 2022, College of Arts and Sciences / Department of Psychological Sciences

    Even one-year-olds show the so-called disambiguation effect, which is a tendency to select a novel object rather than a familiar object as the referent of a novel label. The strength of this effect increases over the preschool years. This age trend may be due in part to advances in metacognition. The accuracy of preschoolers' lexical knowledge judgment mediates the association between age and strength of the disambiguation effect. Also, their judgments about object label knowledge accounted for why most 4-year-olds, but only a few 3-year-olds could predict the solution to a new disambiguation problem before hearing the novel label (Henning & Merriman, 2019). Study 1 tested whether more preschoolers could make this kind of prediction if they were told that the labels were ones “you have never heard before.” Results supported this hypothesis, but only for the younger children. Also, children's tendency to make these predictions was positively associated with their ability to give accurate reports of whether various words or pseudowords had known meanings. Study 2, which used an online rather than face-to-face testing procedure, demonstrated that 3-year-olds only learned how to solve the original prediction problem if they received direct rather than indirect feedback. When they receive helpful cues, most 3-year-olds can solve a disambiguation problem before hearing the novel label. Thus, most 3-year-olds can form metacognitive representations of the elements of the disambiguation problem and use these to draw inferences about the reference of a label.

    Committee: William Merriman (Advisor); Maria Zaragoza (Committee Member); Bradley Morris (Committee Member); Jeffrey Ciesla (Committee Member) Subjects: Cognitive Psychology; Developmental Psychology; Psychology
  • 3. Rasmussen, Nathan Broad-domain Quantifier Scoping with RoBERTa

    Doctor of Philosophy, The Ohio State University, 2022, Linguistics

    This thesis reports development of a new, broad-domain quantifier scope corpus including all of the factors, for use in training and testing the system. Training materials, a work process, and the annotator-facing data format were each designed to reduce barriers to entry and safeguard accuracy, with revisions resulting from an inter-annotator agreement study and error analysis. The thesis discusses appropriate measures of agreement for scope annotations, both between human annotators and between predicted and gold labels. For appropriate calculation of chance-corrected agreement between human annotators, an inter-annotation distance metric is introduced and justified. For evaluation of automated predictions, where human-like constraints on the structure of a set of predictions are not enforced, results are evaluated both for small-scale accuracy and for compliance with these holistic constraints. The scoping data of the corpus are developed into a natural language understanding task suitable for automatic prediction, framing it as a span pair classification problem, with outscoping treated as a semantic dependency between words. This thesis reports the application of the RoBERTa language model to this task. The model encodes properties of lexis, syntax, and semantics that correlate with human scoping judgements (“scoping factors”). Previously published scope-annotated corpora and scope prediction systems either do not cover all of the scoping factors, do not apply them to the full set of quantifiers, or do not represent the full range of subject-matter domains in which humans routinely predict quantifier scope. Predictions from the RoBERTa system are shown to be more accurate than the majority-prediction baseline, to a degree not due to chance. The system successfully complies with the holistic constraints. 
The system's principal shortcomings are its relatively small improvement over the baseline, its dependence on some other system to screen p (open full item for complete abstract)

    Committee: William Schuler (Advisor); Micha Elsner (Committee Member); Michael White (Committee Member) Subjects: Linguistics
  • 4. Tallo, Philip Using Sentence Embeddings for Word Sense Induction

    MS, University of Cincinnati, 2020, Engineering and Applied Science: Computer Science

    One of the primary goals of the field of Natural Language Processing is to create very high-quality text embeddings which can be used in many domains. The main area in which text embedding methods typically fall short is in handling polysemy detection. A word is polysemous when it has multiple meanings (e.g. the word bank when used in a financial context versus an ecological context). Current text embedding methods fail to handle this at all, training just one embedding for all meanings of a word. Discovering methods for handling polysemy detection is an active area of research. This thesis presents a Word Sense Induction (WSI) system which is based on the hypothesis that by clustering sentence embeddings it is possible to achieve a clustering over sense embeddings as well. Subsequently, this thesis uses the SemEval 2010 benchmark to test the sentence-based WSI (S-WSI) methodology and compare it with state-of-the-art methods in the field. This benchmark is based on four key metrics: homogeneity, completeness, precision, and recall. The key advantage of the approach proposed in this thesis compared to other methods is adaptability. This S-WSI methodology can use any sentence embedding model or clustering method, making it highly adaptable to the user's domain-specific needs. This method is highly dependent on the sentence embedding model being used, with some models achieving near-SOTA performance and others performing only slightly better than random.

    Committee: Ali Minai Ph.D. (Committee Chair); Raj Bhatnagar Ph.D. (Committee Member); Anca Ralescu Ph.D. (Committee Member) Subjects: Computer Science
  • 5. Slocum, Jeremy The Role of Metacognition in Children's Disambiguation of Novel Name Reference

    PHD, Kent State University, 2019, College of Arts and Sciences / Department of Psychological Sciences

    When shown a familiar and a novel object and asked to pick the referent of a novel label, even one-year-olds tend to favor the novel object (Halberda, 2003; Mervis & Bertrand, 1994). However, this so-called disambiguation effect becomes stronger as children develop through preschool age (Lewis & Frank, 2015). Advances in metacognition may play a role in this developmental trend. Preschoolers' awareness of their own lexical knowledge is associated with the strength of the disambiguation effect (Merriman & Schuster, 1991; Merriman & Bowman, 1989; Wall, Merriman, & Scofield, 2015). It is also associated with whether children can solve purely metacognitive forms of the disambiguation problem (Slocum & Merriman, 2018; Henning & Merriman, 2019). The current experiments tested the hypothesis that as the number of choices in a disambiguation problem increases, the frequency of correct responses declines more sharply for children who lack awareness of lexical knowledge than for children who possess it. The results of the first two experiments supported the main hypothesis. Two experiments also showed that awareness of lexical knowledge was associated with a more gradual increase in latency of correct solutions as the number of choices increased. In Experiment 3, children's eye movements were recorded as they attempted to solve 3-, 4-, 5-, and 6-choice problems. Various aspects of children's eye movements were analyzed, including the number of familiar object foils checked, the number of revisits to the target, and the proportion of looking time spent on the target object. The current experiments advance our insight into why the “awareness-of-knowledge advantage” in solving disambiguation problems tends to increase as the number of choices increases.

    Committee: William Merriman PhD (Advisor); Clarissa Thompson PhD (Committee Member); Jeff Ciesla PhD (Committee Member); Bradley Morris PhD (Committee Member); Sarah Rilling PhD (Committee Member) Subjects: Cognitive Psychology; Developmental Psychology; Psychology
  • 6. Wijeratne, Sanjaya A Framework to Understand Emoji Meaning: Similarity and Sense Disambiguation of Emoji using EmojiNet

    Doctor of Philosophy (PhD), Wright State University, 2018, Computer Science and Engineering PhD

    Pictographs, commonly referred to as “emoji”, have become a popular way to enhance electronic communications. They are an important component of the language used in social media. With their introduction in the late 1990s, emoji have been widely used to enhance the sentiment, emotion, and sarcasm expressed in social media messages. They are equally popular across many social media sites including Facebook, Instagram, and Twitter. In 2015, Instagram reported that nearly half of the photo comments posted on Instagram contain emoji, and in the same year, Twitter reported that the “face with tears of joy” emoji has been tweeted 6.6 billion times. As of 2017, Facebook and Facebook Messenger processed over 60 million and 6 billion messages with emoji per day, respectively. Emogi, an Internet marketing firm, reports that over 92% of all online users have used emoji at least once. Creators of the SwiftKey Keyboard for mobile devices report that they process 6 billion messages per day that contain emoji. Moreover, business organizations have adopted and now accept the use of emoji in professional communication. For example, Appboy, an Internet marketing company, reports that there has been a 777% year-over-year increase and 20% month-over-month increase in emoji usage for marketing campaigns by business organizations in 2016. These statistics leave little doubt that emoji are a significant and important aspect of electronic communication across the world. The ability to automatically process and interpret text fused with emoji will be essential as society embraces emoji as a standard form of online communication. In the same way that natural language is processed with sophisticated machine learning techniques and technologies for many important applications, including text similarity and word sense disambiguation, emoji should also be amenable to such analysis. 
Yet the pictorial nature of emoji, the fact that the same emoji may be used in different contexts to express di (open full item for complete abstract)

    Committee: Amit Sheth Ph.D. (Advisor); Derek Doran Ph.D. (Committee Member); Krishnaprasad Thirunarayan Ph.D. (Committee Member); Wenbo Wang Ph.D. (Committee Member) Subjects: Artificial Intelligence; Computer Engineering; Computer Science; Sociolinguistics
  • 7. Henning, Kyle The Development of a Metacognitive Disambiguation Effect: Novel Name Presentation Not Required

    MA, Kent State University, 2018, College of Arts and Sciences / Department of Psychological Sciences

    Children tend to select a novel object rather than a familiar object when asked to identify the referent of a novel label. Current accounts of this so-called disambiguation effect do not address whether children have an abstract metacognitive representation of the effect. Do they represent their selection for each novel label as being based on the novelty contrast between the objects? In two experiments (each N = 48), 3- and 4-year-olds were told they were playing a game. In each round, they completed a disambiguation trial for a different novel label. After four rounds, they received additional rounds in which after being shown the familiar and unfamiliar object, but before being told the novel label, they were asked which object “was going to be right.” If children represented their responses in the game as based on a novelty contrast, they would predict that the unfamiliar object would be the correct response. Most 4-year-olds made this prediction, whereas most 3-year-olds did not. Performance was associated with the accuracy of children's reports of their object name knowledge. Development of a representation of the disambiguation effect as a novelty contrast may depend on development of a tendency to represent familiar objects as “ones I know” and unfamiliar objects as “ones I don't know.”

    Committee: William Merriman (Advisor) Subjects: Cognitive Psychology; Developmental Psychology; Psychology
  • 8. Jianguo, Li Hybrid Methods for Acquisition of Lexical Information: the Case for Verbs

    Doctor of Philosophy, The Ohio State University, 2008, Linguistics

    Improved automatic text understanding requires detailed linguistic information about the words that comprise the text. Particularly crucial is the knowledge about predicates, typically verbs, which communicate both the event being expressed and how participants are related to the event. Although the field of natural language processing (NLP) has yet to develop a clear consensus on guidelines for building a verb lexicon suitable for applications in NLP, class-based construction of verb lexicons (e.g. Levin verb classification) has proved beneficial to a wide range of NLP tasks in combating the pervasive problem of data sparsity. Such broad-coverage dictionaries and ontologies are difficult and costly to create and maintain by hand; it is therefore desirable to learn them from distributional data, such as can be obtained from unlabeled text corpora. To this end, this thesis will primarily address the following three questions: First, deriving Levin-style verb classifications from text corpora helps avoid the expensive hand-coding of such information, but appropriate features must be identified and demonstrated to be effective. One of our primary goals is to assess the linguistic conditions which are crucial for lexical classification of verbs. In particular, we experiment with different ways of mixing syntactic and lexical information for improved verb classification. The results show that both syntactic and lexical information are useful in automatic verb classification. Second, Levin verb classification provides a systematic account of verb polysemy. We propose a class-based method for disambiguating Levin verbs using only untagged data. The basic working hypothesis is that verbs in the same Levin class tend to share their subcategorization patterns as well as neighboring words. In practice, information about unambiguous verbs is used to disambiguate ambiguous ones. The results suggest that this class-based method can be used in the absence of hand-tagged data. 
Las (open full item for complete abstract)

    Committee: Chris Brew (Advisor); Eric Fosler-Lussier (Committee Member); Mike White (Committee Member) Subjects: Linguistics
  • 9. Konduri, Aparna Clustering of Web Services Based on Semantic Similarity

    Master of Science, University of Akron, 2008, Computer Science

    Web Services are proving to be a convenient way to integrate distributed software applications. As service-oriented architecture becomes more popular, vast numbers of web services have been developed all over the world. But it is a challenging task to find relevant or similar web services using a web services registry such as UDDI. Current UDDI search uses keywords from web service and company information in its registry to retrieve web services. This information cannot fully capture users' needs and may miss potential matches. The underlying functionality and semantics of web services need to be considered. In this study, we explore the semantics of web services using WSDL operation names and parameter names along with WordNet. We compute the semantic similarity of web services and use this data to generate clusters. Then, we use a novel approach to represent the clusters and utilize that information to further predict the similarity of any new web services. This approach has yielded good results and can be efficiently used by any web service search engine to retrieve similar or related web services.

    Committee: Chien-Chung Chan (Advisor) Subjects: Computer Science