Search Results

(Total results 19)

  • 1. Singh, Aniket Sentiment Analysis & Time Series Analysis on Stock Market

    Master of Computing and Information Systems, Youngstown State University, 2023, Department of Computer Science and Information Systems

    Investors are always looking for ways to make profit in the stock market. Predicting this highly volatile market has been historically challenging. This study explores the use of the social media platform, Twitter, and Machine Learning Algorithm for Time Series Analysis. Our findings suggested that Twitter's data may not be the best for Sentiment Analysis, while other machine learning techniques for Time Series Analysis such as LSTM would be effective. This could potentially help an investor with higher returns.

    Committee: John R. Sullins PhD (Advisor); Feng George Yu PhD (Committee Member); Alina Lazar PhD (Committee Member) Subjects: Artificial Intelligence; Computer Science
  • 2. Joa, Youngnyo A Hyperlink and Sentiment Analysis of the 2016 Presidential Election: Intermedia Issue Agenda and Attribute Agenda Setting in Online Contexts

    Doctor of Philosophy (Ph.D.), Bowling Green State University, 2017, Media and Communication

    This study investigated the intermedia agenda-setting dynamics among various media Twitter accounts during the last seven weeks before the 2016 U.S. presidential election. Media Twitter accounts included in the analysis were those of print media, television networks, news magazines, online partisan media, online non-partisan media, and political commentators. This study applied the intermedia agenda-setting theory as the theoretical framework, and network analysis and computer-assisted content analysis enabling hyperlink and sentiment analysis as the methods. A total of 5,595,373 relationships built via Tweets among media Twitter accounts were collected. After removal of irrelevant data, a total of 16,794 relationships were used for analysis. The results showed that traditional media Twitter accounts, such as print media and television networks, play roles in the Tweeting network by bridging isolated media Twitter accounts, and are located in the center of networks, so that information reaches them quickly; further, they are connected to other important accounts. Together with the changes in the volume of Tweeting that signaled media interest, the set of popular URLs and keywords/word pairs in Tweets also served as sensors that detected media Twitter accounts' interest about that time. The results also supported the previous research findings that, as political events, the debates affect the production and dissemination patterns of news. Not only did the volume of Tweeting spike immediately after each debate, but various types of hyperlinks and sentiment words used in Tweets increased as well. The number of negative sentiment words observed in the Tweeting network surpassed the number of positive sentiment words across different time points, and the gap between them decreased as the election approached. The use of positive and negative sentiment words differed across different media Twitter account categories.
Online non- (open full item for complete abstract)

    Committee: Gi Woong Yun PhD (Committee Co-Chair); Kate Magsamen-Conrad PhD (Committee Co-Chair); Sung-Yeon Park PhD (Committee Member); Bill Albertini PhD (Committee Member) Subjects: Journalism; Mass Communications
  • 3. Yeboah, Jones A Hybrid Approach for using Natural Language Processing Techniques to Assess User Feedback on Static Analysis Tools

    PhD, University of Cincinnati, 2024, Education, Criminal Justice, and Human Services: Information Technology

    In the field of software development, Static Analysis Tools (SAT) have become increasingly popular for identifying and fixing potential issues in code. These tools help improve code quality and reduce the occurrence of bugs and vulnerabilities. However, evaluating the effectiveness of SAT can be challenging due to the subjective nature of user reviews. To address this challenge, we selected four popular SATs as case studies: SonarQube, FindBugs, Checkstyle, and PMD. The first part of this research involves conducting an empirical study for evaluating the performance of SATs. We compared the performance of the four SATs in detecting software defects in diverse open-source Java projects. The study results show that SonarQube performs considerably better than all other tools in defect detection. The second part of this research focuses on the user perspective by evaluating the performance of the SATs through sentiment analysis of user reviews. The study found that user sentiment is a valuable indicator of a tool's effectiveness and reliability. Positive user feedback typically corresponds to higher performance ratings, reflecting greater user satisfaction and tool efficiency. Conversely, negative sentiments often point to performance issues and user dissatisfaction. Thus, incorporating sentiment analysis can provide meaningful insights into the perceived quality and performance of SAT. In the third study, we applied topic modeling techniques to user reviews of SATs. Our analysis highlights the key aspects that users find beneficial and areas where improvements are needed. The findings provide valuable insights into user concerns and preferences, informing the development of more user-friendly and effective SAT. In our fourth study, we propose a theoretical framework that integrates sentiment analysis, composite sentiment score, topic modeling, and emotion detection to extract meaningful insights from user feedback.
By quantifying polarity and subjectiv (open full item for complete abstract)

    Committee: Saheed Popoola Ph.D. (Committee Chair); Yanran Liu Ph.D. (Committee Member); Isaac Kofi Nti Ph.D. (Committee Member); M. Murat Ozer Ph.D. (Committee Member) Subjects: Information Technology
  • 4. Troyer, Katelyn Unlocking the Dynamics: Exploring the Role of Personality Traits in Shaping Organizational Culture and Performance

    Bachelor of Arts, Wittenberg University, 2024, Business

    Personality psychology is well established in business and organization theory and praxis, so it would seem to require no introduction. Whereas personality psychology is established in these areas, there is benefit from interrogating how it has been situated. A textual analysis of the introduction section of the top 50 most-relevant, academic papers identified using the keywords personality and business provided a basis for generating insights and implications as to how personality psychology has been “introduced” within business contexts, as examined academically. This mixed-methods, textual analysis produced words of merit, bigrams of merit, and AFINN-based sentiment scores. Among the results, the words stress and conflict, and the bi-grams of unintended consequences and whistle blowing stood out as topics of focus. The results suggest no statistically significant difference in sentiment between articles frequently referencing the business-related terms and those frequently referencing students, with each group having a slightly positive, average sentiment. Combining the bigrams and words of merit with the sentiment analysis facilitated the identification of common themes that informed understanding and suggested action. These insights and implications suggest a need for further collaboration in this area with a focus on enacting positive change to organizational practices.

    Committee: Ross Jackson (Advisor); Layla Besson (Committee Member); Rachel Wilson (Committee Member) Subjects: Business Administration; Organizational Behavior; Personality; Personality Psychology; Psychology
  • 5. Ayyalasomayajula, Meghana Image Emotion Analysis: Facial Expressions vs. Perceived Expressions

    Master of Computer Science (M.C.S.), University of Dayton, 2022, Computer Science

    A picture is worth a thousand words. A single image has the power to influence individuals and change their behaviour, whereas a single word does not. Even a barely visible image, displayed on a screen for only a few milliseconds, appears to be capable of changing one's behaviour. In this thesis, we experimentally investigated the relationship between facial expressions and perceived emotions. To this end, we built two datasets, namely, the image dataset for image emotion analysis and the face dataset for expression recognition. During the annotation of the image dataset, both facial expressions and perceived emotions are recorded via a mobile application. We then use a classifier trained on the face dataset to recognize the user's expression and compare it with the perceived emotion.

    Committee: Tam Nguyen (Advisor) Subjects: Computer Science
  • 6. Desai, Urvashi Student Interaction Network Analysis on Canvas LMS

    Master of Computer Science, Miami University, 2020, Computer Science and Software Engineering

    Network analysis techniques help investigate the significance of nodes/actors that play central roles where the nodes represent people, and the links represent the communication between them. This thesis analyzes how collaboration helps students' learning process and proposes a tool that could be integrated with Canvas to analyze student discussion data. To begin, we analyzed data collected from online student discussions on Canvas, in a Level-1 Programming course. These discussion topics were classified into classroom experiences/learning, question/answers, opinions, and comments. Modeling the patterns of discussion board interactions as networks and applying various node-based network measures helped to unravel the similarities of student interaction patterns, and gain insights into their progress in the course. The experimental analyses include finding the most challenging/debated topics in the course, analyzing the leadership and team-based qualities, and analyzing trends in student participation. The results of the study reveal that participation in online discussion forums has a positive impact on the students' grades. In summary, the inferences drawn from this research can help instructors understand the student learning behaviors/patterns and guide the development of better pedagogical approaches that help students overcome the common misconceptions that they confront in the course concepts.

    Committee: Vijayalakshmi Ramasamy (Advisor); James Kiper (Committee Member); Hakam Alomari (Committee Member) Subjects: Computer Science; Education
  • 7. Smith, Michael IDENTIFYING TOXIC EVENTS IN TIME

    MS, Kent State University, 2024, College of Arts and Sciences / Department of Computer Science

    Online communities have long suffered from issues caused by a lack of accountability for participants exhibiting toxic behaviors. Difficulty with providing effective moderation, sufficiently dissuading would-be offenders, identifying problem users, and mitigating toxic activity in real-time has led to an unwelcoming environment for users. It's difficult to effectively police communication networks to provide safe environments when participants are both anonymous and cannot be sufficiently identified as problematic. Our study employs temporal multivariate data mining and pattern analysis, and natural language processing techniques to examine organic conversations across a large collection of online gaming communities' messages. By analyzing instances of toxic behavior, arguments, and profane conversation, our objective is to identify the distinct features that characterize toxicity in digital environments. Our study analyzed conversational data extracted from four video-game-focused Discord communities. The dataset encompasses a rich collection of 685,432 public messages. Using the Perspective API, messages were classified against six metrics relating to toxicity. To elucidate the temporal dynamics and complex patterns of these interactions, we employed Temporal Multidimensional Scaling and utilized a Shannon Entropy Visualization method. Additionally, manual review was performed on a subset of 140,000 comments' worth of toxic events. We then leveraged BERTopic for cluster analysis to deduce related thematic concerns. For a nuanced representation of these themes, we customized the topic modeling using OpenAI's GPT-3.5 Turbo language model, enriching our understanding of the contextual underpinnings of toxicity in online gaming discourse. Our study found that toxic events occurred without warning and rapidly dissipated as the conversation went on.
Toxicity is extremely rare relative to the general activity of the community and is largely contributed by eith (open full item for complete abstract)

    Committee: Ruoming Jin (Advisor) Subjects: Artificial Intelligence; Computer Science
  • 8. Burwell, Emily The impact of sentiment and misinformation cycling through the social media platform, Twitter, during the initial phase of the COVID-19 vaccine rollout

    Master of Science (MS), Wright State University, 2022, Biological Sciences

    This study assesses the underlying topics, sentiment, and types of information regarding COVID-19 vaccines on Twitter during the initiation of the vaccine rollout. Tweets about the COVID-19 vaccine were collected, and the relevant tweets were then selected using a relevancy classifier. Latent Dirichlet Allocation (LDA) was used to uncover topics of discussion within the relevant tweets. The NRC lexicon was used to assess positive and negative sentiment within tweets. The type of information (information, misinformation, opinion, or question) in tweets was evaluated. The relevancy classifier resulted in a dataset of 210,657 relevant tweets. Eight topics provided the best representation of the relevant tweets. Tweets with negative sentiment were associated with a higher percentage of misinformation. Tweets with positive sentiment showed a higher percentage of information. The proliferation of information and misinformation on social media platforms is associated with building trust and mitigating negative sentiment associated with COVID-19 vaccines.

    Committee: William Romine Ph.D. (Advisor); Jeffrey Peters Ph.D. (Committee Member); Paula Bubulya Ph.D. (Committee Member) Subjects: Biology; Epidemiology; Health Education; Health Sciences; Public Health; Public Health Education
  • 9. Yalamanchi, Neha A Longitudinal Study of Mental Health Patterns from Social Media

    MS, Kent State University, 2021, College of Arts and Sciences / Department of Computer Science

    Mental health concerns are a prominent malady afflicting individuals across the world, and they largely go undiagnosed and untreated because of the stigma that surrounds them to this day. As a result of the growth in social media, people are highly inclined to post their feelings and troubles on social media forums like Reddit, which is a popular topic-based forum that promotes anonymity among its users. It was predicted that Covid19 would negatively impact mental health as it left unemployment and uncertainty in its wake. This thesis aims to investigate the trends of mental health concerns through the use of Natural Language Processing and Machine Learning Algorithms, employing Unsupervised Topic Modeling and Clustering on the data extracted from key mental health related subreddits. The data is categorized into three classes, namely: pre-pandemic, mid-pandemic and post-pandemic. The results reveal an alarming rate of increase in distress on numerous subreddits, with heightened mention of anxiety and sexual abuse as a result of the unprecedented times caused by the spread of the coronavirus.

    Committee: Ruoming Jin (Advisor); Gokarna Sharma (Committee Member); Xiang Lian (Committee Member); Deric Kenne (Committee Member) Subjects: Computer Science
  • 10. Sahasrabudhe, Aditya NBA 2020 Finals: Big Data Analysis of Fans' Sentiments on Twitter

    Master of Science (MS), Ohio University, 2021, Journalism (Communication)

    The NBA 2020 playoffs were unprecedented in many ways, courtesy of the COVID-19 pandemic. Participating athletes temporarily lived in the bubble, away from their families, and the games were played without an in-person audience. Since the players were in the bubble in Orlando, Florida, fans -- except for the teams' staff and support -- watched the NBA finals virtually, mostly from the comfort of their homes. Twitter, a social networking site (SNS), was widely used as a source of NBA news and information and more importantly as a communication tool by NBA fans. This research examined fans' tweets from Sep. 30, 2020, to Oct. 12, 2020, using verified Twitter handles of the NBA 2020 finalists - @Lakers and @MiamiHEAT. Sentiment analysis of fans' tweets provided insights into fans' indirect impression management tendencies. Theoretical frameworks used in the study were (a) basking in reflected glory (BIRG), (b) basking in reflected failure (BIRF), (c) blasting, (d) social identity theory and fan identification, and (e) disposition theory to evaluate fans' tweets with varied levels of sentiments. The findings from the sentiment analyses showed a wide range of sentiments expressed in fans' tweets based on factors such as game importance, team identification, result and a combination of result and game importance. This study offered a deeper understanding of fan behavior via Twitter conversations and instances when fans portray one sentiment over another in their tweets.

    Committee: Hans Meyer (Committee Chair); Christina Beck (Committee Member); Roger Aden (Committee Member) Subjects: Behavioral Sciences; Communication
  • 11. Alsehaimi, Afnan Sentiment Analysis for E-book Reviews on Amazon to Determine E-book Impact Rank

    Master of Computer Science (M.C.S.), University of Dayton, 2021, Computer Science

    User-generated content platforms have changed the dynamics of the business environment and redefined how organizations and governments communicate with the public. Further, such platforms act as the primary means to measure customer satisfaction. Thus, those organizations need to analyze the content generated by their customers to extract their opinions and then decide based on trustworthy information. Also, knowing user behavior and perception for a specific product is useful to customers in the decision-making process. In this thesis, a comparative study has been conducted to develop a model to measure customer satisfaction on Amazon e-book products by applying natural language processing (NLP), machine learning, deep learning, and text mining techniques on customer reviews. This thesis will study the possibility of generating a rating based on sentiment analysis of each product instead of the star-based rating, which is already applied in the Amazon e-book rating system.

    Committee: James Buckley Ph.D. (Committee Chair); Saeedeh Shekarpour Ph.D. (Committee Member); Tam Nguyen Ph.D. (Committee Member) Subjects: Computer Science
  • 12. Aring, Danielle Integrated Real-Time Social Media Sentiment Analysis Service Using a Big Data Analytic Ecosystem

    Master of Computer and Information Science, Cleveland State University, 2017, Washkewicz College of Engineering

    Big data analytics are at the center of modern science and business. Our social media networks, mobile devices and enterprise systems generate enormous volumes of data on a daily basis. This wide range of availability provides many organizations in every field opportunities to discover valuable intelligence for critical decision-making. However, traditional analytic architectures are insufficient to handle the unprecedented volume of data and the complexity of data processing. This thesis presents an analytic framework that copes with the unprecedented scale of big data and performs data stream sentiment analysis effectively in real time. The work presents a Social Media Big Data Sentiment Analytics Service System (SMBDSASS). The architecture leverages the Apache Spark stream data processing framework, coupled with a NoSQL Hive big data ecosystem. Two sentiment analysis models were developed. The first is a topic-based model: given a user-provided topic or person of interest, sentiment (opinion) analysis is performed on related topic sentences in a tweet stream. The second is an aspect (feature) based model: given a user-provided product of interest and related product features, aspect (feature) analysis is performed on reviews containing important feature terms. The experimental results of the proposed framework using real time tweet stream and product reviews show comparable improvements from the results of the existing literature, with 73% accuracy for the topic-based sentiment model, and 74% accuracy for the aspect (feature) based sentiment model. The work demonstrated that our topic and aspect based sentiment analysis models on the real time stream data processing framework using Apache Spark and machine learning classifiers coupled with a NoSQL big data ecosystem offer an efficient, scalable, real-time stream data-processing alternative for the complex multiphase sentiment analysis over common batch data mining frameworks.

    Committee: Sun Sunnie Chung Ph.D. (Committee Chair); Yongjigan Fu Ph.D. (Committee Member); Ifthkar Sikder Ph.D. (Committee Member) Subjects: Computer Science
  • 13. Chen, Lu Mining and Analyzing Subjective Experiences in User Generated Content

    Doctor of Philosophy (PhD), Wright State University, 2016, Computer Science and Engineering PhD

    Web 2.0 and social media enable people to create, share and discover information instantly anywhere, anytime. A great amount of this information is subjective information -- the information about people's subjective experiences, ranging from feelings of what is happening in our daily lives to opinions on a wide variety of topics. Subjective information is useful to individuals, businesses, and government agencies to support decision making in areas such as product purchase, marketing strategy, and policy making. However, much useful subjective information is buried in ever-growing user generated data on social media platforms, and it is still difficult to extract high quality subjective information and make full use of it with current technologies. Current subjectivity and sentiment analysis research has largely focused on classifying the text polarity -- whether the expressed opinion regarding a specific topic in a given text is positive, negative, or neutral. This narrow definition does not take into account the other types of subjective information such as emotion, intent, and preference, which may prevent their exploitation from reaching their full potential. This dissertation extends the definition and introduces a unified framework for mining and analyzing diverse types of subjective information. We have identified four components of a subjective experience: an individual who holds it, a target that elicits it (e.g., a movie, or an event), a set of expressions that describe it (e.g., "excellent", "exciting"), and a classification or assessment that characterizes it (e.g., positive vs. negative). Accordingly, this dissertation makes contributions in developing novel and general techniques for the tasks of identifying and extracting these components. We first explore the task of extracting sentiment expressions from social media posts.
We propose an optimization-based approach that extracts a diverse set of sentiment-bearing expressions, including formal and sl (open full item for complete abstract)

    Committee: Amit Sheth Ph.D. (Advisor); Krishnaprasad Thirunarayan Ph.D. (Committee Member); Keke Chen Ph.D. (Committee Member); Ingmar Weber Ph.D. (Committee Member); Justin Martineau Ph.D. (Committee Member) Subjects: Computer Science; Information Science; Information Technology
  • 14. Sinha, Vinayak Sentiment Analysis On Java Source Code In Large Software Repositories

    Master of Computing and Information Systems, Youngstown State University, 2016, Department of Computer Science and Information Systems

    While developers are writing code to accomplish the task assigned to them, their sentiments play a vital role and have a massive impact on quality and productivity. Sentiments can have either a positive or a negative impact on the tasks being performed by developers. This thesis presents an analysis of developer commit logs for GitHub projects. In particular, developer sentiment in commits is analyzed across 28,466 projects within a seven-year time frame. We use the Boa infrastructure's online query system to generate commit logs as well as files that were changed during the commit. Two existing sentiment analysis frameworks (SentiStrength and NLTK) are used for sentiment extraction. We analyze the commits in three categories: large, medium, and small based on the number of commits using sentiment analysis tools. In addition, we also group the data based on the day of week the commit was made and map the sentiment to the file change history to determine if there was any correlation. Although a majority of the sentiment was neutral, the negative sentiment was about 10% more than the positive sentiment overall. Tuesdays seem to have the most negative sentiment overall. In addition, we do find a strong correlation between the number of files changed and the sentiment expressed by the commits the files were part of. It was also observed that SentiStrength and NLTK show consistent results and similar trends. Future work and implications of these results are discussed.

    Committee: Bonita Sharif PhD (Advisor); Alina Lazar PhD (Committee Member); John Sullins PhD (Committee Member) Subjects: Computer Science; Information Technology; Organizational Behavior
  • 15. Ruan, Yiye Joint Dynamic Online Social Network Analytics Using Network, Content and User Characteristics

    Doctor of Philosophy, The Ohio State University, 2015, Computer Science and Engineering

    Online social networks (OSNs) allow Internet users all over the globe to share information, exchange thoughts, and work collaboratively. Not only do OSNs provide a channel of broadcasting real-world events as they unfold, they also enable a convenient way for users to exchange experience and opinions. Understanding the relation among network topology, users, content, and their dynamics can have a significant impact both from a theoretical standpoint as well as from a practical one, for instance, to understand online user behaviors and predict future online activities. In this dissertation, I study the interplay of three important factors that encode most of the OSN dynamics: network structure, user-generated content, and user characteristics. We first present our broader contribution to computer science: the development of two novel graph algorithms for community detection and structural role detection, which are scalable to handle networks containing millions of nodes and edges. Both community and role assignments of nodes generate novel clusterings of OSN users and provide valuable insights into OSN activities, but they are often implicit or even unknown to OSN analysts. We bridge this chasm by designing algorithms that can automatically infer community and role information in large-scale OSN data. Our algorithms are (1) robust in the presence of noise in real-world data, and (2) efficient in processing large network datasets. A key element to both of these contributions is a practical approach for network sparsification which enables efficient processing. Evaluated on various social networks containing hundreds of millions of edges, our algorithms outperform state-of-the-art approaches in terms of the ability of recovering ground truth communities and roles of OSN users. By augmenting the network structure with content information and performing joint inference, our algorithms are able to combat the impact of noise. 
At the same time, careful design and optim (open full item for complete abstract)

    Committee: Srinivasan Parthasarathy (Advisor); P Sadayappan (Committee Member); Arnab Nandi (Committee Member); Robert Garrett (Committee Member) Subjects: Computer Science
  • 16. Kucuktunc, Onur Result Diversification on Spatial, Multidimensional, Opinion, and Bibliographic Data

    Doctor of Philosophy, The Ohio State University, 2013, Computer Science and Engineering

    Similarity search methods in the literature produce results based on the ranked degree of similarity to the query. However, the results are typically unsatisfactory, especially if there is an ambiguity in the query, or the search space includes redundantly repeating similar documents. Diversity in query results is preferred by a variety of applications since diverse results may give a complete view of the queried topic. In this study, we investigate the result diversification task in various application areas, such as opinion retrieval and paper recommendation, with different types of data, such as spatial, high-dimensional data, opinions, citation graph, and other networks. Although the definitions of diversity will differ from field to field, we propose techniques considering the general objective of result diversification, which is to maximize the similarity of search results to the query while minimizing the pairwise similarity between the results, without neglecting the efficiency. For the diversity on spatial and high-dimensional data, we make an analogy with the concept of natural neighbors and propose geometric methods. We also introduce a diverse browsing method based on the popular distance browsing feature of R-tree index structures. Next, we focus on search and retrieval of opinion data on certain entities, and start our analysis by looking at direct correlations between sentiments of opinions and the demographics (e.g., gender, age, education level, etc.) of people that generate those opinions. Based on the analysis, we argue that opinion diversity can be achieved by diversifying the sources of opinions. Recommendation tasks on academic networks also suffer from the mentioned ambiguity and redundancy issues. To observe those effects, we present a paper recommendation framework called theadvisor (http://theadvisor.osu.edu) which recommends new papers to researchers using only the reference-citation relationships between academic papers.
We introduce (open full item for complete abstract)

    Committee: Umit V. Catalyurek (Advisor); Srinivasan Parthasarathy (Committee Member); Arnab Nandi (Committee Member) Subjects: Computer Science
  • 17. Nepal, Srijan Linguistic Approach to Information Extraction and Sentiment Analysis on Twitter

    MS, University of Cincinnati, 2012, Engineering and Applied Science: Computer Science

    Social media sites are among the most popular destinations in today's online world. With millions of users visiting social networking sites like Facebook, YouTube, and Twitter every day to share content, from simple textual information about what they are doing at any moment, to opinions regarding products, people, events, and movies, to videos and music, these sites have become massive sources of user-generated content. In this work we focus on one such social networking site, Twitter, for the task of information extraction and sentiment analysis. This work presents a linguistic framework that first performs syntactic normalization of tweets on top of traditional data cleaning, extracts assertions from each tweet in the form of binary relations, and creates a contextualized knowledge base (KB). We then present a Language Model (LM) based classifier, trained on a small manually tagged corpus, to perform sentence-level sentiment analysis on the collected assertions to eventually create a KB that is backed by sentiment values. We use this approach to implement a contextualized sentiment-based yes/no question answering system.

    Committee: Kenneth Berman PhD (Committee Chair); Fred Annexstein PhD (Committee Member); Anca Ralescu PhD (Committee Member) Subjects: Computer Science
  • 18. Khuc, Vinh Approaches to Automatically Constructing Polarity Lexicons for Sentiment Analysis on Social Networks

    Master of Science, The Ohio State University, 2012, Computer Science and Engineering

    Sentiment analysis is a task of mining subjective information expressed in text, and has received a lot of focus from the research community in Natural Language Processing in recent years. With the rapid growth of social networks, sentiment analysis is becoming much more attractive to Natural Language Processing researchers. Identifying words or phrases that carry sentiments is a crucial task in sentiment analysis. The work in this thesis concentrates on automatically constructing polarity lexicons for sentiment analysis on social networks. One of the challenges in sentiment analysis on social networks is the lack of domain-dependent polarity lexicons and there is a need for automatically constructing sentiment lexicons for any specific domain. Two proposed methods are based on graph propagation and topic modeling. Our experiments confirm the quality of the polarity lexicons constructed using these two algorithms.

    Committee: Rajiv Ramnath Professor (Advisor); Jay Ramanathan Professor (Committee Member) Subjects: Computer Science
  • 19. Xu, Zhe A Sentiment Analysis Model Integrating Multiple Algorithms and Diverse Features

    Master of Science, The Ohio State University, 2010, Computer Science and Engineering

    In this thesis, we propose a model for integrating multiple sentiment analysis algorithms that each cover separate features, and show that it can do better than single algorithms that deal with multiple features. The key idea behind this integration model is the selective use of the right algorithm for the right case. We propose two measures to estimate the effectiveness of an algorithm, and, based on these measures, a two-step process to construct the model based on the understanding of contextual properties of algorithms. Our experiments show that our model outperforms existing baselines.

    Committee: Rajiv Ramnath (Advisor); Belkin Mikhail (Committee Member); Fang Hui (Committee Member) Subjects: Computer Science
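
Entry 16 above states the general objective of result diversification: maximize the similarity of the results to the query while minimizing the pairwise similarity between the results. The following is a minimal sketch of one common way to pursue that objective, a greedy selection in the style of maximal marginal relevance. It is not the method proposed in the dissertation, which describes geometric and index-based techniques; the function names, the use of cosine similarity, and the `trade_off` parameter are all assumptions made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def diversify(query, candidates, k, trade_off=0.5):
    """Greedily pick up to k candidate indices (hypothetical helper).

    Each step picks the candidate with the best balance of
    relevance to the query and redundancy with earlier picks.
    trade_off=1.0 ranks by relevance only; lower values penalize
    candidates that resemble results already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query, candidates[i])
            # Redundancy is the highest similarity to any earlier pick.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    query = [1.0, 0.5]
    candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    print(diversify(query, candidates, k=2))
    print(diversify(query, candidates, k=2, trade_off=1.0))
```

In this toy input the first two candidates are near-duplicates, so the balanced setting skips one of them in favor of the dissimilar third vector, while `trade_off=1.0` falls back to plain similarity ranking. The greedy loop recomputes similarities at every step, which is fine for a sketch but ignores the efficiency concern the abstract raises.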