Files
AshutoshJadhavThesis.pdf (11.11 MB)
Knowledge Driven Search Intent Mining
Author Info
Jadhav, Ashutosh
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=wright1464464707
Abstract Details
Year and Degree
2016, Doctor of Philosophy (PhD), Wright State University, Computer Science and Engineering PhD.
Abstract
Understanding the latent intent behind users’ search queries is essential for satisfying their search needs. Search intent mining can help search engines improve the ranking of search results and enable features such as instant answers, personalization, search result diversification, and the recommendation of more relevant ads. Hence, there has been increasing attention on how to effectively mine search intents by analyzing search engine query logs. While state-of-the-art techniques can identify the domain of a query (e.g., sports, movies, health), identifying domain-specific intent is still an open problem. Among all the topics available on the Internet, health is one of the most important in terms of impact on the user and is one of the most frequently searched areas. This dissertation presents a knowledge-driven approach for domain-specific search intent mining with a focus on health-related search queries. First, we identified 14 consumer-oriented health search intent classes based on input from focus group studies, analyses of popular health websites, literature surveys, and an empirical study of search queries. We defined the problem of classifying millions of health search queries into zero or more intent classes as a multi-label classification problem. Popular machine learning approaches for multi-label classification (namely, problem transformation and algorithm adaptation methods) were not feasible because of the difficulty of creating labeled data and because of health domain constraints. Another challenge in identifying search intent was mapping terms used by laypeople to medical terms. To address these challenges, we developed a semantics-driven, rule-based search intent mining approach that leverages rich background knowledge encoded in the Unified Medical Language System (UMLS) and a crowd-sourced encyclopedia (Wikipedia).
The approach can identify search intent in a disease-agnostic manner and has been evaluated on three major diseases. While users often turn to search engines to learn about health conditions, a surprising amount of health information is also shared and consumed via public social media platforms such as Twitter. Although Twitter is an excellent information source, identifying informative tweets in the deluge of tweets is a major challenge. We used a hybrid approach consisting of supervised machine learning, rule-based classifiers, and biomedical domain knowledge to facilitate the retrieval of relevant and reliable health information shared on Twitter in real time. Furthermore, we extended our search intent mining algorithm to classify health-related tweets into health categories. Finally, we performed a large-scale study comparing health search intents, and the features that contribute to the expression of search intent, across more than 100 million search queries from smart devices (smartphones or tablets) and personal computers (desktops or laptops).
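As a rough illustration of the multi-label, rule-based idea described in the abstract, the sketch below maps lay terms to medical terms and then assigns zero or more intent labels to a query. It is not the dissertation's algorithm: the lay-term dictionary is a hypothetical stand-in for UMLS and Wikipedia lookups, and the three intent labels, their trigger words, and the function names are invented for the example (the 14 actual intent classes are not listed here).

```python
import re
from typing import Set

# Hypothetical lay-term -> medical-term map. It stands in for background
# knowledge such as UMLS or Wikipedia; no real UMLS API is called here.
LAY_TO_MEDICAL = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}

# Illustrative intent labels and trigger words (assumptions for this sketch,
# not the 14 classes defined in the dissertation).
INTENT_RULES = {
    "symptoms": {"symptom", "symptoms", "signs"},
    "treatment": {"treatment", "treat", "cure", "therapy"},
    "diet": {"diet", "food", "foods", "eat"},
}


def map_lay_terms(query: str) -> str:
    """Lowercase the query and replace known lay phrases with medical terms."""
    text = query.lower()
    for lay, medical in LAY_TO_MEDICAL.items():
        text = re.sub(rf"\b{re.escape(lay)}\b", medical, text)
    return text


def classify_intents(query: str) -> Set[str]:
    """Return every intent whose trigger words appear in the query.

    The result may be empty or contain several labels, which is what makes
    the task multi-label rather than single-label classification.
    """
    tokens = set(re.findall(r"[a-z]+", map_lay_terms(query)))
    return {intent for intent, triggers in INTENT_RULES.items() if tokens & triggers}
```

For example, `classify_intents("heart attack symptoms and treatment")` yields both the symptoms and treatment labels, while a bare disease name such as `"hypertension"` matches no rule and yields an empty set.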
Committee
Amit Sheth, Ph.D. (Advisor)
Krishnaprasad Thirunarayan, Ph.D. (Committee Member)
Michael Raymer, Ph.D. (Committee Member)
Jyotishman Pathak, Ph.D. (Committee Member)
Pages
180 p.
Subject Headings
Computer Science
Keywords
Search Intent Mining; Semantic Search; Health Informatics; Text Analytics; Semantic Web; Search Log Analysis; Social Media Analytics
Recommended Citations
Citations
APA Style (7th edition)
Jadhav, A. (2016). Knowledge Driven Search Intent Mining [Doctoral dissertation, Wright State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=wright1464464707
MLA Style (8th edition)
Jadhav, Ashutosh. Knowledge Driven Search Intent Mining. 2016. Wright State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=wright1464464707.
Chicago Manual of Style (17th edition)
Jadhav, Ashutosh. "Knowledge Driven Search Intent Mining." Doctoral dissertation, Wright State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1464464707
Document number:
wright1464464707
Download Count:
1,494
Copyright Info
© 2016, all rights reserved.
This open access ETD is published by Wright State University and OhioLINK.