Search Results

(Total results 1)

  • 1. Al-Olimat, Hussein. Knowledge-Enabled Entity Extraction

    Doctor of Philosophy (PhD), Wright State University, 2019, Computer Science and Engineering PhD

    Information Extraction (IE) techniques extract entities, relationships, and other detailed information from unstructured text. Most methods in the literature focus on supervised machine learning techniques, which are often impractical because annotations are costly to obtain and a high-quality gold standard (in terms of reliability and coverage) is difficult to create. Semi-supervised and distantly supervised techniques have therefore gained traction as a way to overcome some of these challenges, such as bootstrapping the learning quickly. This dissertation focuses on the extraction of entities, i.e., Named Entity Recognition (NER), from multiple domains, including social media and other, grammatical texts such as news and medical documents. It explores ways to lower the cost of building NER pipelines with the help of available knowledge, without compromising extraction quality, while also taking into account feasibility and other concerns such as user experience. I present distantly supervised (dictionary-based), supervised (with cost reduced through entity set expansion and active learning), and minimally supervised NER approaches. In addition, I discuss the various aspects of knowledge-enabled NER approaches and how and why they are a better fit for today's real-world NER pipelines in dealing with, and partially overcoming, the above-mentioned difficulties. I present two dictionary-based NER approaches. The first extracts location mentions from text streams and proved very effective for stream processing, with competitive performance in comparison with ten other techniques. The second is a generic NER approach that scales to multiple domains and is minimally supervised, with a human in the loop for online feedback. The two techniques augment and filter the dictionaries to compensate for their incompleteness (due to lexical variat (open full item for complete abstract)

    Committee: Krishnaprasad Thirunarayan Ph.D. (Advisor); Keke Chen Ph.D. (Committee Member); Guozhu Dong Ph.D. (Committee Member); Steven Gustafson Ph.D. (Committee Member); Srinivasan Parthasarathy Ph.D. (Committee Member); Valerie L. Shalin Ph.D. (Committee Member)

    Subjects: Artificial Intelligence; Computer Science