Files
thesis-draft-final-etd.pdf (3.58 MB)
Methods in Text Mining for Diagnostic Radiology
Author Info
Johnson, Eamon B.
ORCID® Identifier: http://orcid.org/0000-0002-5272-2780
Permalink: http://rave.ohiolink.edu/etdc/view?acc_num=case1459514073
Year and Degree
2016, Doctor of Philosophy, Case Western Reserve University, EECS - Computer and Information Sciences.
Abstract
Information extraction from clinical medical text is a computing challenge: bringing structure to the prose produced for communication in medical practice. In diagnostic radiology, prose reports are the primary means of communicating image interpretation to patients and other physicians, yet secondary use of a report requires either costly review by another radiologist or machine interpretation. In this work, we present mechanisms for improving machine interpretation of domain-specific text with large-scale semantic analysis, using a corpus of 726,000 real-world radiology reports as a basis for experimentation. We examine the abstract conceptual problem of detecting incidental findings (uncertain or unexpected results) in imaging study reports. We demonstrate that classifiers incorporating semantic metrics can exceed the F-measure of prior methods for follow-up classification and also exceed the F-measure of incidental findings classification by physicians in-clinic (0.689 versus 0.648). Further, we propose two semantic metrics, focus and divergence, calculated over the SNOMED-CT ontology graph, for summarizing and projecting discrete report concepts into 2-dimensional space, which enables both machine classification and physician interpretation of classifications.

Given the utility of semantic metrics for classification, we present methods for enhancing extraction of semantic information from clinical corpora. First, we construct a zero-knowledge method for imputing the semantic class of unlabeled terms by maximizing a confidence factor computed from pairwise co-occurrence statistics and rules limiting recall. Experiments with our method on corpora of reduced Mandelbrot information temperature produce accurate labeling of up to 25% of terms not labeled by prior methods. Second, we propose a method for context-sensitive quantification of relative concept salience and an algorithm capable of increasing both salience and diversity of concepts in document summaries in 28% of reports.
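For readers unfamiliar with the F-measure used in the comparison above, the following is a minimal sketch of how it is commonly computed from binary classification outcomes. It is not taken from the dissertation: the function names `confusion_counts` and `f_measure` and the sample labels are illustrative assumptions, and the abstract does not state which beta weighting was used (the sketch defaults to the balanced F1).

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives
    for binary labels (truthy = positive, e.g. 'incidental finding')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return tp, fp, fn


def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall.
    beta=1.0 gives the balanced F1 (an assumption; the abstract
    does not specify the weighting)."""
    if tp == 0:
        return 0.0  # no true positives: score is zero by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels for five reports (1 = incidental finding present).
reference = [1, 1, 0, 1, 0]
predicted = [1, 0, 1, 1, 0]
tp, fp, fn = confusion_counts(reference, predicted)
score = f_measure(tp, fp, fn)
```

Because the score is a harmonic mean, a classifier cannot reach a high F-measure by favoring precision or recall alone, which is why it is a common single-number basis for comparing a classifier against human readers.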
Committee
Gultekin Ozsoyoglu (Committee Chair)
Marc Buchner (Committee Member)
Adam Perzynski (Committee Member)
Andy Podgurski (Committee Member)
Pages
125 p.
Subject Headings
Computer Science
Keywords
text mining; diagnostic radiology; information extraction; clinical text mining
Recommended Citations
APA Style (7th edition): Johnson, E. B. (2016). Methods in Text Mining for Diagnostic Radiology [Doctoral dissertation, Case Western Reserve University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=case1459514073
MLA Style (8th edition): Johnson, Eamon. Methods in Text Mining for Diagnostic Radiology. 2016. Case Western Reserve University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=case1459514073.
Chicago Manual of Style (17th edition): Johnson, Eamon. "Methods in Text Mining for Diagnostic Radiology." Doctoral dissertation, Case Western Reserve University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1459514073
Document number: case1459514073
Download Count: 2,134
Copyright Info
© 2016, all rights reserved.
This open access ETD is published by Case Western Reserve University School of Graduate Studies and OhioLINK.