Search Results

(Total results 48)


  • 1. Buffenbarger, Lauren Ethics in Data Science: Implementing a Harm Prevention Framework

    MDES, University of Cincinnati, 2021, Design, Architecture, Art and Planning: Design

    Advanced data applications can cause unintentional adverse effects that may impair, injure, or set back a person, entity, or society's interests. Interdisciplinary teams are at risk of perpetuating data harm due to a lack of awareness of data harm issues, communication gaps across expertise silos, the prioritization of taskwork over teamwork, and personal biases. To prevent unintentional data harm, teams must develop a harm prevention framework that includes tools to raise awareness, contextualize data harm within the project data lifecycle, and integrate prevention practices. This thesis explores the development of digital tools to facilitate a shared understanding of data harm within the data lifecycle as well as their incorporation into a harm prevention framework.

    Committee: Todd Timney M.F.A. (Committee Chair); James Lee (Committee Member) Subjects: Design
  • 2. Liu, Luyu Accessibility in Motion: Measuring Real-time Accessibility with High-resolution Data

    Doctor of Philosophy, The Ohio State University, 2023, Geography

    Accessibility is a fundamental measure in public transportation research and policy making; it represents a user's ability to travel and reach destinations afforded by a public transit service. Accessibility plays a crucial role in a public transit service's usability, a community's livability, and residents' well-being. Higher accessibility can promote modal shifts from unsustainable transportation modes to public transit, which reduces carbon emissions and air pollution, curbs congestion, and improves social equity and public health. However, many US cities still have major gaps in public transit accessibility and reliability. The issue of inadequate public transportation accessibility and reliability also remains a grave concern for the scientific and planning communities: the lack of an in-depth, holistic, and reliable understanding of accessibility impedes our ability to make informed and evidence-based decisions during both the planning and operation phases. In this dissertation, I introduce a new research framework – real-time accessibility – to better understand and improve public transit systems with high-resolution real-time geospatial data. The framework utilizes real-time information to assess different dimensions of the system performance of public transit services and the behaviors of passengers in those systems. I investigate four empirical questions with the framework based on the Central Ohio Transit Authority (COTA) bus system in Columbus, Ohio, a low-frequency bus system in a typical mid-size car-dependent city. First, I study the impact of new dockless scooter sharing services on local public transit systems. While they can significantly improve accessibility, equity and sustainability issues such as uneven distribution, high cost, and low capacity limit the collaboration between public transit and scooters. Second, I study the unreliability of public transit systems and the unrealistic assumptions made by schedule-based and retrospective-based measures used by rese (open full item for complete abstract)

    Committee: Harvey J. Miller (Advisor); Andre Carrel (Committee Member); Huyen Le (Committee Member); Ningchuan Xiao (Committee Member) Subjects: Geographic Information Science; Geography; Information Science; Sustainability; Transportation; Transportation Planning; Urban Planning
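
    The framework's core quantity can be illustrated with a simple cumulative-opportunities accessibility measure, computed once from the timetable and once from real-time estimates. A minimal sketch in which the travel times, job counts, and the accessibility function itself are all invented for illustration:

```python
# Hypothetical sketch of a cumulative-opportunities accessibility measure:
# count the opportunities reachable within a time budget, first from the
# timetable and then from real-time vehicle data.

def accessibility(travel_minutes, opportunities, budget=30):
    """Sum opportunities at destinations reachable within `budget` minutes."""
    return sum(o for t, o in zip(travel_minutes, opportunities) if t <= budget)

# Toy data: travel times (minutes) from one origin to five destinations,
# and the number of jobs at each destination.
scheduled = [12, 25, 28, 40, 55]   # timetable-based estimates
realtime = [15, 24, 36, 42, 51]    # estimates from live vehicle positions
jobs = [500, 1200, 800, 300, 2000]

print("scheduled accessibility:", accessibility(scheduled, jobs))  # 2500
print("real-time accessibility:", accessibility(realtime, jobs))   # 1700
```
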
  • 3. Faries, Frank Big Data and the Integrated Sciences of the Mind

    PhD, University of Cincinnati, 2022, Arts and Sciences: Philosophy

    We live in a data-driven world. The brain and mind are being touted as the next target for Big Data. My project here is to evaluate the prospects of a data-driven cognitive science. The central hypothesis of this work is that Big Data heralds fundamental changes to the notion of integration in the philosophy of cognitive science. Although “integration” is seen as a central goal of Big Data research, there is, generally speaking, a tendency to conflate two different senses of integration: the technical issues of making data interoperable, or what we might call data integration, and the theoretical issues in evaluating disparate empirical evidence about the same phenomena, or what we will call information integration. This conflation motivates some of the rhetoric surrounding Big Data, particularly in business domains. In cognitive science, as in data science, “integration” is also seen as a central goal, and there is good reason to believe that Big Data technologies may make positive contributions to this goal. One corollary of my central hypothesis is an affirmative response to this. Big Data will indeed contribute to “integration” in cognitive science, though it may not look the way philosophers might have expected. In support of this central hypothesis, I offer three observations on integration in Big Data and its intersection with cognitive science. Specifically: Data integration is presented as a key challenge in large-scale, data-intensive efforts to study the mind-brain. One way in which data integration is achieved is by means of an ontology. “Ontological realism” represents a family of theories about the proper means of data integration via ontologies. I argue here that, at best, what ontological realism demonstrates is a thesis about a set of ontological commitments. With respect to cognitive science, this commitment is neither warranted nor wanted as a normative constraint on data integration. Instead, I recommend a perspectivist approach to dat (open full item for complete abstract)

    Committee: Anthony Chemero Ph.D. (Committee Member); Angela Potochnik Ph.D. (Committee Member); Zvi Biener Ph.D. (Committee Member) Subjects: Philosophy
  • 4. DeHart, Clara “Doesn't Feel Warmer to Me”: Climate Change Denial and Fear in American Public Opinion

    Bachelor of Arts, Wittenberg University, 2020, Political Science

    Despite the scientific consensus that climate change is occurring, denial of this reality has persisted in the United States. While there are many possible explanations for this skepticism, one potential cause that has yet to be explored in detail is fear and its destabilizing influence on individuals' decision making processes. Prompted by concerns that addressing climate change will harm the economy, question free market ideology, and threaten the American way of life, it is argued in this paper that the emotional experiences prompted by these sources of fear can lead individuals to deny climate change. To test this hypothesis, National Election Studies survey data was used to gauge the covariation between climate denial and a variety of potential measures of fear. The results of these analyses demonstrate that both free market ideology and a desire to protect one's sense of American identity are associated with climate change denial, suggesting that these sources of fear must first be addressed in order to effectively communicate the risks of climate change to the American public.

    Committee: Staci Rhine (Advisor); James Allan (Committee Member); Sarah Fortner (Committee Member) Subjects: American Studies; Climate Change; Political Science; Psychology; Public Policy
  • 5. Kidd, Ian Object Dependent Properties of Multicomponent Acrylic Systems

    Master of Sciences (Engineering), Case Western Reserve University, 2014, Materials Science and Engineering

    Degradation of multi-component acrylic systems is becoming increasingly important as polymers and complex systems become commonplace in technological applications. For outdoor applications, understanding the interactions between each stressor and the optical, chemical, and mechanical response is important. This study focuses mainly on the magnitude and variance of optical and chemical properties of hardcoat acrylics on PET (Polyethylene terephthalate) or TPU (thermoplastic polyurethane) substrates, using big-data and unbiased statistics and analytics. PET shows a strong tendency to yellow and haze in accelerated and real-world exposures. A 0.90 correlation coefficient exists between yellowness and UVA-340 irradiance. A 0.8 correlation coefficient exists between haze and UVA-340 irradiance, but moisture must be present for hazing to occur. In TPU films, yellowing occurs until 200 MJ/m2 of UVA-340 irradiance, after which the films clear. Meanwhile, hardcoat acrylics with a TPU substrate are highly resistant to haze in all exposures studied. As optical degradation occurs up to 4000 hours of exposure, little correlation to carbonyl, C-H stretch, or N-H stretch area exists. A weak correlation is observed between increasing optical degradation and spectral attenuation, possibly indicating a complete breakdown of the polymers. Development of a model that relates observable degradation to surface and bulk phenomena can give insights into how to reduce degradation. All of this data must be used to focus the direction of R&D efforts to increase the useful lifetime of multi-component acrylic systems.

    Committee: Roger French (Advisor); James McGuffin-Cawley (Committee Member); Timothy Peshek (Committee Member); Laura Bruckman (Other); Olivier Rosseler (Other) Subjects: Engineering; Materials Science; Optics; Polymers
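
    The correlation coefficients quoted above are ordinary Pearson correlations between exposure dose and an optical response. A toy reproduction on synthetic data:

```python
# Illustrative only: the 0.90 figure above is a Pearson correlation between
# cumulative exposure dose and an optical response. Synthetic stand-in data.
import numpy as np

rng = np.random.default_rng(0)
irradiance = np.linspace(0, 300, 40)        # cumulative UVA-340 dose, MJ/m^2
yellowness = 0.05 * irradiance + rng.normal(0, 1.5, 40)  # yellowness index + noise

r = np.corrcoef(irradiance, yellowness)[0, 1]
print(f"Pearson r = {r:.2f}")  # close to 1 for this strongly linear toy data
```
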
  • 6. Vishal, Bijendra A new broadcast cryptography scheme for emergency alert

    Master of Science, The Ohio State University, 2005, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 7. Musgrave, John Addressing Architectural Semantic Gaps With Explainable Program Feature Representations

    PhD, University of Cincinnati, 2024, Engineering and Applied Science: Computer Science and Engineering

    This work explores the explainability of features used for classification of malicious binaries in machine learning systems based on semantic representations of data dependency graphs. This work demonstrates that explainable features can be used with comparable classification accuracy in real-time through non-parametric learning. This work defines operational semantics in terms of data dependency isomorphism, and quantifies the network structure of the graphs present in static features of binaries. This work shows that a bottom-up analysis holds across levels in the architectural hierarchy, and can be performed across system architectures. This work shows that semantic representations can be used for search and retrieval of malicious binaries based on their behavior. This work shows that unknown vulnerabilities can be predicted through descriptions of structure and semantics.

    Committee: Anca Ralescu Ph.D. (Committee Chair); Kenneth Berman Ph.D. (Committee Member); Alina Campan Ph.D M.A B.A. (Committee Member); Boyang Wang Ph.D. (Committee Member); Dan Ralescu Ph.D. (Committee Member) Subjects: Artificial Intelligence
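
    The notion of comparing binaries by data dependency isomorphism can be pictured as checking whether two dataflow graphs have the same shape regardless of labels. A hypothetical sketch using networkx, with invented graphs:

```python
# Hypothetical sketch: compare two data dependency graphs by isomorphism
# (structural, label-free equivalence) and summarize their network structure.
import networkx as nx

def dependency_graph(edges):
    """Directed graph from (producer, consumer) data dependency pairs."""
    return nx.DiGraph(edges)

g1 = dependency_graph([("load", "decode"), ("decode", "xor"), ("xor", "write")])
g2 = dependency_graph([("read", "unpack"), ("unpack", "xor"), ("xor", "store")])

# Two binaries whose dataflow has the same shape despite different labels.
print(nx.is_isomorphic(g1, g2))                      # True
print(nx.density(g1), sorted(d for _, d in g1.degree()))
```
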
  • 8. Labilloy, Guillaume Computational Methods For The Identification Of Multidomain Signatures of Disease States

    PhD, University of Cincinnati, 2024, Medicine: Biomedical Informatics

    The advent of sequencing technologies has revolutionized our understanding of disease. Researchers can now investigate the complex processes involved in the multi-layered transcription of genetic content, which regulates cell activity, homeostasis, and ultimately the organism's health. A disease can be conceived as a deviation from a homeostatic state, leading to cascading negative effects. A disease state, or more generally a disrupting factor (sometimes called a "perturbagen"), can be characterized by how it impacts the organism. This information constitutes its "signature", such as a list of differentially expressed genes or vectors of abundance of proteins or lipids. Significant efforts have focused on gathering these signatures into connectivity maps (CMAPs), which allow the identification of related disrupting factors based on the similarity of their signatures. CMAPs can overcome some limitations of traditional enrichment analysis. However, challenges remain. The integrative analysis of multi-domain data, as opposed to concurrent or sequential analysis, is still a challenge. The complexity of multi-omics analysis, involving retrieving datasets, annotations, and applying analytical pipelines, requires advanced programming skills, which can be a barrier for researchers without dedicated resources. Additionally, analysis pipelines need to scale up as assays become clinically available and more data is generated. To address these challenges, we developed machine learning tools to predict health outcomes, ranging from sepsis to dementia. Our goal is to build knowledge and expertise about integrative and extensible analytical pipelines for clinical, transcriptomics, and proteomics data. Specifically, we developed a statistical and machine learning model to classify patients by phenotype and predict mortality risk. We analyzed a prospective cohort of sepsis patients, selected predictive features, built and validated models, and then refined a robust model u (open full item for complete abstract)

    Committee: Jaroslaw Meller Ph.D. (Committee Chair); Michal Kouril Ph.D. (Committee Member); Robert Smith M.D. Ph.D. (Committee Member); Faheem Guirgis Ph.D M.A B.A. (Committee Member); Michael Wagner Ph.D. (Committee Member) Subjects: Bioinformatics
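
    The connectivity-map idea of matching perturbagens by signature similarity can be illustrated with a rank correlation between two per-gene differential-expression vectors. A hedged sketch with invented scores:

```python
# Hedged sketch of signature matching: rank-correlate two per-gene
# differential-expression vectors. Gene names and scores are invented.
import numpy as np
from scipy.stats import spearmanr

sig_disease = np.array([2.1, -1.4, 0.3, 1.8, -2.2, 0.9])   # disease signature
sig_drug = np.array([-1.9, 1.2, -0.1, -1.5, 2.0, -0.7])    # candidate perturbagen

rho, p = spearmanr(sig_disease, sig_drug)
print(f"rho = {rho:.2f}")  # strongly negative: the drug may reverse the state
```
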
  • 9. Schuetz, Robert From Data to Diagnosis: Leveraging Algorithms to Identify Clinically Significant Variation in Rare Genetic Disease

    Doctor of Philosophy, The Ohio State University, 2024, Biomedical Sciences

    This dissertation addresses the critical need for scalable variant interpretation in the diagnosis of rare genetic diseases (RGDs) by developing and validating novel computational methods for the interpretation of genome sequencing (GS) data. We introduce Clinical Assessment of Variant Likelihood Ratios (CAVaLRi), a robust algorithm that uses a modified likelihood ratio framework to prioritize diagnostic germline variants. CAVaLRi effectively integrates phenotypic data with variant impact predictions and family genotype information to achieve superior performance over existing prioritization tools in multiple clinical cohorts. CAVaLRi-informed reanalysis was able to uncover eight diagnoses in a cohort of RGD patients who had previously received non-diagnostic GS test results, demonstrating utility in ending diagnostic odysseys. Complementing CAVaLRi, we developed CNVoyant, an advanced tool designed to classify and prioritize copy number variants (CNVs) by incorporating machine learning techniques with genomic features to identify disease causal CNVs. CNVoyant's integration into the CAVaLRi framework allows for a unified approach to handle multiple variant types, thus providing a comprehensive solution for genetic diagnostics. The combined utility of CAVaLRi and CNVoyant offers significant improvements in diagnostic yield and accuracy, facilitating timely and precise genetic diagnosis in clinical settings. These tools represent a scalable approach to meet the growing demands of GS testing, thereby expediting the diagnostic process for patients with undiagnosed RGDs and supporting the broader application of genomics in personalized medicine.

    Committee: Peter White (Advisor); Bimal Chaudhari (Advisor); Elaine Mardis (Committee Member); Alex Wagner (Committee Member); James Blachly (Committee Member) Subjects: Bioinformatics; Biomedical Research; Genetics
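
    The abstract describes CAVaLRi only as a modified likelihood ratio framework; a generic version of that idea multiplies prior odds by one likelihood ratio per evidence source (phenotype, variant impact, segregation). A sketch under that assumption, with invented numbers:

```python
# Generic likelihood-ratio combination (an assumption about the framework's
# shape, not CAVaLRi's actual math): posterior odds = prior odds x product of
# per-evidence likelihood ratios. All numbers are invented.
import math

def posterior_prob(prior, likelihood_ratios):
    """Combine a prior probability with independent likelihood ratios."""
    log_odds = math.log(prior / (1 - prior)) + sum(map(math.log, likelihood_ratios))
    odds = math.exp(log_odds)
    return odds / (1 + odds)

lrs = {
    "phenotype match": 40.0,   # patient HPO terms fit the gene's disease
    "variant impact": 8.0,     # predicted deleterious
    "segregation": 5.0,        # genotype fits inheritance in the family
}
print(f"posterior = {posterior_prob(0.001, lrs.values()):.3f}")  # ~0.616
```
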
  • 10. Pillutla Venkata Sathya, Rohit Pneumonia Detection using Convolutional Neural Network

    MS, University of Cincinnati, 2023, Education, Criminal Justice, and Human Services: Information Technology

    Pneumonia is a seasonal disease that affects the lungs; it is also one of the complications of Covid-19. Diagnosing pneumonia requires doctors to examine a chest x-ray of the infected patient, and early diagnosis reduces the risks and complications associated with the disease. Many deep-learning algorithms have been applied to chest x-ray images to classify them as infected or normal. We have done secondary research using the MobileNetV2 model, comparing against Erdem and Aydin's study to show that with MobileNetV2, together with data augmentation and other data preprocessing techniques, a higher accuracy of 92.47% can be achieved. We have also used Google Colab, a cloud service that provides virtual machines for machine learning and big data workloads. In this research, we have also established the statistical significance of the results using a hypothesis test.

    Committee: Nelly Elsayed Ph.D. (Committee Member); Chengcheng Li Ph.D. (Committee Chair) Subjects: Information Technology
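
    A transfer-learning setup of the kind described, MobileNetV2 with data augmentation for binary chest x-ray classification, might look as follows in Keras. This is a generic sketch, not the study's actual configuration:

```python
# Generic MobileNetV2 transfer-learning sketch in Keras, not the study's exact
# setup. weights=None keeps the example offline; "imagenet" is typical practice.
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.05),
    tf.keras.layers.RandomZoom(0.1),
])

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None)
base.trainable = False  # freeze the backbone; only the new head trains

inputs = tf.keras.Input(shape=(224, 224, 3))
x = augment(inputs)                                      # data augmentation
x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
x = base(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # pneumonia vs. normal

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```
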
  • 11. Fraser, Kimberly DETERMINING STRUCTURE AND GROWTH CHARACTERISTICS OF OXIDE HETEROSTRUCTURES THROUGH DEPOSITION AND DATA SCIENCE: TOWARDS SINGLE CRYSTAL BATTERIES

    Doctor of Philosophy, Case Western Reserve University, 2023, Materials Science and Engineering

    A deeper understanding of processing-structure relationships has been developed with the goal of building single crystal devices using pulsed laser deposition (PLD) and advancing the application of data science to materials science. The targeted device was a half-cell lithium-ion battery, where strontium ruthenate (SRO) is the current collector, lithium cobalt oxide (LCO) is the cathode, and lithium lanthanum titanate (LLTO) is the electrolyte. These were grown on a strontium titanate (STO) substrate. Through studies of the processing parameters and film characteristics, conditions to grow a single crystal LCO/SRO/STO heterostructure were revealed. While the addition of the electrolyte affected the single crystal structure and interfacial quality, underlying reasons have been illuminated to guide further development of multi-layer oxide heterostructures. An in-situ technique called reflection high energy electron diffraction (RHEED) is commonly coupled with PLD to provide information on structure-property relationships by recording the diffraction pattern of the film during growth. Traditionally, a small percentage of the data provided is used in analysis. Here data science techniques are applied, both supervised and unsupervised, to reveal additional information from the full data set. As a result, the sensitivity of the length of diffraction spots over other parameters (e.g., width or intensity) to growth characteristics has been uncovered, especially in later stages of growth where the data is dominated by the reflection from the film. Additionally, through unsupervised learning, a phase shift in the intensity oscillations of different RHEED spots was uncovered. Non-negative matrix factorization among other techniques was used to deconvolute information from different diffraction spots. It was revealed that (01) and (0-1) spots are better indicators of thin film growth characteristics especially in material systems that grow in layer-by-layer or step-flow mechan (open full item for complete abstract)

    Committee: Alp Sehirlioglu Dr. (Advisor); Xuan Gao Dr. (Committee Member); Roger French Dr. (Committee Member); Frank Ernst Dr. (Committee Member) Subjects: Chemical Engineering; Chemistry; Computer Science; Engineering; Materials Science; Statistics
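
    The deconvolution step mentioned above, non-negative matrix factorization over RHEED spot intensities, has the following general shape; the spots-by-time matrix is synthetic:

```python
# Illustrative NMF deconvolution: factor a (spots x time) matrix of RHEED
# intensities into a few shared nonnegative temporal components. Synthetic data.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)
t = np.linspace(0, 20, 200)
osc = 1 + np.sin(2 * np.pi * t / 5)            # growth-oscillation signal
shifted = 1 + np.sin(2 * np.pi * t / 5 + 0.8)  # same oscillation, phase-shifted
X = np.vstack([osc, shifted, 0.5 * osc]) + 0.05 * rng.random((3, 200))

model = NMF(n_components=2, init="nndsvd", max_iter=500)
W = model.fit_transform(X)   # how strongly each spot carries each component
H = model.components_        # the shared temporal signatures
print(W.round(2))
```
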
  • 12. Srinivasan, Sandeep Data Science Approaches for Designing Tailored Local Processing Conditions during Additive Manufacturing

    Doctor of Philosophy, The Ohio State University, 2022, Industrial and Systems Engineering

    Additive Manufacturing (AM), more commonly known as 3D printing, involves the construction of intricate and complicated 3D parts in a layer-by-layer manner and has recently undergone advancements with respect to the range of materials used and the complexity of the parts being printed. Laser powder bed fusion (LPBF) additive manufacturing, the focal point of this work, has sparked a lot of interest in the materials and manufacturing community, especially for metals, because of the speed of production of the final part and the ability to print more convoluted component designs and assemblies without raising production costs. However, a complex processing space results in heterogeneities in the thermal histories of the manufactured part, which can undercut the benefits of LPBF. Hence there is a need for a methodology that efficiently searches the complex processing space for suitable process parameter sets. The aim of the first part of this manuscript is to minimize heterogeneity in the additively manufactured parts while also reducing printing time. A reference consisting of fairly homogeneous conditions throughout was chosen, and scan parameter sets for the other objects under consideration were tuned to come close to the reference conditions and thereby achieve an optimized local processing structure throughout. To achieve this, a procedure is developed that couples physics-based process modeling with machine learning and optimization methods to accelerate searching the AM processing space for suitable printing parameter sets. The technique is first applied to a few small but varied basic geometries. Optimized scan parameter sets are developed for each part geometry that yield a similar local processing structure while reducing the time to print the part, which is desirable for manufacturability in the Additive Manufacturing space. For the second part o (open full item for complete abstract)

    Committee: Michael Groeber (Advisor); Steve Niezgoda (Committee Member); Theodore Allen (Committee Member) Subjects: Artificial Intelligence; Computer Science; Industrial Engineering; Materials Science
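
    The coupling of a process model with optimization to match a reference condition can be sketched with a toy surrogate. The melt_depth formula and every number below are invented stand-ins for the physics-based model the thesis actually uses:

```python
# Toy stand-in for the coupled model-plus-optimization search: tune (power,
# speed) so an invented surrogate of the local thermal condition matches a
# reference while favoring faster (shorter-print-time) scanning.
import numpy as np
from scipy.optimize import minimize

def melt_depth(params):
    power, speed = params                 # laser power (W), scan speed (mm/s)
    return 0.8 * power / np.sqrt(speed)   # toy surrogate, not a real process model

reference = melt_depth((200.0, 800.0))    # the homogeneous reference condition

def objective(params):
    match = (melt_depth(params) - reference) ** 2
    speed_bonus = 1e-4 * (1000.0 - params[1]) ** 2  # prefer faster scanning
    return match + speed_bonus

res = minimize(objective, x0=(250.0, 600.0), bounds=[(100, 400), (200, 1200)])
print(res.x)  # a (power, speed) pair matching the reference at a higher speed
```
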
  • 13. Vaidya, Pranjal Multimodal Image Classifiers for Prognosis and Treatment Response Prediction for Lung Pathologies

    Doctor of Philosophy, Case Western Reserve University, 2022, Biomedical Engineering

    Non-small cell lung cancer (NSCLC) tumors follow an orderly progression from adenocarcinoma in situ (AIS) to minimally invasive carcinoma (MIA) and invasive adenocarcinoma (INV). Currently, there is no definite biomarker to assess the level of invasion and detect invasive disease in these early lepidic lesions using radiographic scans, which would ideally help in surgery planning for these patients. Within the early-stage NSCLC cohort, while all patients will receive surgery, a significant portion (up to 50%) will develop recurrence. Although most of these patients are eligible to receive adjuvant chemotherapy (chemo), not all will receive the added benefits. In the more advanced NSCLC setting, immunotherapy (IO) has shown promising survival improvement, but only a fraction (20%) of patients respond to IO, and a fraction (8%) in fact suffer adverse effects from it, with the cancer spreading rapidly (hyperprogression). Most current AI methods developed in this field are based on a single modality. However, different modalities and scales may hold complementary information, and integrating them may enhance the performance of AI models. In addition, most developed AI models lack interpretability, an essential element for successfully transitioning these AI methods into clinical practice. In this dissertation, we introduce new interpretable AI biomarkers that use textural patterns on radiographic scans, known as Radiomics, and combine these biomarkers across multiple modalities and scales for NSCLC and COVID-19 patients. The Radiomic features were analyzed from inside the tumor region as well as from the area immediately surrounding the nodule. Furthermore, we integrated clinical features into the Radiomics model using novel techniques. We also created a human-machine integrated model combining radiologists' scores with Radiomic analysis. Lastly, we used pathology data (open full item for complete abstract)

    Committee: Anant Madabhushi (Advisor) Subjects: Biomedical Engineering; Biomedical Research
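
    Radiomic texture features from inside a nodule and from its immediate surroundings are commonly gray-level co-occurrence statistics. A minimal illustration on synthetic patches, not the dissertation's actual feature set:

```python
# Minimal radiomic-style texture features: gray-level co-occurrence statistics
# from a synthetic "tumor" patch and its surrounding rim (illustration only).
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(2)
intratumoral = rng.integers(80, 140, size=(32, 32)).astype(np.uint8)
peritumoral = rng.integers(40, 200, size=(32, 32)).astype(np.uint8)

def texture(patch):
    glcm = graycomatrix(patch, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    return {p: float(graycoprops(glcm, p)[0, 0])
            for p in ("contrast", "homogeneity", "energy")}

print("intra:", texture(intratumoral))
print("peri: ", texture(peritumoral))  # wider gray range -> higher contrast
```
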
  • 14. Timothy, Stephen Data-Driven Analysis and Validation of Refrigeration in United States Commercial Buildings

    Master of Sciences (Engineering), Case Western Reserve University, 2022, EMC - Mechanical Engineering

    In this study, we refined and validated a method of determining the average refrigeration load in US commercial buildings using a virtual energy audit software called EDIFES. Crucial assumptions made in the analysis were investigated and validated using variants of the preexisting code. Validation occurred through two methods: a statistics-based population study and a case study. The population study compared a sample of 32 buildings analyzed by EDIFES against a group of 129 CBECS buildings on the basis of energy use intensity (EUI). The Wilcoxon rank-sum test was used to determine whether there was a statistically significant difference between the two populations. This statistics-based validation using the CBECS dataset revealed two key findings. First, it was determined that this analysis of determining refrigeration load should only be performed on food sales buildings. Second, the analysis revealed that the 3 Hour Variant of the refrigeration marker with a 60% run time correction factor performed the best, with a p-value of 0.977, meaning there is very little evidence to reject the statement that the two populations have equal medians. The case study approach to validation entailed selecting a food sales building on the campus of CWRU, submetering plug-in refrigeration loads, and using an engineering manual (physics-based analysis) to estimate the load of walk-in units. This analysis demonstrated that, on average, EDIFES effectively captured 94% of the refrigeration load of the building.

    Committee: Brian Maxwell PhD (Committee Chair); Stephen Hostler PhD (Committee Member); Alexis Abramson PhD (Committee Member); Roger French PhD (Committee Member) Subjects: Energy; Engineering
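
    The population comparison described above, a Wilcoxon rank-sum test on EUI, looks like this in SciPy; the samples are simulated:

```python
# Sketch of the population comparison: a Wilcoxon rank-sum test on energy use
# intensity for two building samples. The EUI values are simulated.
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(3)
eui_edifes = rng.normal(480, 90, 32)    # 32 EDIFES-analyzed buildings
eui_cbecs = rng.normal(470, 100, 129)   # 129 CBECS comparison buildings

stat, p = ranksums(eui_edifes, eui_cbecs)
print(f"p = {p:.3f}")  # a large p means little evidence the medians differ
```
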
  • 15. Gruenberg, Rebecca Multi-Model Snowflake Schema Creation

    Master of Computer Science, Miami University, 2022, Computer Science and Software Engineering

    Big Data's three V's (volume, velocity, and variety) have continually presented a problem for storing and querying large, diverse data efficiently. Data lakes represent a growing field of study to store large volumes of data in a variety of formats. Multi-model star schemas support analytical processing of data stored in native formats and are an emerging area in data warehousing. Using multi-model snowflake schemas in place of star schemas gives the user a bigger picture of the data lake and the relationships within it. In this work, we extend and implement a meta-model for data lakes and provide an algorithm to semi-automatically perform mappings between the data lake and a multi-model snowflake schema for structured and semi-structured data. Our algorithm recommends candidate multi-model snowflake schemas derived from a meta-model of a data lake. The algorithm is the basis for a tool to assist analysts in understanding the contents of a data lake and in creating views that support analytical processing to better make business decisions when querying a large data repository. We implement this basis for a tool and demonstrate its functionality using a variety of case studies.

    Committee: Karen Davis (Advisor); Dhananjai Rao (Committee Member); Daniela Inclezan (Committee Member) Subjects: Computer Science
  • 16. Brown, Kyle Topological Hierarchies and Decomposition: From Clustering to Persistence

    Doctor of Philosophy (PhD), Wright State University, 2022, Computer Science and Engineering PhD

    Hierarchical clustering is a class of algorithms commonly used in exploratory data analysis (EDA) and supervised learning. However, they suffer from some drawbacks, including the difficulty of interpreting the resulting dendrogram, arbitrariness in the choice of cut to obtain a flat clustering, and the lack of an obvious way of comparing individual clusters. In this dissertation, we develop the notion of a topological hierarchy on recursively-defined subsets of a metric space. We look to the field of topological data analysis (TDA) for the mathematical background to associate topological structures such as simplicial complexes and maps of covers to clusters in a hierarchy. Our main results include the definition of a novel hierarchical algorithm for constructing a topological hierarchy, and an implementation of the MAPPER algorithm and our topological hierarchies in pure Python code as well as a web app dashboard for exploratory data analysis. We show that the algorithm scales well to high-dimensional data due to the use of dimensionality reduction in most TDA methods, and analyze the worst-case time complexity of MAPPER and our hierarchical decomposition algorithm. Finally, we give a use case for exploratory data analysis with our techniques.

    Committee: Derek Doran Ph.D. (Advisor); Michael Raymer Ph.D. (Committee Member); Vincent Schmidt Ph.D. (Committee Member); Nikolaos Bourbakis Ph.D. (Committee Member); Thomas Wischgoll Ph.D. (Committee Member) Subjects: Computer Science
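
    For context, the classical machinery the dissertation starts from, an agglomerative hierarchy plus an arbitrary flat cut, fits in a few lines of SciPy:

```python
# Background sketch: an agglomerative hierarchy and the flat cut whose
# arbitrariness the dissertation critiques. Three synthetic 2-D blobs.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(4)
X = np.vstack([
    rng.normal((0, 0), 0.3, (10, 2)),
    rng.normal((3, 0), 0.3, (10, 2)),
    rng.normal((0, 3), 0.3, (10, 2)),
])

Z = linkage(X, method="ward")                     # the full hierarchy
labels = fcluster(Z, t=3, criterion="maxclust")   # one of many possible cuts
print(sorted(np.bincount(labels)[1:]))            # [10, 10, 10]
```
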
  • 17. Feige, Jonathan Use of Somatic Mutations for Classification of Endometrial Carcinomas with CpG Island Methylator Phenotype

    Master of Science (MS), Ohio University, 2022, Electrical Engineering & Computer Science (Engineering and Technology)

    Endometrial carcinoma begins in the cells within the inner lining of the uterus, which, as in many other cancers, grow out of control. A subset of these tumors shows genome-wide hypermethylation. Hypermethylation results in down-regulation of tumor suppressor genes, producing CpG Island Methylator Phenotype (CIMP) tumors. Individuals with this hypermethylation are classified as CIMP+ and show increased tumor growth and proliferation. We have hypothesized that by using CIMP-related samples and the mutations associated with them, we can classify an unknown sample with high accuracy using only mutational data. Using machine learning, we found that it is possible to correctly classify unknown CIMP samples with 90% accuracy using just the somatic mutations within each sample. This breakthrough will be used for diagnostics and treatment of endometrial cancers.

    Committee: Lonnie Welch (Advisor); Kevin Lee (Committee Member); Avinash Karanth (Committee Member); Chad Mourning (Committee Member) Subjects: Bioinformatics; Biology; Computer Science
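
    The classification task itself, predicting CIMP status from a binary mutation matrix, can be sketched as follows; the data are simulated rather than real tumor samples:

```python
# Toy version of the task: predict CIMP status from a binary samples x genes
# somatic-mutation matrix. Data are simulated, with 5 informative genes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
X = rng.integers(0, 2, (200, 50))                 # 1 = gene mutated in sample
y = (X[:, :5].sum(axis=1) >= 2).astype(int)       # CIMP+ label from 5 genes

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())    # high accuracy on this toy task
```
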
  • 18. Clunis, Julaine Semantic Analysis Mapping Framework for Clinical Coding Schemes: A Design Science Research Approach

    PHD, Kent State University, 2021, College of Communication and Information

    The coronavirus disease 2019 (COVID-19) pandemic has revealed challenges and opportunities for data analytics, semantic interoperability, and decision making. The sharing of COVID-19 data has become crucial for leveraging research, testing drug effectiveness and therapeutic strategies, and developing policies for control, intervention, and potential eradication of this disease. Translating healthcare data between various clinical coding schemes is critical to their functioning, and semantic mappings must be established to ensure interoperability. Using design science research methodology as a guide, this work explains 1) how an ETL (Extract Transform Load) workflow tool could support the task of clinical coding scheme mapping, 2) how the mapping output from such a tool could support or affect annotation of clinical trials, particularly those used in COVID-19 research, and 3) whether aspects of the socio-technical model could be leveraged to explain and assess mapping to achieve semantic interoperability in clinical coding schemes. Research outcomes include a reproducible and shareable artifact that can be utilized beyond the domain of biomedicine, in addition to observations and recommendations from the knowledge gained during the design and evaluation of the artifact.

    Committee: Marcia Zeng (Advisor); Athena Salaba (Committee Member); Mary Anthony (Committee Member); Yi Hong (Committee Member); Rebecca Meehan (Committee Member) Subjects: Bioinformatics; Information Science
  • 19. Synakowski, Stuart Novel Instances and Applications of Shared Knowledge in Computer Vision and Machine Learning Systems

    Doctor of Philosophy, The Ohio State University, 2021, Electrical and Computer Engineering

    The fields of computer vision and machine learning have made enormous strides in developing models which solve tasks only humans had been capable of solving. However, the models constructed to solve these tasks came at an enormous price in terms of computational resources and data collection. Motivated by concerns about the sustainability of continually developing models from scratch to tackle every additional task humans can solve, researchers are interested in efficiently constructing new models for developing solutions to new tasks. The sub-fields of machine learning devoted to this line of research go by many names, including multi-task learning, transfer learning, and few-shot learning. All of these frameworks share the assumption that knowledge should be shared across models to solve a set of tasks. We define knowledge as the set of conditions used to construct a model that solves a given task. By shared knowledge, we are referring to conditions that are consistently used to construct a set of models which solve a set of tasks. In this work, we address two sets of tasks posed in the fields of computer vision and machine learning. While solving each of these sets of tasks, we show how each of our methods exhibits a novel implementation of shared knowledge, leading to many implications for future work in developing systems that further emulate the abilities of human beings. The first set of tasks falls within the sub-field of action analysis, specifically the recognition of intent. Instead of a data-driven approach, we construct a hand-crafted model to infer between intentional and non-intentional movement using common knowledge concepts known by humans. These knowledge concepts are ultimately used to construct an unsupervised method to infer between intentional and non-intentional movement across levels of abstraction. By levels of abstraction we mean that the model needed to solve the most abstract instances of intent recognition is useful in developing models whi (open full item for complete abstract)

    Committee: Aleix Martinez (Advisor); Abhishek Gupta (Committee Member); Yingbin Liang (Committee Member) Subjects: Artificial Intelligence; Computer Engineering; Computer Science
  • 20. Alreshidi, Bader Using a Machine Learning Approach to Predict Healthcare Utilization and In-hospital Mortality among Patients with Acute Myocardial Infarction

    Doctor of Philosophy, Case Western Reserve University, 0, Nursing

    Acute myocardial infarction (AMI) remains one of the most common causes of death in the United States and accounts for tremendous healthcare expenditures, exceeding $12 billion annually. Machine learning (ML), a subset of artificial intelligence, has emerged as a valuable methodological tool to advance nursing and biomedical research. ML has been shown to build predictive models that allow detection of risk factors, assist in diagnosis, and propose personalized treatment plans that may lead to enhanced patient outcomes. From this perspective, the purpose of this study was to develop and evaluate the predictive performance of a random forest machine learning model against a conventional multivariate logistic regression model, both established to examine the influence of individuals' predisposing, enabling, and need factors on the health service use outcomes (in-hospital mortality, 30-day readmission) of AMI patients, guided by the Andersen Model (2008). This cross-sectional retrospective study utilized the Medical Information Mart for Intensive Care, which comprises patients admitted to a large tertiary-level academic center of the Harvard Medical School in Boston, MA. The variables of interest included age, gender, ethnicity, type of insurance, body mass index, existing comorbidities, in-hospital mortality, and 30-day readmission. Patients with a primary diagnosis of ST-segment elevation MI or non-ST-segment elevation MI cared for in the emergency department or critical care units were included in the study. Predictive models for each health service use outcome were built using RStudio. A total of 1171 AMI patients were included in the study: 255 (21.8%) with STEMI and 916 (78.2%) with NSTEMI. Predictors of in-hospital mortality and 30-day readmission included age and existing comorbidities. The accuracy rate, sensitivity, specificity, and AUC for the random forest models were 68-75%, 72-81%, 41-50%, and 0.58-0.59, respectively. On the other hand, the a (open full item for complete abstract)

    Committee: Ronald Hickman Jr., PhD, RN, ACNP-BC, FNAP, FAAN (Committee Chair); Mary Dolansky PhD, RN, FAAN (Committee Member); Nicholas Schiltz PhD (Committee Member); Richard Josephson MD, MS, FACC, FAHA, FACP, FAACVPR (Committee Member) Subjects: Nursing
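
    A model comparison of the kind reported here, random forest versus logistic regression evaluated by AUC, sensitivity, and specificity, can be sketched with scikit-learn on a simulated cohort:

```python
# Hedged sketch of the comparison: random forest vs. logistic regression with
# AUC, sensitivity, and specificity, on a simulated imbalanced cohort (not MIMIC).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1171, n_features=8, weights=[0.85],
                           random_state=0)        # ~15% positive, like mortality
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=300, random_state=0)):
    model.fit(Xtr, ytr)
    tn, fp, fn, tp = confusion_matrix(yte, model.predict(Xte)).ravel()
    auc = roc_auc_score(yte, model.predict_proba(Xte)[:, 1])
    print(type(model).__name__, f"AUC={auc:.2f}",
          f"sens={tp / (tp + fn):.2f}", f"spec={tn / (tn + fp):.2f}")
```
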