Search Results

(Total results 312)

  • 1. SUN, HAN High-dimensional Variable Selection: A Novel Ensemble-based Method and Stability Investigation

    Doctor of Philosophy, Case Western Reserve University, 2025, Epidemiology and Biostatistics

    Variable selection in high-dimensional data analysis poses substantial methodological challenges. While numerous penalized variable selection methods and machine learning approaches exist, many demonstrate instability in real-world applications. This thesis makes two primary contributions: developing a novel ensemble algorithm for variable selection in competing risks modeling and conducting a comprehensive stability analysis of established variable selection methods. The first component introduces the Random Approximate Elastic Net (RAEN), an innovative methodology that offers a stable and generalizable solution for large-p-small-n variable selection in competing risks data. RAEN's flexible framework enables its application across various time-to-event regression models, including competing risks quantile regression and accelerated failure time models. We demonstrate that our computationally-intensive algorithm substantially improves both variable selection accuracy and parameter estimation in a numerical study. We have implemented RAEN in a user-friendly R package, freely available for public use. To demonstrate its practical utility, we apply RAEN to a cancer study, successfully identifying influential genes associated with mortality and disease progression in bladder cancer patients. The second component comprises a systematic evaluation of eight variable selection methods' stability under varying conditions. Through comprehensive numerical studies, we examine how factors such as sample sizes, number of predictors, correlation levels, and signal strength influence performance. Based on these findings, we provide evidence-based recommendations for implementing variable selection methods in real-world data analysis.

    Committee: Xiaofeng Wang (Advisor); John Barnard (Committee Member); Mark Schluchter (Committee Member); William Bush (Committee Chair) Subjects: Bioinformatics; Biostatistics; Genetics; Statistics
  • 2. Gao, Suyang Power Calculations in Meta Analysis

    PhD, University of Cincinnati, 2024, Medicine: Biostatistics (Environmental Health)

    Meta-analysis is a powerful statistical technique applied in research and evidence synthesis. It involves combining results from multiple independent studies to draw overall conclusions. The effect size is the primary statistic that researchers are interested in. Another approach is to combine the results of several studies addressing the same hypothesis-testing problem, putting emphasis on the p-values emanating from the individual studies. Combining p-values requires thoughtful consideration of statistical properties, alternative hypotheses, and the specific context of the analysis. Suppose we have the p-values obtained from m independent hypothesis tests. Under the null hypothesis, we assume that the underlying test statistics have continuous probability distributions, so each p-value follows the uniform distribution. Under the alternative hypothesis, the p-value's probability density function (pdf) is influenced by both the sample size and effect size of the underlying study. Determining the distribution of the combination test statistic of p-values is very complex. The main thrust of my research is to provide guidelines on the number m of studies to be included in the meta-analysis. The determination of m depends on the individual sample sizes of the studies, the alternative hypothesis (i.e., effect size), the significance level, and the desired power. We primarily discuss the two classic tests by Tippett and Fisher, comparing their performance using p-values from different underlying hypothesis tests. Additionally, we assess the performance of three tests based on natural combining statistics: the geometric mean, arithmetic mean, and harmonic mean.

    Committee: Marepalli Rao Ph.D. (Committee Chair); Roman Jandarov Ph.D. (Committee Member); Jeffrey Welge Ph.D. (Committee Member) Subjects: Biostatistics
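The two classic combination rules discussed in this abstract have short closed forms. A sketch (illustrative code, not the author's): Tippett rejects when the minimum p-value falls below 1-(1-α)^(1/m), and Fisher's statistic -2·Σ log p_i is chi-square with 2m degrees of freedom under the null, i.e. Gamma(m, scale 2), whose survival function is a finite sum.

```python
import math

def tippett_reject(pvals, alpha=0.05):
    """Tippett: reject H0 if min(p) < 1-(1-alpha)^(1/m). Under H0 the
    minimum of m independent U(0,1) p-values has CDF 1-(1-x)^m."""
    m = len(pvals)
    return min(pvals) < 1 - (1 - alpha) ** (1 / m)

def fisher_combined_pvalue(pvals):
    """Fisher: T = -2*sum(log p_i) ~ chi-square with 2m df under H0.
    A chi-square with 2m df is Gamma(m, scale=2), whose survival function
    has the closed form exp(-T/2) * sum_{k<m} (T/2)^k / k!."""
    m = len(pvals)
    T = -2.0 * sum(math.log(p) for p in pvals)
    x = T / 2.0
    return math.exp(-x) * sum(x**k / math.factorial(k) for k in range(m))
```

The two rules can disagree: for p-values (0.02, 0.10, 0.30), Fisher's combined p-value is below 0.05, while Tippett's threshold 1-0.95^(1/3) ≈ 0.017 is not crossed by the smallest p-value.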
  • 3. Li, Dandan Patient-Centered Model for Predicting Distant Metastasis in Breast Cancer: Insights from the 2021 National Inpatient Sample (NIS)

    MS, University of Cincinnati, 2024, Medicine: Biostatistics (Environmental Health)

    Background: Breast cancer remains a leading cause of cancer-related morbidity and mortality in women worldwide. Traditional methods for predicting metastasis in breast cancer rely primarily on tumor pathology characteristics, such as tumor size, TNM grade, and receptor status. However, these methods do not fully account for patient-centered health factors, which could also play a role in metastasis risk. Factors such as a patient's overall physiological health, history of anti-neoplastic treatments, and personal and family history of cancer may also significantly impact the likelihood of developing distant metastasis in breast cancer. This study aims to develop a predictive model for distant metastasis in breast cancer that incorporates these broader, patient-centered factors for a more comprehensive risk assessment. Methods: This study analyzed all 4296 female breast cancer cases from the 2021 NIS, assessing 130 variables. Among these cases, 1691 (39.36%) had distant metastasis, while 2605 (60.64%) did not. For metastasis prediction, 21 key variables were selected, including age, race, anti-neoplastic treatment, presence of other cancers, cancer history, smoking, depression/anxiety, elective admission, All Patient Refined DRG Severity of Illness Subclass (APRDRG Severity), and various comorbidities. A binary logistic regression model was developed to predict distant metastasis in breast cancer, refined through backward elimination, with cross-validation used for validation. Additionally, eight further variables, such as morbidity, length of stay, and total charges, were analyzed for comparison but were not included in the predictive model. Statistical comparisons between metastatic and non-metastatic groups were conducted, with continuous variables assessed using the Mann-Whitney U test and categorical variables using the Chi-Square test or Fisher's Exact test. The significance level (α) was set at 0.05.
All analy (open full item for complete abstract)

    Committee: Roman Jandarov Ph.D. (Committee Member); Marepalli Rao Ph.D. (Committee Chair) Subjects: Biostatistics
  • 4. Meng, Guanqun STATISTICAL CONSIDERATIONS IN CELL TYPE DECONVOLUTION AND CELL-TYPE-SPECIFIC DIFFERENTIAL EXPRESSION ANALYSIS

    Doctor of Philosophy, Case Western Reserve University, 2024, Epidemiology and Biostatistics

    Interpreting sequencing data precisely is often the primary task in genomic research, aiming to uncover gene expression alterations associated with various phenotypes. Biopsy or tissue samples collected in clinical and research settings are typically a mosaic of at least several pure cell types. The observed changes in gene expression could be caused by variations in cell type compositions or differentially expressed (DE) genes within specific cell types. Therefore, cellular deconvolution is a critical step before the cell-type-specific Differentially Expressed (csDE) gene study. Many statistical approaches have been proposed for csDE studies. However, a systematic review that examines the assumptions underlying these models and how these assumptions influence their performances under different scenarios has not yet been conducted. Additionally, there is a lack of statistical tools to assess the powers of csDE studies. Furthermore, current deconvolution methods largely depend on the assumption that all subjects share an identical population-level reference panel, which ignores inter-subject heterogeneities. This may compromise the validity of results, especially in studies that involve repetitive and longitudinal measurements. Moreover, while machine learning and deep learning-based deconvolution methods have been extensively developed for bulk transcriptomic data such as RNA-seq and microarrays, their application to imaging data, such as Immunohistochemistry (IHC), remains unexplored. We first benchmarked a few popular statistical models for detecting csDE genes between different phenotype-of-interests. Based on our comprehensive and flexible data simulation pipelines, we developed a power evaluation toolbox, cypress, to guide researchers in designing experiments for csDE studies. cypress can conduct extensive simulations using existing or provided parameters, model biological/technical variations, and provide thorough assessments by multiple metrics. 
Additio (open full item for complete abstract)

    Committee: Hao Feng (Advisor); Fredrick R. Schumacher (Committee Chair); Qian Li (Committee Member); Jenný Brynjarsdóttir (Committee Member); Lijun Zhang (Committee Member) Subjects: Bioinformatics; Biostatistics; Genetics; Public Health; Statistics
  • 5. Ibrahim-Ojoawo, Atinuke The Evaluation of HPV Vaccination Among Adolescents and Adults in United States: Assessment of Sociodemographic Disparities and Misinformation on Social Media Platforms

    Doctor of Philosophy in Health Sciences, Youngstown State University, 2024, Department of Graduate Studies in Health and Rehabilitation Sciences

    The HPV vaccination has been effective in preventing HPV-related diseases and cancers, but a large portion of the US population remains unvaccinated. This dissertation explores the sociodemographic disparities in HPV vaccination uptake and coverage in different categories of the US population. Additionally, it investigates the nature and pattern of HPV misinformation and its relationship with extremism, conspiracism, and religious ideologies in online forums. The first two studies used national survey data to analyze HPV vaccine initiation and completion among US adolescents and adults. Descriptive statistics and logistic regression were adopted to estimate disparities in HPV vaccination uptake. The third study analyzed online forum posts related to HPV vaccination discourse and misinformation through purposive sampling, utilizing automated text mining and statistical analysis to assess the association with conspiracy theories, extreme ideologies, and extremist activities. HPV vaccine initiation and completion increased among older adolescents, while the odds of HPV vaccination were lower among adults aged 45 and above. The nature of discussions about HPV vaccination in online forums is primarily analytical, with moderate to low clout and emotional tone. Discourse and misleading narratives about the HPV vaccine among faith communities, including the prevalence of extreme theories and ideologies, potentially escalate into extremist activities. There is a need for initiatives to address HPV vaccination uptake among boys and expand the vaccine recommendations to include all US adults. Online engagement of community forums, experts, and evidence-based communication is needed to promote HPV vaccine uptake.

    Committee: Nicolette Powe PhD (Advisor); Ken Learman PhD (Committee Member); Richard Rogers PhD (Committee Member); Heather Hefner PhD (Committee Member) Subjects: Biostatistics; Demographics; Health; Health Sciences; Public Health; Public Health Education
  • 6. Kuang, Zhanpeng A comparison of multiplicity adjustment methods for three-arm treatment plus trials

    Master of Science, The Ohio State University, 2024, Public Health

    Multi-arm trials are common among randomized controlled trials. These parallel-group trials compare three or more interventions, usually to a shared control. Our simulation study focuses on a specific multi-arm structure – one that involves three arms: control, treatment, and treatment plus. Due to the potential number of comparisons that investigators can conduct under this three-arm structure, the issue of multiple comparisons must be considered. For a three-arm treatment plus trial, seven methods were explored: Bonferroni, Holm, Hochberg, fixed sequence, hierarchy, Dunnett, and Prospective Alpha Allocation Scheme (PAAS). To directly compare these procedures with one another, a simulation study was conducted to determine which method controlled the familywise error rate (FWER) at α = 0.05 while maximizing power. Power for each method was calculated as sample sizes, allocation schemes, prevalences, and effect sizes were varied. Overall, we saw that all seven methods could maintain the expected α = 0.05 type I error rate. Dunnett generally performed the worst and was not recommended as a solution. Bonferroni, Holm, Hochberg, and PAAS were comparable in power while fixed sequence and hierarchy varied depending on allocation ratio.

    Committee: Rebecca Andridge (Committee Member); Abigail Shoben (Advisor) Subjects: Biostatistics
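The first three adjustment methods compared in this simulation have simple adjusted-p-value forms. A small reference implementation (illustrative only; the fixed-sequence, hierarchy, Dunnett, and PAAS procedures need more context than fits here):

```python
def bonferroni(pvals):
    """Multiply every p-value by m, capped at 1."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]

def holm(pvals):
    """Step-down: multiply the k-th smallest p by (m-k), enforcing
    monotone non-decreasing adjusted values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    running = 0.0
    for k, i in enumerate(order):
        running = max(running, (m - k) * pvals[i])
        adj[i] = min(1.0, running)
    return adj

def hochberg(pvals):
    """Step-up: multiply the k-th largest p by its rank factor, enforcing
    monotone non-increasing adjusted values from the top down."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adj = [0.0] * m
    running = 1.0
    for k, i in enumerate(order):
        running = min(running, (k + 1) * pvals[i])
        adj[i] = min(1.0, running)
    return adj
```

For p-values (0.01, 0.04, 0.03) the three methods give (0.03, 0.12, 0.09), (0.03, 0.06, 0.06), and (0.03, 0.04, 0.04) respectively, illustrating why Hochberg is never less powerful than Holm, which is never less powerful than Bonferroni.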
  • 7. Luu, Hoang What Will Our Forests Look Like in the Future? Modeling Regeneration Dynamics and Their Effects on Species Composition and Management Practices Under Climate Change

    Doctor of Philosophy (PhD), Ohio University, 2024, Plant Biology (Arts and Sciences)

    This dissertation enhances a forest gap model (ForClim) by incorporating seed production and seedling establishment processes, addressing a critical gap in understanding forest regeneration under climate change. The regeneration of forests in the Pacific Northwest (PNW) is a key driver of biodiversity, shaping species composition and ecosystem structure, and climate change is expected to significantly alter these processes, leading to shifts in both biodiversity and timber productivity. Simulations in this study revealed that seedling survival plays a more critical role than seed production in determining future species composition, particularly as climate variability increases. Resilient species like Pseudotsuga menziesii and Pinus ponderosa may sustain or increase their dominance, while species such as Abies grandis and Tsuga mertensiana face declines due to reduced seedling survival. Additionally, current forest management practices may need adjustment, with "no management" maximizing harvest volume for Coastal Douglas fir, while Mountain Douglas fir may experience reduced yields under future extreme climate scenarios. These findings highlight the importance of integrating regeneration processes into forest models to predict forest biodiversity and timber industry outcomes.

    Committee: Rebecca Snell (Advisor) Subjects: Applied Mathematics; Bioinformatics; Biology; Biostatistics; Ecology; Environmental Management; Environmental Science; Environmental Studies; Natural Resource Management; Plant Biology
  • 8. Angeles, David Algorithmic Estimation for Partially Observed Functions and Shapes

    Doctor of Philosophy, The Ohio State University, 2024, Biostatistics

    The rise of high-dimensional data collection has spurred the development of advanced statistical methods to address complex issues across various fields such as biology, anthropology, neuroscience, and environmental sciences. However, dealing with high-dimensional missing data remains a significant challenge. Partially observed functional and shape data are types of high-dimensional data that exhibit missingness. Traditional methods often fall short in handling incomplete functional and shape data effectively, necessitating the ongoing development of robust methods for complete estimation based on partial observations. Further, functional and shape data are typically analyzed using geometric statistical approaches. We present algorithms for estimation of the completion of partially observed functions and shapes under a unified framework, where shapes are considered as functional data. We find that the completion algorithm can adequately recover the underlying trajectory of a function or shape. Further, the completion estimates demonstrate high accuracy when classifying real and simulated data for a large range of missingness percentages for functions and shapes. Our unified framework offers a novel approach to missing data estimation and classification, bridging a gap in the current literature.

    Committee: Sebastian Kurtek (Advisor); Oxana Chkrebtii (Committee Member); Abigail Shoben (Committee Member); Kellie Archer (Committee Member) Subjects: Biostatistics
  • 9. Kramer, Benjamin The Impact of Proteoglycans on Ascending Aortic Dissection Mechanics

    Doctor of Philosophy, Case Western Reserve University, 2024, Clinical Translational Science

    Ascending aortic dissection is a surgical emergency involving the proximal aorta with an incidence of between 5 and 30 cases per million persons per year and an estimated mortality of 20% within 24 hours. Worryingly, mortality after dissection increases 1–2% per hour following symptom onset. Aortic dissection is closely associated with aortic aneurysm and thus, action to reduce the prevalence of aortic dissection has been primarily directed at improving the management of ascending aortic aneurysms. The morbidity and mortality associated with ascending aortopathy are direct results of the biomechanical dysfunction and failure of aortic tissue. Understanding the complex mechanical behavior of aortic tissue and the influence of microstructural components on its behavior may provide novel insights to better predict ascending aortic dissection and improve clinical decision making surrounding aortopathy. Proteoglycans are an important part of the extracellular matrix of the aorta, whose function is balancing tensile forces within tissue. Aggrecan, a proteoglycan, previously believed to be confined to cartilage tissue, has been identified in massive amounts in diseased aortic tissue. Although beneficial in normal quantities, excess accumulation of proteoglycans, such as aggrecan, may be associated with aortopathy and biomechanical dysfunction. The underlying hypothesis of this dissertation is that increased proteoglycan deposition is correlated with aortopathy-associated biomechanical dysfunction. 
Using a prospective translation study I demonstrate that: i) aggrecan is a sensitive biomarker of ascending aortopathy and elevated preoperative blood levels are independently associated with aortic disease, ii) blood aggrecan concentration is correlated with aortopathy-associated biomechanical dysfunction, assessed using ex vivo biomechanical testing methods corresponding with aortic dissection, and iii) increased proteoglycan deposition resulting i (open full item for complete abstract)

    Committee: Eugene Blackstone (Committee Chair); Suneel Apte (Committee Member); Robb Colbrunn (Committee Member); Eric Roselli (Advisor) Subjects: Anatomy and Physiology; Biomechanics; Biomedical Engineering; Biostatistics; Medicine; Surgery
  • 10. Cui, Zuxi ASSESSING GENETIC IMPUTATION QUALITY AND ITS APPLICATION IN RARE VARIANT ANALYSIS FOR GENOME-WIDE ASSOCIATION STUDIES: A SIMULATION-POWERED APPROACH

    Doctor of Philosophy, Case Western Reserve University, 2024, Epidemiology and Biostatistics

    This dissertation provides a study in the realm of genome-wide association studies (GWAS), with a specific focus on genetic imputation quality and its impact on uncommon and rare variant analysis. This research encompasses the development of an advanced whole-genome GWAS data simulation methodology using the SLiM program to enhance imputation and trait loci discovery. This approach marks a significant progression from previous methods limited to regional or chromosome-level simulations. Key aspects of this research include the exploration of unique quality control metrics for genetic imputation, thus underscoring the influence of minor allele frequency (MAF) on imputation accuracy. This investigation is crucial for enhancing the precision of imputation techniques, especially for rare genetic variants, which are often challenging to analyze due to their low frequency. As an applied example, this research has led to the identification of three new loci marginally associated with prostate cancer (PrCa), contributing significantly to the understanding of its genetic architecture. This discovery has potential implications for future prostate cancer research and treatment strategies. Overall, this dissertation represents a substantial contribution to the field of genetic research in GWAS. It provides valuable evaluations of new imputation approaches and demonstrates their effectiveness in discovering new disease-related genetic variants, paving the way for future studies in cancer GWAS.

    Committee: Xiaofeng Zhu (Committee Chair); Fredrick Schumacher (Advisor); Thomas LaFramboise (Committee Member); Hao Feng (Committee Member); Jessica Cooke Bailey (Committee Member) Subjects: Biostatistics; Epidemiology; Genetics
  • 11. Yang, Li Sample Size Calculation in Simple Linear Regression under Two Scenarios

    PhD, University of Cincinnati, 2024, Medicine: Biostatistics (Environmental Health)

    Sample size determination is key to the success of any statistical investigation exploring a causal relationship between a response variable Y and a single predictor X. A hypothesis about the relationship presages sample size calculation. Several steps involved in testing a hypothesis (data collection, construction of a test statistic, Type I error control, and power specification) are spelled out prior to the start of the investigation, and a sample size is needed to meet the power requirement. Besides the Type I error probability and power, the population's variance or the conditional variance of the response may be needed. In this disquisition, we focus on sample size calculation from a simple linear regression perspective. Both the independent variable (X) and dependent variable (Y) are introduced in the regression model. The interest in simple linear regression is whether X impacts Y. A sample size n is needed for the detection of the degree of dependence of Y on X. The significance level α (Type I error) and power 1-β (where β is the Type II error) are pre-specified, in addition to the effect size. We need data to estimate the regression coefficient from the conditional model using the least squares method. Software packages like PASS and nQuery take the wrong approach in calculating the required sample size; the error has been pointed out in the literature. This work falls under methodological research. The aim of our research is to contrast sample sizes calculated under different but equivalent environments. The error was fixed in earlier research by deriving the distribution of the maximum likelihood estimator of the regression coefficient when the predictor is taken to be normally distributed. We follow a different approach in this disquisition: we assume the predictor and response have a bivariate normal distribution. Testing null correlation is equivalent to testing null slope. We calculated the sample size under the testing environment of null correlation.
Sample sizes thus calcula (open full item for complete abstract)

    Committee: Marepalli Rao Ph.D. (Committee Chair); Tesfaye Mersha Ph.D. (Committee Member); Roman Jandarov Ph.D. (Committee Member) Subjects: Biostatistics
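The correlation-testing route described above admits a standard back-of-the-envelope formula via Fisher's z transform: atanh(r) is approximately normal with standard deviation 1/√(n-3) under bivariate normality. A sketch (an approximation, not the dissertation's derivation; the function name is mine):

```python
import math
from statistics import NormalDist

def n_for_correlation(rho, alpha=0.05, power=0.80):
    """Approximate sample size to detect correlation rho with a two-sided
    test of H0: rho = 0, via Fisher's z transform: z = atanh(r) is roughly
    normal with sd 1/sqrt(n-3), giving n = ((z_a + z_b)/atanh(rho))^2 + 3."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = NormalDist().inv_cdf(power)          # power requirement
    return math.ceil(((z_a + z_b) / math.atanh(rho)) ** 2 + 3)
```

For example, detecting ρ = 0.3 at α = 0.05 with 80% power requires n = 85 under this approximation; the equivalence of testing null correlation and null slope is what lets the same n serve the regression problem.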
  • 12. Lu, Wei-En Causal Inference in Case-Cohort Studies Using Restricted Mean Survival Time

    Doctor of Philosophy, The Ohio State University, 2024, Biostatistics

    In large observational epidemiological studies with survival outcome and low event rates, the stratified case-cohort design is commonly used to reduce the cost associated with covariate measurement. The goal of many of these studies is to determine whether a cause-and-effect relationship exists between some treatment and an outcome rather than an associative relationship. Therefore, a method for estimating the causal effect under the stratified case-cohort design is needed. In this dissertation, we propose to estimate the causal effect of treatment on survival outcome using restricted mean survival time (RMST) difference as the causal effect measure under the stratified case-cohort design and using propensity score stratification or matching to adjust for confounding bias that is present in observational studies. First, we propose a propensity score stratified RMST estimation strategy under the stratified case-cohort design. We established the asymptotic normality of the proposed estimator. Based on the simulation study, the proposed method performs well and is simple to implement in practice. We also applied the proposed method to the Atherosclerosis Risk in Communities (ARIC) Study to estimate the marginal causal effect of high sensitivity C-reactive protein level on coronary heart disease survival. As an alternative to propensity score stratification, we proposed a propensity score matched RMST estimation strategy under the stratified case-cohort design. The asymptotic normality of the proposed estimator was established, and, due to the matching design, the correlation within matched sets was accounted for. Simulation studies also demonstrated that the proposed method has adequate performance and outperforms the competing methods. The proposed method was also used to estimate the marginal causal effect of high sensitivity C-reactive protein level on coronary heart disease survival in the ARIC study.

    Committee: Ai Ni (Advisor); Eben Kenah (Committee Member); Bo Lu (Committee Member) Subjects: Biostatistics; Public Health; Statistics
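The RMST that serves as the causal effect measure above is the area under the survival curve up to a cutoff τ. A plain-Python sketch of RMST from a Kaplan-Meier fit (illustrative only; the dissertation's estimators additionally handle case-cohort weighting and propensity scores):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate. times: follow-up times; events:
    1 = event, 0 = censored. Returns (event times, survival probabilities)."""
    n = len(times)
    data = sorted(zip(times, events))
    at_risk, s = n, 1.0
    ts, surv = [], []
    i = 0
    while i < n:
        t = data[i][0]
        d = sum(e for tt, e in data if tt == t)  # events at time t
        m = sum(1 for tt, _ in data if tt == t)  # subjects leaving risk set
        if d > 0:
            s *= 1 - d / at_risk
            ts.append(t)
            surv.append(s)
        at_risk -= m
        i += m
    return ts, surv

def rmst(times, events, tau):
    """Restricted mean survival time: area under the KM step function on [0, tau]."""
    ts, surv = kaplan_meier(times, events)
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in zip(ts, surv):
        if t > tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)
    return area
```

The causal estimand is then the difference of two such areas, rmst(treated) - rmst(control), computed on confounder-adjusted (stratified or matched) samples.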
  • 13. Zhang, Jing Variance estimation for dynamic functional connectivity

    Doctor of Philosophy, Case Western Reserve University, 2024, Epidemiology and Biostatistics

    Functional connectivity (FC) is the degree of synchrony of time series between distinct, spatially separated brain regions. While traditional FC analysis assumes temporal stationarity throughout a brain scan, there is growing recognition that connectivity can change over time and is not stationary, leading to the concept of dynamic FC (dFC). Resting-state functional magnetic resonance imaging (fMRI) can assess dFC using the sliding window method with the correlation analysis of fMRI signals. Accurate statistical inference of sliding window correlation must consider the autocorrelated nature of the time series. Currently, the dynamic consideration is mainly confined to the point estimation of sliding window correlations. Using in vivo resting-state fMRI data, we first demonstrate the non-stationarity in both the cross-correlation function (XCF) and the autocorrelation function (ACF). Then, we propose a variance estimate of the sliding window correlation that accounts for the nonstationarity of the XCF and ACF. This approach provides a means to dynamically estimate confidence intervals in assessing dynamic connectivity. Using simulations, we compare the performance of the proposed method with other methods, showing the impact of dynamic ACF and XCF on connectivity inference. We further apply our proposed method to two in vivo resting-state fMRI datasets, one from healthy subjects and one from brain tumor patients, and show that additional information can be obtained for statistical inference using our method. We also map temporal fluctuations of FC in brain tumor patients and examine test-retest reliability. We demonstrate the feasibility of performing resting-state functional connectivity studies in intraoperative settings with high spatial-temporal resolution. Accurate variance estimation used in this analysis can help in addressing the critical issue of false positives and negatives.

    Committee: Abdus Sattar (Committee Chair); Xiaofeng Zhu (Advisor); Curtis Tatsuoka (Committee Member); Douglas Martin (Committee Member); Stefan Posse (Committee Member) Subjects: Biostatistics
  • 14. Sharna, Silvia Enhancing Classification on Disease Diagnosis with Deep Learning

    Doctor of Philosophy (Ph.D.), Bowling Green State University, 2024, Data Science

    The use of statistical and machine learning methods in the collection, evaluation, and presentation of biological data is very extensive. This reflects a need for precise quantitative assessment of the different types of challenges encountered in the field of healthcare. However, the sparse nature of medical data makes it hard to find hidden patterns and, as a result, makes prediction a complex task. This dissertation research discusses several biostatistical methods, including sample size determination in a balanced clinical trial, finding cohort risk from case-control information, the odds ratio, and the Cochran-Mantel-Haenszel odds ratio, along with examples and the analysis of a real-life dataset to further solidify the concepts. Moreover, different classification models — Random Forest, Gradient Boosting, Support Vector Machine (SVM), Naive Bayes, K-Nearest Neighbors (KNN), Decision Tree (DT), Logistic Regression, and Artificial Neural Network (ANN) — are applied in the analysis of the Wisconsin Breast Cancer (diagnostic and original) datasets, and their performance comparison is presented. Later, these classification models are also used in conjunction with ensemble learning methods, since ensemble methods significantly improve the predictive outcomes of the classification models. The evaluation of the classification models is measured using accuracy, AUC score, precision, and recall metrics. Among tree-based classification models, Random Forest (alone and in conjunction with ensemble learning) gives the highest accuracy, whereas in a later chapter the Artificial Neural Network gives the highest accuracy measure.

    Committee: John Chen Ph.D. (Committee Chair); Mohammadali Zolfagharian Ph.D. (Other); Umar Islambekov Ph.D. (Committee Member); Qing Tian Ph.D. (Committee Member) Subjects: Biostatistics; Statistics
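The evaluation metrics named at the end of this abstract (accuracy, AUC, precision, recall) have compact definitions worth making concrete. A stdlib-only sketch (illustrative helper functions, not the dissertation's code; AUC uses the Mann-Whitney formulation):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions matching the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall(y_true, y_pred):
    """Precision = TP/(TP+FP); recall = TP/(TP+FN), for binary 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp), tp / (tp + fn)

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (Mann-Whitney formulation), counting ties as 1/2."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Note that accuracy, precision, and recall depend on a classification threshold, while AUC summarizes ranking quality across all thresholds, which is why the two kinds of metrics can disagree on imbalanced medical data.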
  • 15. Schauner, Robert O-GlcNAcylation and Response Prediction in Acute Myeloid Leukemia: A Data-Driven Approach

    Doctor of Philosophy, Case Western Reserve University, 2024, Pathology

    Acute myeloid leukemia (AML) is the most common acute leukemia in adults, with an overall poor prognosis and high relapse rate. Multiple factors including genetic abnormalities, differentiation defects and altered cellular metabolism contribute to AML development and progression. Though the roles of oxidative phosphorylation and glycolysis are defined in AML, the role of the hexosamine biosynthetic pathway (HBP), which regulates the O-GlcNAcylation of cytoplasmic and nuclear proteins, remains poorly defined. We studied the expression of the key enzymes involved in the HBP in AML blasts and stem cells at the single-cell and bulk level. We found higher expression levels of the key enzymes in the HBP in AML as compared to healthy donors in whole blood. We also observed elevated OGT and OGA expression in AML stem and bulk cells as compared to normal HSPCs. Gene set analysis showed substantial enrichment of the NF-κB pathway in AML cells expressing high OGT levels. We found AML bulk cells and stem cells show enhanced OGT protein expression and global O-GlcNAcylation compared to normal HSPCs, validating our in silico findings. Our study suggests the HBP may prove to be a potential target, alone or in combination with other therapeutic approaches, to impact both AML blasts and stem cells. Moreover, as insufficient targeting of AML stem cells by traditional chemotherapy is thought to lead to relapse, blocking HBP and O-GlcNAcylation in AML stem cells may represent a novel promising target to control relapse. Additionally, prognostic biomarker discovery approaches based upon bulk analysis are unable to capture key attributes of rare subsets of cells that play a critical role in patient outcomes. Single-cell RNA sequencing is a powerful technique that enables the assessment of rare subsets of cells, but this technique is not amenable to clinical diagnostics. One area where improved prognostic biomarkers are important is for the management of pediatric AML patients with a FLT3-ITD genetic abnormality. 
We utilized single-cell data from the ra (open full item for complete abstract)

    Committee: Brian Cobb (Committee Chair); David Wald (Advisor); Tae Hyun Hwang (Advisor); Stanley Huang (Committee Member); Li Lily Wang (Committee Member); Clive Hamlin (Committee Member) Subjects: Biostatistics; Immunology; Oncology
  • 16. Zhao, Ruochen Template Matching Methods for Causal Inference in Complex Observational Studies

    Doctor of Philosophy, The Ohio State University, 2024, Biostatistics

    Observational studies are pivotal for inferring causal effects in clinical and health research but often face challenges due to the lack of randomization, particularly the presence of both observed and unobserved confounders. Traditional matching designs, aimed at addressing observed confounders, struggle with several issues: the specific causal estimand of interest, variability in sample sizes across treatment groups, and the dynamics of time-varying treatments. My dissertation proposes a novel, flexible matching design, inspired by template matching techniques, to overcome these challenges. This approach not only adapts more effectively to the complexities of real-world data but also enhances our understanding of dynamic treatment effects over time, thereby providing a robust framework for causal inference in observational studies.

    Committee: Bo Lu (Advisor); Xinyi Xu (Committee Member); Ai Ni (Committee Member); Eben Kenah (Committee Member) Subjects: Biostatistics; Statistics
  • 17. Pourmohammadi, Mahsa The Effect Of Cognitive Load And Visuomotor Tracking On Speech Production

    Master of Science (MS), Bowling Green State University, 2024, Communication Disorders

    The purpose of this study was to examine the interaction between cognitive demands during speech production and concurrent performance of a visuomotor tracking task. Participants performed a working memory task involving embedding a numerical response in a carrier phrase. To modulate cognitive load, participants performed two speech task variants with different degrees of mental tracking effort. For the low-demand variant, participants completed the carrier phrase by counting forward from one, a task that is relatively simple and considered automatic. For the high-demand variant, participants completed the carrier phrase by performing serial subtraction by three, requiring a modest amount of mental tracking effort. Both tasks were performed in isolation and while performing a concurrent visuomotor tracking task. Concurrent serial subtraction led to a reduction in visuomotor tracking accuracy, whereas counting forward did not affect tracking accuracy. Compared to counting forward, serial subtraction was associated with a decrease in speech intensity, lip opening and closing range, and lower lip opening and closing velocities. Compared to speaking in isolation, participants exhibited a reduction in lower lip opening and closing velocities and utterance-to-utterance variability when performing the visuomotor tracking task. This pattern suggests that increasing cognitive demands, compounded by divided attention requirements, can affect processing and speech production.

    Committee: Jason Whitfield Ph.D (Committee Chair); Alexander Goberman Ph.D (Committee Member); Adam Fullenkamp Ph.D (Committee Member) Subjects: Acoustics; Biomechanics; Biomedical Research; Biostatistics; Communication; Health; Health Care; Health Sciences; Language; Occupational Therapy; Physiology; Psychology; Scientific Imaging; Speech Therapy; Statistics
  • 18. Eck, Allison Which Chemotherapy Treatment Setting Best Predicts the 5-year Survival Rates in Women Diagnosed with Triple Negative Breast Cancer?

    Doctor of Healthcare Administration (D.H.A.), Franklin University, 2024, Health Programs

    Female breast cancer contributes to over two million cancer cases each year worldwide and remains a top contributor to mortality. Expedient treatment may mean the difference between positive and negative survivor outcomes, but when facing an aggressive subtype with no targeted treatment, how do oncologists get quick and correct care to their patients? Triple negative breast cancer (TNBC) accounts for approximately 15% of all breast cancer diagnoses each year. These patients are faced with a highly aggressive cancer that lacks positivity for all three molecular markers that can guide treatment. These patients are often younger, have multiple comorbidities, and have socioeconomic disparities that may affect their access to care. Because triple negative breast cancer is chemosensitive, polychemotherapy remains the backbone of TNBC treatment. This research delves into predictors that can affect time to treatment and survivor outcomes based on chemotherapy setting administration (adjuvant or neoadjuvant). This quantitative study utilizes the National Cancer Database – Public Use Files to identify triple negative breast cancer patients residing within the mid-Atlantic (New York, New Jersey, Pennsylvania) between 2004 and 2014 who had both chemotherapy and surgery as their treatment protocol (n = 4,528). Using generalized linear models and Cox proportional hazards regressions, the analysis found that women treated in a comprehensive community cancer program had an average time to treatment of 3.7 days from date of diagnosis. Patients who were privately insured encountered a marginal decrease in treatment time, waiting 3.5 days. Additionally, the analysis indicated that there is no significant difference in survivor outcomes at 60 months between adjuvant and neoadjuvant chemotherapy administration. Hospital administration and healthcare leaders must be capable of providing insight and support to clinicians and encourage multidisciplinary collaboration. 
This colla (open full item for complete abstract)

    Committee: Jeffrey Ferezan (Committee Chair); Cynthia Smoak (Committee Member); John Suozzi (Committee Member) Subjects: Biostatistics; Health Care; Health Care Management; Oncology
  • 19. Monabbati, Shayan AI-DRIVEN PIPELINES FOR IMPROVING CLINICAL UTILITY ACROSS CYTOPATHOLOGY & HISTOPATHOLOGY

    Doctor of Philosophy, Case Western Reserve University, 2024, EECS - System and Control Engineering

    This dissertation investigates the application of digital pathology for developing diagnostic and prognostic tools for two diseases: biliary tract adenocarcinoma and papillary thyroid carcinoma (PTC). We explore the realms of cytopathology, which studies exclusively the morphologies of epithelial cells, and histopathology, which includes the entire tissue region. Bile duct brush specimens are difficult to interpret, as they often present inflammatory and reactive backgrounds due to the local effects of stricture, atypical reactive changes, or previously placed stents, and often have low to intermediate cellularity. As a result, diagnosis of biliary adenocarcinomas is challenging and often results in large interobserver variability and low sensitivity. In this dissertation, we first used computational image analysis to evaluate the role of nuclear morphological and texture features of epithelial cell clusters to predict the presence of biliary tract adenocarcinoma on digitized brush cytology specimens. We improved the sensitivity of diagnosis with a machine learning approach from 46% to 68% when atypical cases were included and treated as nonmalignant false negatives. The specificity of our model was 100% within the atypical category. PTC is the most prevalent form of thyroid cancer, with the classical form and the follicular variant representing the majority of cases. Despite generally favorable prognoses, approximately 10% of patients experience recurrence after surgery and radioactive iodine therapy. Attempts to stratify risk of recurrence have relied on gene expression-based prognostic and predictive signatures with a focus on mutations of well-known driver genes, while hallmarks of tumor morphology have been ignored. In this dissertation, we introduce a new computational pathology approach to develop prognostic gene signatures for thyroid cancer that is informed by quantitative features of tumor and immune cell morphology. 
We show that integrating gene express (open full item for complete abstract)
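    The sensitivity and specificity figures reported in this abstract follow the standard confusion-matrix definitions. As a minimal illustrative sketch (the counts below are hypothetical, chosen only to reproduce the quoted rates, and are not taken from the study):

    ```python
    def sensitivity(tp: int, fn: int) -> float:
        """True-positive rate: TP / (TP + FN)."""
        return tp / (tp + fn)

    def specificity(tn: int, fp: int) -> float:
        """True-negative rate: TN / (TN + FP)."""
        return tn / (tn + fp)

    # Hypothetical counts: 46% sensitivity corresponds to, e.g., 46 true
    # positives against 54 false negatives; 68% to 68 against 32.
    # 100% specificity means zero false positives in the category.
    base = sensitivity(46, 54)       # 0.46
    improved = sensitivity(68, 32)   # 0.68
    spec = specificity(50, 0)        # 1.0
    ```

    Treating atypical malignant cases as nonmalignant inflates the false-negative count, which is why sensitivity is the metric most affected by how those cases are scored.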

    Committee: Kenneth Loparo (Committee Chair); Anant Madabhushi (Advisor); Satish Viswanath (Committee Member); Sylvia Asa (Committee Member); Aparna Harbhajanka (Committee Member) Subjects: Artificial Intelligence; Biomedical Engineering; Biomedical Research; Biostatistics; Computer Engineering; Medical Imaging; Oncology; Systems Design
  • 20. Talafha, Ahmad Functional Principal Component Analysis for Heterogeneous Survival Data

    Doctor of Philosophy (Ph.D.), Bowling Green State University, 2024, Statistics

    Longitudinal biomarkers offer insights into disease progression. Accordingly, during their follow-up appointments, patients' biomarker data are recorded to track these changes over time. To accurately anticipate disease progression, it is imperative to employ statistical models tailored to these longitudinal biomarker datasets, accounting for patient heterogeneity. Traditional models like the Cox proportional hazards model (Cox PH) may not fully capture the intricacies of heterogeneous, time-dependent data due to their inherent assumption of homogeneity. In response, we integrate Functional Principal Component Analysis (FPCA) and Supervised FPCA with the Cox PH mixture model to better handle these challenges. This integration uses FPCA to extract meaningful features from longitudinal biomarker data, while Supervised FPCA is employed to improve the relevance of these features to patient outcomes; the extracted features then serve as covariates in a Cox PH mixture model to conduct dynamic predictions. To enhance model adaptability to heterogeneous patient subgroups, we extend the Cox PH framework by incorporating dynamic penalty functions, specifically the Smoothly Clipped Absolute Deviation (SCAD) and the Minimax Concave Penalty (MCP), into a mixture model setting. This approach helps to mitigate the assumption of homogeneity among patient groups. Additionally, we study a modified Expectation Maximization (EM) algorithm, tailored for our Cox PH mixture model, which facilitates the concurrent estimation of model parameters and determination of the appropriate number of mixture components. Our approach provides a structured method for analyzing longitudinal biomarker and survival data, enabling more nuanced predictions that can adapt as new biomarker information becomes available. 
Through simulation studies and real-world data application, we demonstrate the utility of our method, though noting its predictive performance compared to traditional methods warrants careful (open full item for complete abstract)
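    The first stage of the pipeline this abstract describes, extracting FPC scores from longitudinal trajectories for use as Cox model covariates, can be sketched with plain NumPy. This is a hedged illustration on simulated, densely observed curves (all data and dimensions are hypothetical); real follow-up data are sparse and irregular and would need presmoothing, and the dissertation's actual estimation (Supervised FPCA, the EM algorithm, and the SCAD/MCP penalties) is not reproduced here:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Simulate biomarker trajectories for n patients on a common time grid,
    # driven by two latent modes of variation: overall level and slope.
    n, T = 200, 25
    t = np.linspace(0.0, 1.0, T)
    true_scores = rng.normal(size=(n, 2)) * np.array([2.0, 1.0])
    X = 1.0 + true_scores[:, :1] * np.ones_like(t) + true_scores[:, 1:] * t
    X += rng.normal(scale=0.2, size=(n, T))   # measurement noise

    # FPCA on a dense grid reduces to an eigendecomposition of the
    # sample covariance of the centered curves.
    mu = X.mean(axis=0)
    Xc = X - mu
    cov = Xc.T @ Xc / (n - 1)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]           # largest eigenvalues first
    evals, evecs = evals[order], evecs[:, order]

    K = 2                                     # retain leading components
    phi = evecs[:, :K]                        # discretized eigenfunctions
    fpc_scores = Xc @ phi                     # per-patient FPC scores
    var_explained = evals[:K].sum() / evals.sum()
    ```

    In the full method, `fpc_scores` (one low-dimensional row per patient) would enter a Cox PH mixture model as covariates, with SCAD or MCP penalties selecting which components matter within each latent subgroup.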

    Committee: John Chen (Committee Chair); Kei Nomaguchi (Other); Riddhi Ghosh (Committee Member); Umar Islambekov (Committee Member) Subjects: Biostatistics; Statistics