Master of Science in Biomedical Sciences (MSBS), University of Toledo, 2019, Biomedical Sciences (Bioinformatics and Proteomics/Genomics)
Every human carries about 100 novel mutations that are absent from the genomes of his or her parents. This intense influx of mutations degrades the information stored in DNA sequences and, at the same time, provides an opportunity for the creation of new genetic messages. Currently, over one hundred million mutations have been characterized in public databases. The dynamics of mutation have been investigated for decades, both experimentally and with sophisticated mathematical models, yet our understanding of genome evolution is still ambiguous. In this project, we computationally processed eighty million human mutations to get clear answers to basic questions about DNA evolution. Specifically, how is the non-randomness of nucleotide composition maintained across vast genomic regions? What biological forces protect sequence non-randomness from being degraded by novel mutations? Our goal was to uncover peculiarities in the dynamics of G+C nucleotide content and to evaluate the equilibrium GC-percentage of the human genome.
We found that novel mutations that convert G:C pairs into A:T pairs are 1.39 times more frequent than the opposite mutations that change A:T → G:C. This effect is more striking if we take into account the fact that the total number of G:C pairs (42%) is significantly less than the number of A:T pairs (58%). Hence, calculated per nucleotide pair, G:C → A:T mutations are 1.93 times more frequent than A:T → G:C mutations. Such a bias should leave fewer and fewer G:C pairs in the genome from generation to generation, until GC-composition reaches equilibrium at 34%. However, the GC-percentage of the human genome is stable at 42%. There are two possible biological processes that may be responsible for preserving GC-composition from degradation: i) natural selection or ii) biased gene conversion. Yet the estimated parameters for both processes are unable to explain the maintenance of GC-percentage. We re-evaluated the biased gene conversion paramete (open full item for complete abstract)
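The arithmetic behind the per-pair bias and the 34% equilibrium can be reproduced with a minimal sketch. It assumes a simple two-state model in which only G:C → A:T and A:T → G:C mutations change GC-content, with no selection and no biased gene conversion; this is an illustration of the reasoning, not necessarily the calculation used in the thesis. The function names are hypothetical.

```python
def per_pair_bias(raw_ratio, gc_fraction):
    """Convert a raw count ratio (G:C->A:T over A:T->G:C) into a per-pair rate ratio.

    Dividing each mutation count by the number of pairs available to mutate
    rescales the raw ratio by (A:T fraction) / (G:C fraction).
    """
    at_fraction = 1.0 - gc_fraction
    return raw_ratio * at_fraction / gc_fraction


def equilibrium_gc(per_pair_ratio):
    """Equilibrium GC fraction under the assumed two-state mutation model.

    At equilibrium, u * f = v * (1 - f), where u and v are the per-pair rates
    of G:C->A:T and A:T->G:C. With r = u / v this gives f = 1 / (1 + r).
    """
    return 1.0 / (1.0 + per_pair_ratio)


# Values quoted in the abstract: raw ratio 1.39, GC fraction 42%.
bias = per_pair_bias(1.39, 0.42)
print(round(bias, 2))                  # 1.92
print(round(equilibrium_gc(1.93), 2))  # 0.34
```

With the rounded inputs quoted above, the per-pair ratio comes out at about 1.92 rather than 1.93; the small difference presumably reflects rounding of the published figures. Either value gives an equilibrium GC fraction of about 34%.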
Committee: Alexei Fedorov (Committee Chair); Robert Blumenthal (Committee Member); Sadik Khuder (Committee Member)
Subjects: Bioinformatics; Biology