Date of Award

2026

Document Type

Thesis

Degree Name

Bachelor of Science

Department

Biochemistry & Molecular Biology

First Advisor

Dr. Melinda A. Yang

Abstract

Human evolutionary genomics provides an avenue to understand how genetic variation has been shaped by the environmental pressures of the past, revealing how selection acted on past populations and contributed to present-day biological diversity. In this thesis, I describe the construction of a Python-based computational pipeline designed to automate and integrate multiple analyses of positive selection across worldwide population datasets, including derived allele frequency calculation, haplotype-based tests of selection, RELATE-based genealogical inference, and CLUES2-based temporal modeling. By automating file preparation, format conversion, job submission, result aggregation, model comparison, and visualization, this pipeline streamlines data interpretation and shifts focus from troubleshooting toward examining the timing and intensity of positive selection for a chosen allele. I then apply this pipeline to ADH1B*2, an alcohol dehydrogenase variant previously implicated in positive selection, to investigate the geographic distribution and timing of selection acting on this allele. Across worldwide populations, ADH1B*2 exhibited substantial geographic heterogeneity, with the highest derived allele frequencies and the strongest evidence of selection observed ~5,000 years ago among East and Southeast Asian populations. Taken together, these findings support a model in which ADH1B*2 was initially present at lower frequencies before a relatively recent intensification of positive selection, suggesting that selection on this allele may reflect later, population-specific regional pressures and histories rather than a single selective pressure acting uniformly across East Asia.
