Examining the Consequences of Ignoring Diversity in Genomic Data
- Pre-Collegiate Global Health Review

- Feb 22, 2024
By Akansha Sallakonda, Dublin High School, Dublin, California, USA

The Human Genome Project, a thirteen-year-long initiative, aimed to build a comprehensive picture of the human genome. Revolutionizing translational medicine, it gave researchers the ability to study the human genome at a molecular level and to work towards cures for diseases that were previously untreatable or even undiagnosable.
However, the Human Genome Project sampled only a handful of volunteers, the majority of whom were of European descent (Mapes, 2019). The same imbalance is blatantly reflected in recent genetics research, especially in genome-wide association studies (GWAS), which study the correlation between genetic variations and biological characteristics to better understand possible causes of numerous diseases. In fact, over 78% of individuals studied in GWAS are European, even though Europeans make up only 16% of the world population (Sirugo et al., 2019).

Figure 1: Distribution of GWAS. Left image categorizes by the number of studies conducted, while the right categorizes by the total number of individuals in such studies. Both confirm disparities in GWAS (Sirugo et al., 2019).
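To make the scale of this gap concrete, the two percentages cited above can be turned into a simple representation ratio (a group's share of study participants divided by its share of the world population). The short Python sketch below is only an illustration: the function name `representation_ratio` is hypothetical, and lumping every non-European group into one remainder is a simplifying assumption made here, not a breakdown reported by Sirugo et al. (2019).

```python
def representation_ratio(share_of_participants: float, share_of_population: float) -> float:
    """Return a group's share of study participants divided by its share of the population.

    A value above 1 means the group is over-represented; below 1, under-represented.
    """
    if share_of_population <= 0:
        raise ValueError("share_of_population must be positive")
    return share_of_participants / share_of_population


# Figures quoted in the article (Sirugo et al., 2019).
european_gwas_share = 0.78    # "over 78%" of GWAS participants, so this is a lower bound
european_world_share = 0.16   # share of the world population

european_ratio = representation_ratio(european_gwas_share, european_world_share)

# Assumption for illustration only: treat everyone else as a single group,
# with shares obtained by subtraction from the two figures above.
rest_ratio = representation_ratio(1 - european_gwas_share, 1 - european_world_share)

print(f"European ancestry: {european_ratio:.1f}x their population share")
print(f"All other ancestries combined: {rest_ratio:.2f}x their population share")
```

Under these assumptions, people of European ancestry appear in GWAS at nearly five times their share of the world population, while everyone else combined appears at roughly a quarter of theirs.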
Findings on genetic variation that are based on European populations cannot be accurately applied to other ethnic groups. The Population Architecture using Genomics and Epidemiology (PAGE) study found, in a GWAS of 26 clinical and behavioral phenotypes, that there were “substantial benefits” in examining diverse populations to find critical variants, many of which may be rare or even absent in European populations (Wojcik et al., 2019). Because genomic variation across world populations remains under-studied, it has become clear that multi-racial sample populations are essential both to understanding disease and to developing treatments for it.
GWAS have shown that for 25% of variants associated with diabetes in European populations, the strength of the correlation varies for one in five non-European populations (Popejoy & Fullerton, 2016). Essentially, the variants that indicate increased risk of diabetes in people of European ancestry are less accurate predictors for people of non-European descent. Genetic variation among diverse populations extends to drug metabolism, and it can have dangerous consequences. The drug tamoxifen, used in treating breast cancer, is metabolized by the enzyme that the gene CYP2D6 encodes. However, the more than 100 alleles of this gene occur at different frequencies in different populations, affecting an individual patient’s ability to use the drug safely (Popejoy & Fullerton, 2016). These disparities are also ubiquitous in genetic testing: patients of African and Asian ancestry are especially likely to receive ambiguous test results, namely false positives, in comparison to their European counterparts (Petrovski & Goldstein, 2016).
Such ambiguous results from multiple-gene panel testing are known as variants of uncertain significance, or VUS. Individuals from racial minority groups are especially likely to receive a VUS because early genetic sequencing was predominantly conducted in white people; in fact, over one-third of a group of racially diverse people received a VUS, compared to only one-fourth of white people (Caswell-Jin et al., 2018). This can have devastating consequences, leading to needless surgeries, medical procedures, or treatments. In a survey of over 7000 women who had undergone multiple-gene panel testing, 15% of those who received a VUS for an ovarian cancer-related gene had their ovaries and fallopian tubes removed. However, many experts say such surgeries are unwarranted because a VUS should not be used for medical decisions (Bennett, 2021). Only by diversifying and expanding genomic datasets can we hope to avoid unnecessary and irreversible medical procedures; otherwise, we risk exacerbating existing medical disparities.
Part of the ineffectiveness of genetic testing in marginalized populations results from a failure to translate findings on genetic variation into clinical medicine and drug development. Recent research has shown, for example, that African Americans with West African ancestry are more likely to develop diseases such as heart disease and diabetes, and to respond differently to treatments, as a result of varied gene expression (Park et al., 2019). An epigenetic, population-based study of families in Suihua, China, found that prenatal exposure to the widespread famine between 1959 and 1961 increased the risk of hyperglycemia in two consecutive generations (Li et al., 2017). Similar patterns can be traced in people of South Asian descent, whose ancestors were repeatedly exposed to at least 24 devastating famines within the span of 50 years. Exposure to even one famine has multi-generational genetic consequences, increasing the risk of numerous metabolic disorders such as heart disease and diabetes (Bakar, 2022). These are only a few examples of genetic variation in non-European populations. Considering the rapidly growing potential of genetics in translational medicine, how can one hope to reach marginalized communities when science has yet to understand their various genetic predispositions to disease?
Research confirms that diversifying “genetic discovery datasets,” as opposed to simply increasing sample size, is the “single most efficient approach” to painting a more holistic picture of the genome (Graham et al., 2021). Yet one significant barrier remains: medical mistrust. From past exploitation of the African American community in the United States (such as in the Tuskegee syphilis study) to the unethical practices of colonial science, many communities have understandable reasons to be suspicious of science. Because local communities participating in clinical trials are the ones who bear the consequences, whether positive or negative, they deserve a greater share in the clinical design. In a framework for community-led science, researchers emphasized the importance of co-learning, transparent and collaborative decision-making, and a truly reciprocal relationship (Wand et al., 2023).

Figure 2: Community-led design, adapted from Wong-Gates, n.d. (Wand et al., 2023).
Genomics researchers should not only build trust but also demonstrate it in a few different ways: acknowledging the power imbalance between researcher and community, making it easy for patients to withdraw or opt out, and developing linguistic as well as cultural competence (Atutornu et al., 2022).
References
Bakar, F. (2022, March 13). How History Still Weighs Heavy On South Asian Bodies Today. HuffPost UK. Retrieved April 9, 2023, from https://www.huffingtonpost.co.uk/entry/south-asian-health-colonial-history_uk_620e74fee4b055057aac0e9f
Bennett, C. (2021, February 7). Ambiguous genetic test results can be unsettling. Worse, they can lead to needless surgeries. The Washington Post. https://www.washingtonpost.com/health/genetic-tests-uncertain-results/2021/02/05/80a06d9a-65a2-11eb-8468-21bc48f07fe5_story.html
Caswell-Jin, J. L., Gupta, T., Hall, E., et al. (2018). Racial/ethnic differences in multiple-gene sequencing results for hereditary cancer risk. Genetics in medicine : official journal of the American College of Medical Genetics, 20(2), 234–239. https://doi.org/10.1038/gim.2017.96
Hood, L., & Rowen, L. (2013, September 13). The Human Genome Project: big science transforms biology and medicine. Genome Medicine. Retrieved April 9, 2023, from https://genomemedicine.biomedcentral.com/articles/10.1186/gm483
Li, J., Liu, S., Li, S., et al. (2017). Prenatal exposure to famine and the development of hyperglycemia and type 2 diabetes in adulthood across consecutive generations: a population-based cohort study of families in Suihua, China. The American Journal of Clinical Nutrition, 105(1), 221–227. https://doi.org/10.3945/ajcn.116.138792
Mapes, D. (2019, June 19). Lack of diversity in genetic research is a problem. Fred Hutchinson Cancer Center. Retrieved April 9, 2023, from https://www.fredhutch.org/en/news/center-news/2019/06/lack-diversity-genetic-research-problem.html
Park, C.S., De, T., Xu, Y. et al. Hepatocyte gene expression and DNA methylation as ancestry-dependent mechanisms in African Americans. npj Genom. Med. 4, 29 (2019). https://doi.org/10.1038/s41525-019-0102-y
Petrovski, S., Goldstein, D.B. Unequal representation of genetic variation across ancestry groups creates healthcare inequality in the application of precision medicine. Genome Biol 17, 157 (2016). https://doi.org/10.1186/s13059-016-1016-y
Popejoy, A. B., & Fullerton, S. M. (2016, October 12). Genomics is failing on diversity. Nature, 538(7624), 161–164. Retrieved April 9, 2023, from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5089703/
Rice, D., & Galbraith, M. (2008, November 16). .,. ., - YouTube. Retrieved April 9, 2023, from https://www.thelancet.com/journals/ebiom/article/PIIS2352-3964(22)00063-9/fulltext#seccesectitle0009
Sirugo, G., Williams, S. M., & Tishkoff, S. A. (2019, March 21). The missing diversity in human genetic studies. Cell, 177(1), 26–31. https://www.cell.com/cell/fulltext/S0092-8674(19)30231-4
Wand, H., Martschenko, D. O., Smitherman, A., Michelson, S., et al. (2023). Re-envisioning community genetics: community empowerment in preventive genomics. Journal of community genetics, 1–11. Advance online publication. https://doi.org/10.1007/s12687-023-00638-y
Wojcik, G.L., Graff, M., Nishimura, K.K. et al. (2019). Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518. https://doi.org/10.1038/s41586-019-1310-4
Wong-Gates, R. (n.d.) Client Centered Therapy. Rennet Wong Gates. https://www.rennetwonggates.com/client-centered-therapy/



