Pre-Collegiate Global Health Review

Examining the Consequences of Ignoring Diversity in Genomic Data

By Akansha Sallakonda, Dublin High School, Dublin, California, USA

The Human Genome Project, a thirteen-year-long initiative, aimed to build a comprehensive picture of the human genome. Revolutionizing translational medicine, it gave researchers the ability to study the human genome at a molecular level and to work towards cures for diseases that were previously untreatable or even undiagnosable.

However, the Human Genome Project sampled only a handful of volunteers, the majority of whom were of European descent (Mapes, 2019). This imbalance is plainly reflected in recent genetics research, especially in genome-wide association studies (GWAS), which examine the correlation between genetic variations and biological characteristics in order to understand possible causes of numerous diseases. In fact, over 78% of individuals studied in GWAS are European, even though Europeans make up only 16% of the world population (Sirugo et al., 2019).

Figure 1: Distribution of GWAS. Left image categorizes by the number of studies conducted, while the right categorizes by the total number of individuals in such studies. Both confirm disparities in GWAS (Sirugo et al., 2019). 
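The size of this gap is easier to see as a ratio. The short sketch below is only an illustration: it assumes the two percentages cited from Sirugo et al. (2019) describe the same group, treats "over 78%" as exactly 78%, and uses a hypothetical helper named `representation_ratio`. A ratio of 1 would mean a group appears in GWAS in proportion to its share of the world population.

```python
def representation_ratio(share_in_studies: float, share_in_population: float) -> float:
    """Return a group's share of study participants divided by its share of the population."""
    if not 0 < share_in_population <= 1:
        raise ValueError("share_in_population must be in (0, 1]")
    return share_in_studies / share_in_population


# Figures quoted in the text (Sirugo et al., 2019); 78% is treated as a lower bound.
EUROPEAN_SHARE_OF_GWAS = 0.78
EUROPEAN_SHARE_OF_WORLD = 0.16

european = representation_ratio(EUROPEAN_SHARE_OF_GWAS, EUROPEAN_SHARE_OF_WORLD)

# Everyone else combined: the remaining share of participants and of the population.
non_european = representation_ratio(
    1 - EUROPEAN_SHARE_OF_GWAS, 1 - EUROPEAN_SHARE_OF_WORLD
)

print(f"European ancestry: {european:.1f} times its population share")
print(f"All other ancestries combined: {non_european:.2f} times their population share")
```

Under these assumptions, people of European ancestry are represented at nearly five times their share of the world population, while everyone else combined is represented at roughly a quarter of theirs.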

Findings regarding genetic variations that are based on European populations cannot be accurately applied to other ethnic groups. In a GWAS of 26 clinical and behavioral phenotypes, the Population Architecture using Genomics and Epidemiology (PAGE) study found “substantial benefits” in examining diverse populations to find critical variants, many of which may be rare or even absent in European populations (Wojcik et al., 2019). Because genomic variations in DNA across world populations remain under-studied, it has become clear that multi-racial sample populations are increasingly essential to understanding and developing treatments for disease.

GWAS have found that, for 25% of the variants associated with diabetes in European populations, the strength of the association differs in one in five non-European populations (Popejoy & Fullerton, 2016). Essentially, the variants that indicate increased risk of diabetes in people of European ancestry are less accurate predictors for people of non-European descent. Genetic variation among diverse populations extends to drug metabolism, and it can have dangerous consequences. The drug tamoxifen, used in treating breast cancer, is metabolized by the enzyme encoded by the gene CYP2D6. However, the more than 100 alleles that exist for this gene occur at different frequencies in different populations, affecting an individual patient’s ability to safely use the drug (Popejoy & Fullerton, 2016). The same disparity is widespread in genetic testing: patients of African and Asian ancestry are especially likely to receive ambiguous test results, namely false positives, in comparison to their European counterparts (Petrovski & Goldstein, 2016).


Such ambiguous results from multiple-gene panel testing are known as variants of uncertain significance, or VUS. Individuals from racial minority groups are especially likely to receive a VUS because early genetic sequencing was predominantly conducted in white people; in fact, over one-third of a group of racially diverse people received a VUS, compared to only one-fourth of white people (Caswell-Jin et al., 2018). This can have devastating consequences, leading to needless surgeries, medical procedures, or treatments. In a survey of over 7000 women who had undergone multiple-gene panel testing, 15% of those who received a VUS for an ovarian cancer related gene had their ovaries and fallopian tubes removed. However, many experts say such surgeries are unwarranted because a VUS should not be used for medical decisions (Bennett, 2021). Diversifying and expanding genomic datasets is the only way to avoid such unnecessary and irreversible medical procedures; otherwise, we risk exacerbating existing medical disparities.
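To put the VUS rates cited from Caswell-Jin et al. (2018) in relative terms, the sketch below compares the two fractions directly. It is an illustration only: it treats "over one-third" as exactly one-third, so the true gap may be larger, and the variable names are chosen here for readability rather than taken from the study.

```python
from fractions import Fraction

# Approximate rates quoted in the text (Caswell-Jin et al., 2018).
# "Over one-third" is treated as exactly 1/3, so these results are lower bounds.
vus_rate_diverse_group = Fraction(1, 3)
vus_rate_white_group = Fraction(1, 4)

# How many times more likely a VUS was in the racially diverse group.
relative_rate = vus_rate_diverse_group / vus_rate_white_group

# How many more people per 100 tested received a VUS.
extra_per_hundred = (vus_rate_diverse_group - vus_rate_white_group) * 100

print(f"Relative rate: {float(relative_rate):.2f}")
print(f"Additional VUS results per 100 people tested: {float(extra_per_hundred):.1f}")
```

On these rounded figures, a person in the racially diverse group was at least a third more likely to receive a VUS, which works out to roughly eight additional ambiguous results for every hundred people tested.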


Part of the ineffectiveness of genetic testing in marginalized populations results from a failure to translate findings on genetic variation into clinical medicine and drug development. Recent research has shown, for example, that African Americans with West African ancestry are more likely to develop diseases such as heart disease and diabetes, and to respond differently to treatments, as a result of varied gene expression (Park et al., 2019). An epigenetics population-based study of families in Suihua, China, found that prenatal exposure to the widespread famine between 1959 and 1961 increased the risk of hyperglycemia in two consecutive generations (J. Li et al., 2016). Similar patterns can be traced in people of South Asian descent, whose ancestors were repeatedly exposed to at least 24 devastating famines within the span of 50 years. Exposure to even one famine has multi-generational genetic consequences, increasing the risk of numerous metabolic disorders such as heart disease and diabetes (Bakar, 2022). These are only a few examples of genetic variation in non-European populations. Considering the rapidly growing potential of genetics in translational medicine, how can one hope to reach marginalized communities when science has yet to understand their various genetic predispositions to disease?


Research suggests that diversifying “genetic discovery datasets,” as opposed to simply increasing sample size, is the “single most efficient approach” to painting a more holistic picture of the genome (Graham et al., 2021). Yet one significant barrier remains: medical mistrust. From past exploitation of the African American community in the United States (such as in the Tuskegee syphilis study) to the unethical practices of colonial science, many communities have understandable reasons to be suspicious of science. Because local communities participating in clinical trials are the ones who bear the consequences, whether positive or negative, they deserve a greater share in the design of those trials. In a framework for community-led science, researchers emphasized the importance of co-learning, transparent and collaborative decision-making, and a truly reciprocal relationship (Wand et al., 2023).

Figure 2: Community-led design, adapted from Wong-Gates, n.d. (Wand et al., 2023).

Genomics researchers should not only build trust but also demonstrate it in a few different ways: acknowledging the power imbalance between researcher and community, making it easy for patients to withdraw and opt out, and having linguistic as well as cultural competence (Atutornu et al., 2022).


