Infographic for ClinGen Ancestry and Diversity Working Group

New ELSIconversations with the Ancestry and Diversity Working Group of ClinGen


Alice B. Popejoy

The Ancestry and Diversity Working Group (ADWG) of the Clinical Genome Resource (ClinGen) was founded in 2017, with the goal of providing guidance about how population descriptors such as race, ethnicity, and ancestry (REA) should (or should not) be used in clinical genetics.

In collaboration with ELSIhub, ClinGen is launching its first ELSIconversations series through this platform to broaden the scope of community engagement across disciplines and ensure that the process of developing guidelines in partnership with genomics professional societies is informed by robust, transdisciplinary deliberations.

With the National Academies of Sciences, Engineering, and Medicine (NASEM) committee on the use of population descriptors in genomics research underway, we have a timely opportunity to share what we have learned so far and to build on this momentum. Our goal is to support the NASEM effort, which will not include clinical genetics in its guidance, because recommendations about basic research will need to be consistent with those we are developing for clinical genetics and precision medicine.

ClinGen’s ADWG set out to create a group dynamic that would foster cross-disciplinary understanding and collaboration among clinical genetics professionals, statistical and human population genomics researchers, bioethicists, social scientists, and historians. This was a strategic decision based on the inherently challenging and interdisciplinary nature of the task at hand; the breadth and depth of expertise among our members and contributors have been key.

Our work began in 2017, when genome scientists were widely aware of the lack of diversity in basic research but had little understanding of how genome-wide association studies (GWAS) conducted on mostly European genomic backgrounds translate to disparities in clinical genetics. To enable the development of rigorous, evidence-based guidance on the use of REA in clinical genetics, we first needed to establish a baseline understanding of definitions, data, policies, and practice. We started by analyzing REA frameworks on clinical forms and in genomic databases, then quantified information disparities in the genomic knowledgebase, and finally interviewed or surveyed clinical genetics professionals and researchers to understand where we are as a field.

The AMP/ACMG Guidelines for clinical variant interpretation include criteria on population-level observations of variants, which maintain that a low allele frequency (or absence) in population databases supports interpreting a variant as pathogenic. However, the lack of diversity in these databases means that some variants may simply be missing because the data are incomplete, not because the variant is truly rare or absent across all populations. We found that most variants in genomic databases come from European populations, and we characterize this as an “information disparity” that may bias clinical variant interpretation.
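To make the reasoning concrete, here is a minimal sketch of why absence from a sparsely sampled population is weak evidence of rarity. It is a standard binomial illustration, not the ADWG's analysis or any part of the AMP/ACMG criteria. The function name and the sample sizes are hypothetical, chosen only for illustration.

```python
def max_frequency_if_unobserved(n_alleles: int, confidence: float = 0.95) -> float:
    """Upper bound on a variant's allele frequency given zero observations.

    If the true frequency were p, the chance of seeing no copies among
    n_alleles independently sampled alleles is (1 - p) ** n_alleles.
    Setting that equal to (1 - confidence) and solving for p gives the bound.
    """
    if n_alleles <= 0:
        raise ValueError("n_alleles must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_alleles)


# Hypothetical sample sizes, not taken from any real database.
well_sampled = max_frequency_if_unobserved(100_000)
sparsely_sampled = max_frequency_if_unobserved(1_000)

print(f"Unobserved in 100,000 alleles: frequency could be up to {well_sampled:.6f}")
print(f"Unobserved in 1,000 alleles:   frequency could be up to {sparsely_sampled:.6f}")
```

At 95% confidence the bound is roughly 3 divided by the number of sampled alleles, so a variant that is unobserved in a population with a hundredfold fewer sampled alleles could be about a hundred times more common than the same observation would imply in a well-sampled population.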

The knowledge and data available to inform clinical variant interpretation and other functions of clinical genetics are crucial to determining outcomes. Uncertainty in the interpretation of clinical genetic tests falls disproportionately on people whose genomic background is not well represented in population databases. This means that global communities who have been excluded from genomics research also experience less benefit and more harm from clinical genetic testing than European and European-descended groups.

In addition to information disparities in clinical genetics, a major source of uncertainty that may lead to differences in utility across patient populations is inconsistency in the way patient identities and population descriptors are represented and understood. We analyzed test requisition forms from clinical genetics laboratories and found that no two forms used the same language or categories to describe populations.

These categories are also different from the groups represented in population databases such as gnomAD, and it is unclear how people conducting clinical variant interpretation deal with this discrepancy between information about patients and the data they use to make assertions about the pathogenicity of variants. By surveying clinical genetics professionals about the meaning of terminology used to describe human populations, we learned that there is just as much confusion among practitioners as there is heterogeneity in the forms that they use to order genetic tests.

Nevertheless, most clinical genetics professionals we surveyed indicated that ‘race’, ‘ethnicity’, and ‘ancestry’ are all at least somewhat important for clinical variant interpretation – despite not being certain how to distinguish among these terms. Ancestry appears to be the most important for this purpose, according to our survey results, though this information is the least likely to be available in a clinical setting. Instead, 94% of survey respondents said that the REA data they have is based on a patient or provider indicating race or ethnicity on a clinical form.

The use case in this article – clinical variant interpretation – is important. There are, however, many other scenarios in the domain of clinical genetics practice in which patient identification based on REA, the genomic background of patients, and population descriptors in databases and on clinical forms or electronic health records (EHRs) play a role.

The ClinGen ADWG founding Co-Chair, Dr. Alice B. Popejoy, has been crowd-sourcing clinical genetics scenarios in which REA may be involved (i.e., ‘use cases’) through invited talks with professional societies and clinical research grand rounds, which she has repurposed as interactive seminars using a combination of online survey tools. Carlee Dawson, a genetic counseling student at Bay Path University, is conducting her M.S. thesis with Dr. Popejoy (as an external advisor) using data generated through a workshop they designed for the 2021 Wisconsin Genetics Exchange. This work, combined with the collective efforts of the ClinGen ADWG to generate a comprehensive evidence base for developing recommendations on the use of REA in clinical genetics, will serve as the foundation of a new series of ELSIconversations.

The initial ADWG ELSIconversations, held on April 29, May 6, and May 20, 2022, comprise a series of three events on population descriptors (e.g., race, ethnicity, and/or ancestry) on clinical genetics laboratory requisition forms. The first session will provide a foundation of knowledge about what information is represented on these forms and how it is used in clinical genetics practice. The second will dive more deeply into how the information is used and for what purpose, to reveal which information is most critical for clinical genetics practice. Finally, the third session will focus on the conditions and considerations for making widespread changes to the way populations are represented on forms. We invite ELSI scholars, along with genetic and genomic researchers and those in related disciplines, to join us for these ELSIconversations and add their voices to interactive discussions of population descriptors in clinical care and genomics.

Register below for the ADWG ELSIconversations:

Please sign up here to receive notifications about the upcoming ELSIconversations series, and email ELSIhub with any questions or suggestions.