Would you post your genome online? A debate about privacy risks in genomic research

by Jackie Tekiela, MS, CIP, Institutional Review Board (IRB) Administrator at Wheaton Franciscan Healthcare

I have to admit, I love a great debate, and the session titled “A Great Debate: Be it Resolved That Large-Scale Genomic Research Poses Special Privacy Risks to Research Subjects” at the 2012 Advancing Ethical Research Conference was no exception. The session explored the ethical issues and privacy concerns related to genetic information identified through research, such as large-scale genome sequence data. A considerable portion of the debate focused on whether or not genetic data was actually identifiable.

Jeffrey R. Botkin, MD, MPH, took the position that genetic information itself is not identifiable and genetic data does not pose greater risks to privacy than other research data. He suggested that current regulations are adequate to protect the privacy of research participants involved in genetic research.

The question, Dr. Botkin argued, is not whether or not identification is feasible. Current regulation and guidance require that the risk of individual identification be low, but not zero. While DNA sequences could be used to identify individuals, a sequence alone is unlikely to predict health information. The potential risk comes with the ability to link a sequence to reference databases, which are not readily available. Additionally, re-identification of a sequence is not trivial and requires significant expertise and motivation.

In closing, Dr. Botkin noted that, while genomic data doesn’t pose special privacy risks, there is a need to balance the use of genomic data, like any other data, with human subject protections. He also acknowledged that it is necessary to evaluate safeguards as technology progresses. In addition, he stressed the importance of not overemphasizing risks to possible subjects and working to de-stigmatize genomics research.

For the opposing argument, Latanya Sweeney, PhD, proposed that there are unique risks inherent in genomic research. Dr. Sweeney argued that genetic material cannot be de-identified, that its exposure can lead to direct harm to research participants, and that additional steps should be considered to safeguard the privacy of participants in genomic research. Before electronic records were commonplace, it was believed that demographic data could not be identified. Dr. Sweeney has shown that 87% of the U.S. population is identifiable by zip code, birth date, and gender alone.

The threats of re-identification of genomic data are real. Dr. Sweeney provided the example of a 1996 study in which one-third of Fortune 500 companies reported using medical information when making hiring or firing decisions. Life insurance discrimination (which is not prohibited by the Genetic Information Nondiscrimination Act, or GINA) could also occur if genomic data is used inappropriately. Dr. Sweeney also cited possible concerns as genomic research continues to progress, such as courts requiring researchers to use genomic databases as reference databases for identifying individuals.

Dr. Sweeney also gave a number of examples of how individuals are willing to share data and how “big data” can be used to identify individuals. She noted that people’s expectations about privacy are changing, a shift that will force us to consider what informed consent truly means.

Both debaters presented some thought-provoking points to consider as the field of genomic research moves forward. Many related issues abound in the field of internet research, as discussed by fellow blogger Andrea Johnson in “Was that anonymous internet survey really anonymous?”