An NIH Data Management and Sharing (DMS) Policy Primer for the PRIM&R Community

By Elena Ghanaim and Jonathan Lawson

The NIH Data Management and Sharing (DMS) Policy is a watershed moment for the sharing of data generated with federal funds. The DMS Policy expands expectations for NIH-funded researchers to establish plans for DMS and to comply with those plans over the course of an award in a manner that maximizes the appropriate sharing of scientific data.

Do IRBs play a role in data sharing under the NIH DMS Policy?
Absolutely! IRBs play a significant and leading role in supporting the aims of the DMS Policy by providing guidance to make sure that consent forms speak clearly to permissions for data sharing. The IRB-approved consent form is foundational to complying with the DMS Policy and to maximizing appropriate sharing of valuable scientific data, because investigators’ plans for sharing data must be consistent with the provisions of the approved consent language.

NIH has provided Supplemental Guidance on Protecting Privacy When Sharing Human Research Participant Data, which is a helpful resource for IRBs considering how best to protect participants while maximizing data sharing. The best practices for research investigators include reminders to a) apply appropriate de-identification, b) establish scientific data sharing and use agreements, and c) understand and communicate any legal protections against disclosure and misuse.

Data Management and Sharing Plans
Given that Data Management and Sharing Plans (DMSPs) are drafted pre-award and become a Term and Condition of an NIH award, investigators should work with their IRBs to make sure that consent forms and DMSPs are aligned, so that the institution avoids negative repercussions for noncompliance with the Policy. In particular, consent forms should not be more restrictive on data sharing than the DMSP. Investigators may wish to consult with their IRBs when drafting DMSPs, to ensure that plans will be consistent with the eventual research protocol and associated informed consent materials. If specific communities are being recruited (e.g., underserved communities, certain races or ethnicities, Tribal communities), the researcher might consider consulting the community on its views on and preferences for data sharing. In advance of this new Policy, the NIH conducted a formal Tribal Consultation, which resulted in Supplemental Guidance on Responsible Management and Sharing of American Indian/Alaska Native Participant Data.

Informed Consent
The role that IRBs play in reviewing and approving consent forms is an opportunity to ensure that the manner in which data can and will (and cannot and will not) be shared is described in clear terms, especially in light of the DMS Policy. Researchers are expected to ensure that participants understand whether and how their data will be shared beyond the context of the individual study in which they are participating, and what this may mean for their decision to participate. While the new DMS Policy does not specify any consent requirements beyond those already set by U.S. federal regulations, it does “[encourage] researchers to plan for how data management and sharing will be addressed in the informed consent process, including communicating with prospective participants how their scientific data are expected to be used and shared.”

That said, the consent expectation of the existing NIH Genomic Data Sharing (GDS) Policy has not changed. It is important to remember that studies initiated after the effective date of the GDS Policy are expected “to obtain participants’ consent for their genomic and phenotypic data to be used for future research purposes and to be shared broadly. The consent should include an explanation about whether participants’ individual-level data will be shared through unrestricted- or controlled-access repositories.”

As a starting point, there are several resources for crafting informed consent for data sharing. For instance, the NIH has provided points to consider and sample language for future use and/or sharing of data and biospecimens, and guidance on consent for data subject to the NIH GDS Policy. In addition, the National Human Genome Research Institute (NHGRI) maintains an informed consent resource for genomics research, which also includes sample informed consent language, including sample language for data and sample sharing through data repositories and biobanks.

The Global Alliance for Genomics and Health (GA4GH) is a not-for-profit organization that builds technical standards and policy frameworks centered around responsible sharing of genomic and health-related data. They, too, have published a consent policy outlining best practices for genomic data sharing, and a consent toolkit for genomics research containing sample consent clauses.

Central to this discussion, GA4GH has developed machine-readable consent guidance that maps to a technical standard called the Data Use Ontology (DUO), which provides a structured vocabulary of terms for data use restrictions and is based heavily on the NIH Standard Data Use Limitations. Although generated with the genomics research context in mind, these standard use restrictions are not data-type specific. Therefore, the consent guidance could be useful for all kinds of studies (not solely genomics studies), particularly those that generate data that will be shared via controlled access.
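As a rough illustration of what "machine-readable" can mean in practice, the Python sketch below records what a consent form permits as a small set of DUO terms and checks a proposed use against it. The term identifiers and labels are an illustrative subset of DUO and should be verified against the current DUO release. The class name, function names, purpose categories, and matching rule are hypothetical simplifications made up for this example; they are not the GA4GH guidance, an NIH procedure, or any repository's actual logic.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Illustrative subset of DUO terms (identifier -> label).
# Verify against the current DUO release before relying on them.
DUO_LABELS = {
    "DUO:0000042": "general research use",
    "DUO:0000006": "health or medical or biomedical research",
    "DUO:0000007": "disease specific research",
    "DUO:0000019": "publication required",
    "DUO:0000021": "ethics approval required",
}

GRU, HMB, DS = "DUO:0000042", "DUO:0000006", "DUO:0000007"


@dataclass(frozen=True)
class ConsentProfile:
    """Hypothetical record of what a consent form permits, as DUO terms."""
    permission: str                          # one primary data use permission
    modifiers: FrozenSet[str] = frozenset()  # additional conditions on use
    disease: Optional[str] = None            # used only for disease-specific consent


def describe(profile):
    """Return human-readable labels for the profile's terms."""
    terms = [profile.permission, *sorted(profile.modifiers)]
    return [DUO_LABELS[t] for t in terms]


def permits(profile, purpose, disease=None):
    """Simplified check (an assumption, not official DUO matching logic).

    purpose is one of "general", "health", or "disease".
    """
    if profile.permission == GRU:
        return True
    if profile.permission == HMB:
        return purpose in ("health", "disease")
    if profile.permission == DS:
        return purpose == "disease" and disease == profile.disease
    return False  # unknown permission: be conservative


# Example: consent limited to diabetes research, with ethics approval required.
diabetes_only = ConsentProfile(DS, frozenset({"DUO:0000021"}), disease="diabetes")
print(describe(diabetes_only))
print(permits(diabetes_only, "disease", "diabetes"))
print(permits(diabetes_only, "health"))
```

The point of the sketch is only that consent language written with standard terms in mind can be compared consistently against a data request; real controlled-access decisions involve human review and more conditions than are modeled here.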

Certifying Data Submission
Beyond providing guidance on informed consent for data sharing, IRBs (and/or equivalent entities) also play a role in assuring that submission of large-scale genomic data to NIH-designated data repositories is consistent with the protocol and informed consent of research participants. IRBs may expect this role to expand to other data types that are shared under similar controlled-access models as genomic data.

There is an evolving landscape of repositories available to researchers. For example, the NHGRI AnVIL is a cloud-based NIH-designated repository that can be used to meet the DMS and GDS Policy requirements, and it offers helpful documentation on how to use AnVIL components like DUOS and the Terra Data Repository to do so. For assistance in getting your organization started using the AnVIL for DMS, you can either reach out to the AnVIL support forum or stop by our booth at the PRIM&R annual meeting in December!

Ensuring Ethical Provenance
An emerging initiative in the GA4GH Regulatory and Ethics Workstream is ethical provenance, a term that encapsulates the idea that data governance must be rooted in the originating consent and other local requirements and consistently respected across users, locations, etc. With these recent developments driving greater data sharing, and given IRBs’ critical role in laying the foundation for ethical provenance throughout the research data lifecycle, it is more important than ever that IRBs ensure that consent forms speak clearly when it comes to data sharing permissions.


Elena Ghanaim serves as Policy Advisor for Data Science and Sharing with the National Human Genome Research Institute, NIH; Jonathan Lawson is the Principal Software Product Manager at Broad Institute of MIT and Harvard.