Webinar Follow-Up: Big Data: Practical Solutions to Emerging Challenges for IRBs

by Alexandra Ruth, research assistant at the International Food Policy Research Institute; member of PRIM&R’s Blog Squad

I was quite excited to find out about the Big Data: Practical Solutions to Emerging Challenges for Institutional Review Boards (IRBs) webinar offered by PRIM&R. At my institution, we are in the process of revising the section of our IRB application that deals with data security, and much of the data we work with certainly fits at least one of the 12 definitions of big data noted in this article that was provided as a background resource by the presenters. Our application revision process has raised many questions: What is the exact scope of the IRB’s responsibility for ensuring researchers’ responsible use of big data? How do we anticipate potential risks, and how do we convey these risks to researchers who may not initially see the potential harms that could arise from lack of attentiveness to ethical use of data?

Both of the first two case studies in the webinar about data use in international research were extremely relevant to the work that my institution does. Case studies such as these are crucial for getting people to think about real-life complex data use scenarios, and I was able to stash these away as potential discussion topics for our own staff. I appreciated Betsy Draper’s point that the IRB can use its own discretion when deciding whether or not these sorts of cases require IRB review. For example, I learned that the IRB can require expedited review of a project (even though it may qualify for exemption) so that the IRB can monitor data use in the project for its full life cycle. Betsy’s argument that “patterns can lead to identifiability” is exactly the sort of point that resonates with researchers where I work, and it is important to keep in mind as pools of data grow larger and larger.

In Sean Owen’s portion of the presentation, I particularly appreciated his real-life case study that involved creatively modifying a dataset and having a dialogue with researchers about how the data could be de-identified in such a way that the researchers requesting the dataset could still do the analyses they wanted to do. Researchers tend to favor having as much information as possible, but data volume does not necessarily equal data quality, so it is helpful to think about innovative ways to share datasets that contain the minimum amount of information necessary for analyses.
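The dataset and the specific modifications from Sean's case study are not described here, so the following is only a minimal sketch of the general idea, written in Python. Every record, field name, and helper function in it is hypothetical and invented for illustration. It drops fields the analysis does not need, coarsens the remaining ones (exact age into a ten-year band, village into a broader region), and then counts how many records share each combination of the coarsened values.

```python
from collections import Counter

# Hypothetical survey records; all names and values are invented for illustration.
records = [
    {"name": "A", "village": "North-1", "age": 34, "crop_yield": 2.1},
    {"name": "B", "village": "North-2", "age": 37, "crop_yield": 1.8},
    {"name": "C", "village": "South-1", "age": 41, "crop_yield": 2.4},
    {"name": "D", "village": "South-2", "age": 45, "crop_yield": 2.0},
    {"name": "E", "village": "South-1", "age": 48, "crop_yield": 2.6},
]


def age_band(age, width=10):
    """Replace an exact age with a coarser band, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def minimize(record):
    """Keep only what the (assumed) analysis needs, in coarsened form."""
    return {
        "age_band": age_band(record["age"]),
        # Assumes village codes look like '<region>-<number>'.
        "region": record["village"].split("-")[0],
        "crop_yield": record["crop_yield"],
    }


def smallest_group(rows, quasi_identifiers):
    """Size of the rarest combination of the given fields across all rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())


shared = [minimize(r) for r in records]
k = smallest_group(shared, ("age_band", "region"))
print(f"Smallest group sharing an age band and region: {k}")
```

A small group size means some combination of values describes very few people, which is one concrete way that patterns can lead to identifiability. A check like this is only a starting point for the dialogue with researchers and does not replace review by the IRB or by data security staff.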

Both presenters emphasized the importance of forging relationships among different professionals within an institution, and the different professional backgrounds of the two presenters helped to drive this point home [editor’s note: Betsy is the assistant director of IRB administration at Harvard University’s Committee on the Use of Human Subjects and Sean is the director of the client cyber security center at Abt Associates Inc.]. Particularly helpful was the information on the “who’s who” of ensuring ethical big data use; IT professionals, legal counsel, and research administrators can all work with the IRB to ensure secure and ethical use of big data. Additionally, both presenters did an excellent job of sorting out the alphabet soup that surrounds data use – DUAs, TOS, FERPA, HIPAA, etc. – and referring participants to “Seven Steps for IRB Review of Big Data” provided a succinct, actionable way for IRBs to pare down the review process into manageable steps.

My main takeaways:

  1. The IRB is not an island! It is crucial to engage other institutional networks for monitoring and reviewing big data use.
  2. The IRB can use its own discretion in what level of review is required for ambiguous cases involving big data. Requiring expedited review for a project that may seem exempt at first glance is one way that the IRB can ensure oversight of data use for the whole life cycle of the project.
  3. Collecting, refining, and discussing case studies is a key method for institutions to develop better resources for staff training and awareness about complex data use scenarios.
  4. As a general rule, share the minimum amount of information needed for analyses when exchanging data between institutions, and consider creative ways of de-identifying data that will still address the research question. This is one way to minimize risk.

PRIM&R thanks Alexandra for recapping the webinar for those unable to attend. If you’d like to hear more of the actual webinar, you may purchase the recording here.

Visit our website to learn more about becoming a member of PRIM&R’s Blog Squad. Members of the Blog Squad receive complimentary registration to the event they will be blogging about.