Research Data Management and Sharing

Overview

Many U.S. federal funding agencies now require researchers to develop Data Management and Sharing plans, store data in trusted repositories, and make data available in accordance with agency policies. The specifics vary by agency: scope of required data, timing of sharing, acceptable repositories, and exceptions all differ.

For detailed, up-to-date information on each agency’s requirements—including tables of which agencies require data sharing, examples of acceptable repositories, and direct links to agency policies—please consult the SPARC Data Sharing by U.S. Federal Agency guide.

NIH Data Management and Sharing Plans

Overview

All National Institutes of Health (NIH) research grant proposals, both new and competing renewals, must include a Data Management and Sharing plan, account for the associated costs in the budget, and report progress to NIH in annual reports.

Policy

Who: The policy applies only to research grants. It does not apply to training grants, fellowships, infrastructure grants, instrument grants, or non-competitive renewals.

What: The policy dictates how data generated using support from these grants must be managed and shared. Scientific Data is defined as “recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.” 

Where: When selecting data repositories, prioritize 1) established discipline-specific or data-type-specific repositories, which make your data easy for people in your field to find, and 2) repositories with desirable characteristics.

When: Data must be shared at the time of first publication or at the end of the project period, whichever comes first. Unpublished data (that meet the definition above) must be deposited and reported at the end of the project period, even if they will later appear in a published paper. Data must be stored and made available for the full duration of the grant (including possible future renewals) plus 3 years.
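
The timing rule above can be sketched as a small date calculation. This is an illustration only, not an NIH tool: the function names and the example dates are hypothetical, and it assumes the "plus 3 years" period is counted from the final end date of the grant, including any renewals. Confirm the actual dates for your award against the NIH policy.

```python
from datetime import date
from typing import Optional


def sharing_deadline(project_end: date,
                     first_publication: Optional[date] = None) -> date:
    """Return the earlier of first publication and end of project period.

    If nothing has been published, the end of the project period applies.
    """
    if first_publication is None:
        return project_end
    return min(first_publication, project_end)


def retention_end(final_grant_end: date, years: int = 3) -> date:
    """Return the date until which data should remain available.

    Assumption: the retention period is counted from the final grant end
    date (after any renewals).
    """
    target_year = final_grant_end.year + years
    try:
        return final_grant_end.replace(year=target_year)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return final_grant_end.replace(year=target_year, day=28)


# Hypothetical example: paper published before the project ends.
print(sharing_deadline(date(2027, 6, 30), date(2026, 3, 15)))
print(retention_end(date(2027, 6, 30)))
```

In this example the publication date governs the sharing deadline because it precedes the project end date.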

How: All new and competing grant proposals must include a plan based on the NIH DMS form template.

The plan must cover six elements:

  1. Data Type
  2. Related tools, software, and/or code
  3. Standards
  4. Data Preservation, access, and associated timelines
  5. Access, distribution, and reuse considerations
  6. Oversight of Data Management & Sharing

PIs are expected to maximize sharing, which typically means depositing data and associated metadata in publicly accessible repositories. In all cases, deposited data must be associated with a “persistent unique identifier,” usually a DOI. The goal is for data to be deposited in a way that complies with the FAIR principles (findable, accessible, interoperable, and reusable).

Best Practices

Data

  • Include
    • Relevant Scientific Data. The researcher decides what constitutes data, how to maximize sharing, and how to justify what is and isn’t shared.
    • Data that led to null findings
    • Data sets of all sizes
    • Data generated with SBIR support (a 20-year delay is allowable)
    • Data for which there is no known repository
    • Qualitative Data, unless there are justifiable limitations to sharing (for example, field reports and ethnographic writings that contextualize and interpret rich participant-observation data)
    • Data that requires a Data Use Agreement for sharing (in other words, data still has to be shared, but with appropriate restrictions on public access)
  • Exclude
    • Non-research grants: training, fellowship, conferences, and infrastructure
    • Data that are not necessary for, or not of sufficient quality for, validating and replicating research findings
    • Laboratory notebooks
    • Preliminary analyses
    • Completed case report forms
    • Drafts of scientific papers
    • Plans for future research
    • Peer reviews
    • Communications with colleagues
    • Physical objects (e.g., laboratory specimens)
  • Some justifiable ethical, legal, and technical reasons for limiting and/or delaying sharing
    • Data are already public
    • Informed consent does not permit sharing, or limits the scope of sharing or use
    • Privacy or safety of research participants would be compromised and available protections are insufficient (see the Supplemental Information on Protecting Privacy When Sharing Human Research Participant Data)
    • Explicit federal, state, local, or Tribal law, regulation, or policy prohibits disclosure
    • Restrictions imposed by existing or anticipated agreements with other parties
    • Datasets cannot practically be digitized with reasonable efforts

Budget

  • Allowable Costs
  • If the repository charges a fee for storing your data, it will typically have a one-time data publishing cost that can be paid during the NIH project period and allow for sharing beyond the project period.
  • Jessica Logan, Vanderbilt Associate Professor of Special Education, has published on data sharing and recommends budgeting 5% to 10% of a research grant for preparing the collected data to be shared.
  • The Realities of Academic Data Sharing (RADS) Initiative Report on the Expenses of Making Data Publicly Accessible finds:
    • the average percent of overall grant award that was used by researchers for Data Management and Sharing is 6%
    • the average cost directly incurred by researchers per funded research project for Data Management and Sharing is $29,800. By funding agency, the average is
      • $36,000—US National Institutes of Health
      • $19,000—US National Science Foundation
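
As a rough worked example of the 5% to 10% guideline above, the following sketch computes a budgeting range for a given award. The function name and the $500,000 award are hypothetical, and the percentages are the recommendation quoted above rather than an NIH requirement.

```python
def dms_budget_range(total_award: float,
                     low_pct: int = 5,
                     high_pct: int = 10) -> tuple:
    """Return (low, high) dollar estimates for data preparation and sharing.

    Percentages are whole numbers (5 means 5%), following the 5%-10%
    guideline cited in this section.
    """
    if total_award < 0:
        raise ValueError("total_award must be non-negative")
    if not 0 <= low_pct <= high_pct <= 100:
        raise ValueError("percentages must satisfy 0 <= low <= high <= 100")
    return (total_award * low_pct / 100, total_award * high_pct / 100)


# Hypothetical award of $500,000.
low, high = dms_budget_range(500_000)
print(f"Budget between ${low:,.0f} and ${high:,.0f} for data sharing")
```

For a hypothetical $500,000 award, this gives a range of $25,000 to $50,000, which can be compared with the per-project averages reported above.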

Sample Language

  • VUMC provides a guide for sharing data that are generated from BioVU, including sample language for each section of the Data Management and Sharing Plan Template. If you have any questions, please email biovu@vumc.org.
  • VUMC also includes guidance in StarBRITE. Under the Data Management menu, select “NIH Data Management and Sharing (DMS) Policy”. The right-side menu has an NIH DMS Policy-FAQ, which will have the most up-to-date recommendations regarding sharing Synthetic Derivative data.
  • Element 6: [Name of Grant PI or senior member of lab (give title)] will be responsible for verifying management, storage, retention, and dissemination of project data.  [NAME] will review data management and sharing activity annually and compare it to this plan. If discrepancies are noted, [NAME] will adjust study procedures or submit a revised Data Management and Sharing Plan to NIH. 
    • If you are using human subjects data, then some oversight is provided by the Human Research Protections Program. The Human Research Protections Program and IRB website will walk you through this process. You may include an additional statement such as "Data management is a part of the IRB approved protocol. Under the Human Research Protections Program, the study is subject to post-approval monitoring and deviations from the plan would be reportable to the IRB."
    • Note that in cases where there are multiple investigators included in the project (and possibly a subcontract), this statement will have to be altered and extended to explain who will be responsible for managing the data generated by each participating lab and who will be responsible for checking on compliance.
  • NIH sample plans

Additional Resources