Research Data Management and Sharing

Data Management Planning

Creating a Data Management Plan (DMP) is the most important step in the research data lifecycle because it sets the rules that will ensure that individual researchers, colleagues, and external researchers can understand and reproduce the work effectively. Furthermore, many funding agencies now require a DMP that describes how data will be organized and shared.

The DMPtool is a web-based platform available to Vanderbilt faculty, students, and staff. It provides customizable templates aligned with specific funder guidelines, offers expert guidance on best practices, and allows collaborative plan development. The DMPtool simplifies the complex process of data management planning by walking researchers step by step through the sections required by funding agencies.

Data Hygiene

Data hygiene means that data management should be part of the daily workflow of every researcher and research group. Setting aside a few minutes to properly name, organize, and describe files can save days or weeks of work in the future. For an individual, the basic practices described in this guide should be sufficient to greatly improve the Findability, Accessibility, Interoperability, and Reusability of their data. For a research group, these practices should be tuned to fit within existing practices and should be coordinated and tracked by one or more members of the group for consistency. We recommend four basic data organization practices and two data management practices for this purpose (Borycz, 2021).

Data Organization

  1. Project-based folder hierarchy - Each project should have its own folder. Files should NOT be organized based on file type. This project folder should contain subfolders that delineate important aspects of the project and make the important files easy to find (e.g., code, images, figures, data, notes).
  2. Tag-based file naming - A tag is a short series of characters that is used to indicate the project, purpose, or version associated with a file or folder. Project files should be named so that it is clear where they belong, what they are for, and which one is the most recent. These tags can be separated in one of two common ways:
    1. Snake case - All lower case letters with tags separated by underscores (e.g., project_purpose_20251028.csv)
    2. Camel case - Capital letters separate tags (e.g., projectPurposeVersion006.jpg)
  3. README files - Each project should contain a README file that provides context about the information contained in the project folder. It should be possible to navigate a project folder and understand its purpose by using the README file alone. The README should include:
    1. Project description
    2. Researcher names and contact information
    3. Funding information
    4. Important dates (e.g., project initiation, last edited)
    5. Folder and file tag names and descriptions
    6. Variable/column names and descriptions
    7. How to compile/run code
    8. Publication information
  4. Tidy data - Tabular data should follow the 3 principles of tidy data below. These principles ensure that data is easy for both people and computers to understand and manipulate. This means that a data file should NOT contain plots, formulas, file descriptors, or metadata.
    1. Each variable forms a column
    2. Each observation forms a row
    3. Each cell contains a single value
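
As an illustration of the tidy data principles above, the following is a minimal Python sketch that reshapes a "wide" table (one column per measurement date) into a tidy one (one row per observation). The sample values, the column names (sample_id, date, concentration), and the helper name wide_to_tidy are all hypothetical and chosen only for this example; real projects may prefer a dedicated data library for the same task.

```python
import csv
import io

# Hypothetical "wide" table: the dates are stored in the column headers,
# so the variable "date" does not have a column of its own.
WIDE_CSV = """sample_id,2025-10-01,2025-10-02
A,1.2,1.4
B,0.9,1.1
"""


def wide_to_tidy(wide_text, id_column, variable_name, value_name):
    """Reshape wide CSV text so that each row holds a single observation."""
    reader = csv.DictReader(io.StringIO(wide_text))
    tidy_rows = []
    for row in reader:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every tidy row
            tidy_rows.append(
                {
                    id_column: row[id_column],
                    variable_name: column,  # former header becomes a value
                    value_name: value,      # one value per cell
                }
            )
    return tidy_rows


tidy = wide_to_tidy(WIDE_CSV, "sample_id", "date", "concentration")
for tidy_row in tidy:
    print(tidy_row)
```

In the reshaped table each variable (sample_id, date, concentration) has its own column and each measurement has its own row, which makes the file easier to filter, join, and plot.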

Data Management

  1. Data management roles - Members of the group should be assigned roles to ensure that data management practices continue consistently in the long term. These roles should include:
    1. Data/Access Manager - Leader of the data management process who assigns roles to others and ensures that data is clean, organized, and shared in a timely manner.
    2. Project Manager - Lead on an individual project who ensures that the project folder hierarchy, file naming conventions, and README file are clean and up to date. The Project Managers should meet with the Data/Access Manager regularly to ensure compliance with data management practices.
    3. Project Members - Contributors to projects who report to the Project Manager. Regular meetings should take place between the Project Members and Project Managers, and access to all files should be shared between them. Issues should be discussed and addressed at these meetings.
    4. File Reviewer - External member of the group who regularly checks the file hierarchy, file names, and README of a particular project and reports issues to the Project Manager. The File Reviewer need not have expertise on the project.
  2. Central file storage location - Clean project folders should be shared in a central location (e.g., Google Drive, OneDrive, central server, ACCRE) so that the data are backed up and available for the Data/Access Manager to monitor.
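
Part of the regular check described for the File Reviewer role could be automated. The following is a minimal Python sketch, not a prescribed tool: it assumes the snake case naming convention, an illustrative set of expected subfolders (code, data, figures, notes), and a hypothetical helper named review_project. A group would adjust the pattern and folder list to match its own conventions.

```python
import re
import tempfile
from pathlib import Path

# Assumption: snake case tags, e.g. project_purpose_20251028.csv
SNAKE_CASE_NAME = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*\.[a-z0-9]+$")
# Assumption: subfolders this hypothetical group expects in every project
EXPECTED_SUBFOLDERS = ("code", "data", "figures", "notes")


def is_readme(path):
    """Return True for files whose name starts with 'readme' (any case)."""
    return path.is_file() and path.name.lower().startswith("readme")


def review_project(project_dir):
    """Return a list of human-readable issues found in one project folder."""
    project_dir = Path(project_dir)
    issues = []
    # A README should sit at the top level of the project folder.
    if not any(is_readme(p) for p in project_dir.iterdir()):
        issues.append("missing README file")
    # Each expected subfolder should exist.
    for name in EXPECTED_SUBFOLDERS:
        if not (project_dir / name).is_dir():
            issues.append(f"missing subfolder: {name}")
    # Every other file should follow the naming convention.
    for path in sorted(project_dir.rglob("*")):
        if path.is_file() and not is_readme(path):
            if not SNAKE_CASE_NAME.match(path.name):
                relative = path.relative_to(project_dir).as_posix()
                issues.append(f"file name breaks convention: {relative}")
    return issues


# Demonstration on a throwaway project folder with two deliberate problems.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "example_project"
    for name in ("code", "data", "figures"):
        (project / name).mkdir(parents=True)
    (project / "README.txt").write_text("Project description goes here.\n")
    (project / "data" / "project_raw_20251028.csv").write_text("id,value\n")
    (project / "data" / "Final Data (2).csv").write_text("id,value\n")
    demo_issues = review_project(project)
    for issue in demo_issues:
        print(issue)
```

A script like this only flags mechanical problems such as a missing README or an unconventional file name. Whether the README actually explains the project still requires a person to read it.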