Best Practices for the Stewardship of Research Data

Across all stages of the research life cycle and all fields of study, researchers should consider how their choices will affect the eventual storage and preservation of research data. Below are some resources, organized by stage of the research life cycle, that can serve as an entry point into data stewardship practices that will help researchers save time, address funder requirements, and ultimately maximize the impact of their research.

Setting up a study with data stewardship in mind–such as clear protocols for the collection and storage of data generated–will have tremendous downstream benefits. In addition, most agencies now require data management plans (DMPs) or other information about data management and stewardship as part of the proposal submission process.

Below is a selection of resources that can help researchers get started as they think through effective data practices while developing their study.

Study or Proposal Element Resources
Common Data Elements (CDEs)
CDEs are structured, human- and machine-readable definitions of data elements for use in research and for other purposes. NIH has a Common Data Elements Repository to help researchers identify standardized terms or concepts used across studies, ranging from surveys to disease nomenclature.
Metadata
Protocols
  • Curating and sharing individual protocols ensures consistency in research data practices within individual research groups and also makes the protocols easier to share with the wider research community.
  • U-M has an institutional subscription to an electronic lab notebook provider that allows researchers to enjoy the benefits, efficiencies, and long-term cost savings of centralized, paperless protocols and workflows.
  • Protocols.io is another such platform for researchers to develop and share experimental protocols.
Data Management Plans

Most agencies now require data management plans (DMPs) or other information about data management and stewardship as part of the proposal submission process.

Proposal Budgeting
Along with new requirements for data management, funding agencies are increasingly allowing data sharing costs to be included as direct costs in proposal budgets. ORSP provides high-level budget and cost guidance for proposals as it relates to direct costs associated with a project. Absent any prohibition from the funding agency, and in accordance with the applicable terms and conditions of the underlying grant, costs associated with data curation, data formatting, data de-identification, preparation of metadata, and repository data deposition fees may be planned for and included in the proposal as direct costs.

 

Discipline-Specific Guidance Resources
Clinical Research

U-M researchers can get assistance with the design, conduct, and analysis of clinical trials, including data management and software development, through the Statistical Analysis of Biomedical and Educational Research Group (SABER).

Additional data collection guidance and an online course on the fundamentals of data management related to clinical research are available from the Michigan Institute for Clinical & Health Research (MICHR).

Qualitative Research
Qualitative research generates non-numerical data and often requires contextual information that poses additional data management challenges. The Data Curation Network provides a primer on data types in qualitative research to help researchers navigate data needs in these fields.
Computational Research
Computational research increasingly must grapple with making code and software available, in addition to research data. Several sources provide guides for researchers on how to navigate these challenges, including NIH, Software Carpentry, and the Software Sustainability Institute. If necessary, please consult U-M Innovation Partnerships with questions about licensing options, best practices, and IP when releasing code as open source.
Humanities
Data needs are increasing across the humanities with the rise of digitization. The Digital Humanities Curation Guide provides a compilation of resources to help digital humanities scholars with data curation challenges.
Diversity Scholarship
An open data toolkit for diversity scholars to guide best practices in collecting, managing, utilizing, sharing, and curating research data for the public good is available from the U-M Library.

Researchers have a number of factors to consider when managing research data, especially when dealing with potentially sensitive information or certain types of regulated data. U-M has a number of resources available to help researchers navigate these challenges, depending on which types of data are being generated.

Overall Safety/Security Guidance Resources
International Collaborations and Export Controls
Some research data may have restrictions on whether or how they can be shared with foreign countries, persons, or entities. U-M Export Controls can help researchers ensure compliance with all appropriate regulations and create technology control plans (TCPs), if necessary.
Research Data Security
Several types of research data require specific protections based on various university and legal requirements. U-M’s Research Information Security Oversight (RISO) Program works with PIs to determine which, if any, additional controls are required.
Safe Computing
To protect yourself and your research data from phishing attacks or other electronic vulnerabilities, U-M provides high-level safe computing resources, including a sensitive data guide.

 

Research Related to Human Subjects Resources
General Guidance
To help researchers maintain human subject data securely with the appropriate level of anonymity, confidentiality, or de-identification, refer to human subject data security guidance (including a checklist).
Compliance Reviews
Researchers can receive objective analysis and evaluation of research compliance, including data security and confidentiality for human subjects studies, from the Office of Research Compliance Review.
Data Transfer Agreements (Michigan Medicine)
When working with protected health data from Michigan Medicine, data transfer agreements associated with individual-level patient/participant data or biospecimens are reviewed by the Medical School Data Release Committee.
Diversity, Equity and Inclusion
An introduction to the intersections between DEI and research data use is available in a 2020 report on Principles for Advancing Equitable Data Practice.
Student Data
For educational research sponsored by the U.S. Department of Education, U-M Research Ethics and Compliance provides additional guidance around the Family Educational Rights and Privacy Act (FERPA) and other regulations.

Short-Term Research Data Management and Storage

Responsibly and strategically managing research data streams during a study can go a long way towards improving the long-term impact and replicability of your research. There are many resources available to U-M researchers to assist in various aspects of research data management and/or analysis across a number of disciplines or approaches. Some university-wide examples are provided below.

 

Need Resources
Consulting Services
Data Storage Services (General)
Data Storage Services (Large Needs)
For large amounts of data and/or large files, ITS Advanced Research Computing (ARC) offers a number of active research data storage services (e.g., OSiRIS, Locker, and Turbo).
High-Performance Computing
For researchers requiring high-performance computing, ARC provides a number of computational and data storage resources, including the U-M Research Computing Package. Many schools and colleges also offer services in partnership with ITS, including the College of Engineering, Medical School, and LSA.
Research Cores
For data management needs specific to your discipline, there are a number of other services available. Many of the ~100 research cores across U-M, for example, offer data services related to the equipment and/or analyses they provide.

 

Data Sharing and Long-Term Preservation

Research data needs change as researchers transition from actively managing a project and/or analyzing data to completing it and/or publishing it. Best practices include archiving or preservation to ensure public access, documentation of metadata to improve discoverability, and increasingly, annotation and deposition of code to ensure reproducibility. The following are examples of resources available to help researchers ensure their data is accessible over the long term.

 

Need Resources
General Guidance
General guidance for sharing and preserving data, including how to select a repository, is available as a research guide from the U-M Library. Subject-specific guides are also available for health sciences, engineering, and qualitative sciences.
Long-Term Data Storage
For larger data sets, Advanced Research Computing (ARC)’s Data Den Research Archive can be combined with other services (e.g., Globus, for which U-M has an institutional subscription) to enable long-term archiving of data that isn’t actively being accessed.
Repositories (Digital Research Data)
  • Hundreds of data repositories are available to researchers, depending on their field of study and needs. Re3data and the Open Access Directory maintain lists of repositories by country or field of research. NIH also maintains a list of NIH-supported domain-specific repositories.
  • U-M hosts several research data repositories for use by U-M researchers (and in many cases the broader research community) including:
Repositories (Physical Specimens)
  • In many disciplines, preserving research data can also include the permanent archiving of physical specimens. U-M has a number of world-class facilities and museums that assist researchers with accessing collections or depositing specimens into them.
  • The Research Museums Center has staff that can assist with preservation of samples across anthropological archaeology, botany (and associated disciplines), paleontology, and zoology.
  • The U-M Central Biorepository provides storage options for biospecimens and data associated with or derived from them.
Software and Code Sharing
To make computational code and/or software used to generate or analyze research data publicly available, code should be placed in a well-known, widely used repository such as GitHub, SourceForge, Bitbucket, or similar. These repositories should be actively maintained with updates, basic use instructions, appropriate licensing terms, and an associated copyright notice. U-M Innovation Partnerships should be consulted with respect to best practices, options, approaches, and guidance when releasing code as open source.

 

Type of Protection Resources
Copyrights
For information about copyright basics and Creative Commons licenses, researchers can review copyright guides or contact the U-M Library’s Copyright Services team directly.
Data Use Agreements (DUAs)
Publication Repositories
  • Researchers can deposit publications (before or after final publication) into publicly available repositories to satisfy funder public access requirements for publications, or to simply make scholarly work more widely accessible.
  • U-M provides an institutional repository called Deep Blue Documents to deposit articles, chapters, dissertations, conference presentations, media, and other work produced by the U-M community. 
  • Researchers may also choose one of many disciplinary repositories available at the Open Access Directory.
Intellectual Property
Intellectual property, technology licensing, material transfer agreements, and often data use agreements with corporate sponsors are handled by U-M’s Innovation Partnerships.
Publisher Data Policies

Publisher requirements for public access to research data have been evolving rapidly over the past several years. For example, many journals have chosen to adopt some or all of the Transparency and Openness Promotion (TOP) Guidelines, which include modular standards for data citation and data availability.

Below are links to some large publisher policies and other resources; however, given the fast-changing landscape, we strongly recommend you confirm your individual journal’s policy before submitting your publication, even if you have published in that journal recently.

SpringerNature; Wiley; PLOS; Elsevier; Taylor & Francis; SAGE; ICMJE

Open Access Publishing

Many authors choose to publish journal articles or books that are available to any reader at no cost (i.e., “open access” publications). In these cases, the publishing costs are often the responsibility of the authors themselves.

The U-M Library has negotiated deals with many scholarly publishers to provide discounts for authors on article processing charges and also offers up to $15,000 for open access monographs in the humanities.

Questions?

We are interested in hearing from you! If you have any questions or relevant information or resources to add to this site, please complete this form.