Research Data Stewardship Resources
RDSI Informational Webinar
Why research data stewardship is important
Sharing research data matters for many reasons: it benefits not only research and researchers but also the public.
Benefits of Research Data Stewardship and Sharing
- Making data publicly available can improve reproducibility and replicability (National Academies of Sciences, Engineering, and Medicine)
- Data that are Findable, Accessible, Interoperable, and Reusable (FAIR) help both producers and users of research data (GO FAIR Initiative)
- Responsible stewardship of research data can help address many diversity, equity, and inclusion (DEI) issues that arise during research projects (Principles for Advancing Equitable Data Practice)
- Accessible data inspires more public trust in research (Pew Research Center)
- Papers that link to accessible research data get cited more (PLOS One)
- Ten Reasons to Share Your Data (Nature Index)
- Ten Simple Rules for Improving Research Data Discovery (PLOS Computational Biology)
- Why Share Your Data? (U.S. Geological Survey)
- Sharing Health Data: The Why, the Will, and the Way Forward (National Academy of Medicine)
- Public Access to Research Data (Association of Public and Land Grant Universities)
Funding Agency Research Data Policies
U.S. funding agency requirements for public access to research data have evolved rapidly over the past several years. Below are links to a number of individual policies; however, given the fast-changing policy landscape, we strongly recommend that you confirm your funder’s policy before submitting any proposal. For agencies not on this list, see the curated list of data sharing requirements by federal agency provided by the Scholarly Publishing and Academic Resources Coalition (SPARC).
Federal Funding Agency Research Data Policies and Related Information
|Agency|Policy|Effective date|Further information|
|---|---|---|---|
|NIH|Policy for Data Management and Sharing|January 25, 2023||
|NSF|Data Sharing Policy|June 1, 2020|Dissemination and Sharing of Research Results from NSF|
|DOD|Plan to Establish Public Access to the Results of Federally Funded Research|February 2015||
|Dept of Energy|Policy for Digital Research Data Management|October 1, 2015|DOE Public Access Plan|
|Dept of Education|IES Policy Regarding Public Access to Research|October 21, 2016|Implementation Guide for Public Access to Research Data|
|NEH|Data Management Plans for NEH Office of Digital Humanities Proposals and Awards|June 2018||
|NASA|Scientific Information policy for the Science Mission Directorate [DRAFT]|November 2021|Open-Source Science Initiative|
There are a number of specialized terms that apply to the research data lifecycle. Below are some of the most important terms with common definitions.
Data
Even the term “data” itself carries a wide range of meanings depending on discipline and research method. According to NIH.gov, research data include the recorded factual material commonly accepted in research communities as necessary to validate and replicate findings, regardless of whether the data are used to support scholarly publications. This definition excludes preliminary analyses, completed case report forms, drafts of publications, plans for future research, peer reviews, and communications with colleagues. In some disciplines, however, research data can more broadly include physical objects and other information, such as specimens, archival materials, collections, and notebooks, that help support the provenance of the data.
Data Management Plan (DMP)
A document describing the actions to be taken over the lifecycle of a research dataset to ensure that it is well managed and will eventually be findable, accessible, interoperable, and reusable by others. DMPs are required in proposals to federal funding agencies.
Digital Persistent Identifier (DPI)
Digital persistent identifiers enable consistent citation and reuse of scholarly works, datasets, and funding sources, most commonly through the digital object identifier (DOI) system. Repositories assign DOIs when datasets are deposited, and a dataset can then be cited by its DOI in scholarly works like any other reference. Definition adapted from DataCite.org.
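As an illustration of citing a dataset by DOI, the sketch below turns a DOI into its `https://doi.org/` resolver link and builds a simple citation string. The function names, the citation layout, and the example DOI `10.1234/example.5678` are hypothetical and are not drawn from any funder or repository guidance; check your repository's recommended citation format before using anything like this.

```python
# Prefixes people commonly paste in front of a DOI.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver link for a DOI string."""
    cleaned = doi.strip()
    lowered = cleaned.lower()
    # Strip a leading prefix so bare DOIs, "doi:..." and full links are treated alike.
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    # A loose sanity check only: DOIs start with "10." and contain a "/".
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


def cite_dataset(authors: str, year: int, title: str, repository: str, doi: str) -> str:
    """Build an illustrative dataset citation (layout is an assumption, not a standard)."""
    return f"{authors} ({year}). {title} [Data set]. {repository}. {doi_to_url(doi)}"


# Hypothetical example values.
example_citation = cite_dataset(
    authors="Doe, J.",
    year=2023,
    title="Example Lake Survey",
    repository="Example Repository",
    doi="doi:10.1234/example.5678",
)
```

Storing the bare DOI and generating the link on demand keeps the identifier stable even if the preferred resolver address changes.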
FAIR Data
An increasingly common term describing efforts to make research data more Findable, Accessible, Interoperable, and Reusable (FAIR). Definition adapted from Go-Fair.org.
Metadata
Information about a research dataset that is structured (often in a machine-readable format) for purposes of search and retrieval. Metadata elements may include basic information (e.g., title, author, date created) and/or specific elements inherent to datasets (e.g., spatial coverage, time periods, provenance). Definition adapted from DataCurationNetwork.org.
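To make "structured, machine-readable" concrete, here is a minimal sketch of a metadata record that uses the elements named above and serializes to JSON. The class name, field names, and `search` helper are hypothetical and do not follow any particular metadata standard; a real deposit would use the schema your repository requires.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DatasetMetadata:
    """Illustrative metadata record; field names are assumptions, not a standard."""
    title: str
    authors: list[str]
    date_created: str           # ISO 8601 date, e.g. "2023-01-25"
    spatial_coverage: str = ""  # dataset-specific elements are optional here
    time_period: str = ""
    provenance: str = ""


def to_json(record: DatasetMetadata) -> str:
    """Serialize a record to JSON so that software can index and retrieve it."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def search(records: list[DatasetMetadata], keyword: str) -> list[DatasetMetadata]:
    """Return the records whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [r for r in records if needle in r.title.lower()]


# Hypothetical example record.
example_record = DatasetMetadata(
    title="Example Lake Temperature Survey",
    authors=["J. Doe"],
    date_created="2023-01-25",
    spatial_coverage="Example County",
    time_period="2020/2022",
    provenance="Collected with handheld sensors; cleaned by the project team.",
)
```

Because every record has the same named fields, a simple function like `search` can find datasets by title, which is the kind of retrieval that free-text descriptions make difficult.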
Open Access
A general term typically used to describe free access to datasets or publications with no restrictions on accessibility, (re)use, or redistribution (distinct from public access, below). Definition adapted from UNESCO.org.
Digital Preservation
The series of managed activities necessary to ensure continued access to, and readability of, research datasets for as long as necessary, with adequate security and risk mitigation. Definition adapted from DPCOnline.org.
Public Access
Data are readily discoverable and accessible to other researchers and to people outside the research project in which the data were generated. Public access is distinct from open access, defined above, in that something may be publicly available yet still carry some restrictions on accessibility, reuse, or redistribution. Information on federal agencies’ public access policies can be found in the proposal development section of the guidance and policies page.
Data Repository
A specialized database that preserves, stewards, and provides access to many types of digital datasets in a variety of formats. Data repositories may focus on a specific field (such as ICPSR for the social sciences), serve an institution (such as Deep Blue Data for U-M), or serve a general audience (such as Dryad). Definition adapted from CASRAI.org.
Relevant U-M Policies/Guidelines
Several current university policies address aspects of research data use and management. U-M is assessing what clarifications its internal policies on research data need; in the meantime, the policies listed below continue to apply.
Relevant U-M Policies
Communities of Practice
Several U-M groups give like-minded individuals a place to share information, experiences, and ideas related to research data management and sharing.
U-M Communities of Practice
- The Coderspaces community provides a forum and office hours to assist faculty, staff, and students with research methodology, statistics, data science applications, and computational programming for research.
- The Data Analysis Networking Group (DANG!) is a forum for U-M post-docs, grad students, and other researchers to discuss how to analyze, present, and visualize their data.
- The U-M Software / Data Carpentries is building a community of excellence at U-M around reproducible data analysis.
- The MIDAS Reproducibility Hub promotes reproducible data science by raising awareness, celebrating best practices, enabling the scholarly investigation of reproducible research, and developing tools that can be widely adopted.