Research Data Stewardship Resources

RDSI Informational Webinar

Why research data stewardship is important

Sharing research data matters for many reasons: it can benefit not only research (and researchers) but the public as well.

Benefits of Research Data Stewardship and Sharing
Further Reading

Funding Agency Research Data Policies

U.S. funding agency requirements for public access to research data have been evolving rapidly over the past several years. Below are links to a number of individual policies; however, given the fast-changing landscape around agency policies, we strongly recommend you confirm your funder’s policy before submitting any proposal. For agencies not in this list, see a curated list of data sharing requirements by federal agency provided by the Scholarly Publishing and Academic Resources Coalition (SPARC).

Definitions

There are a number of specialized terms that apply to the research data lifecycle. Below are some of the most important terms with common definitions.

Data

Even the term “data” itself carries a wide range of meanings depending on discipline and research method. According to NIH.gov, research data includes the recorded factual material commonly accepted in research communities as necessary to validate and replicate findings, regardless of whether the data are used to support scholarly publications. This definition would exclude preliminary analyses, completed case report forms, drafts of publications, plans for future research, peer reviews, or communications with colleagues. However, in some disciplines, research data can more broadly include physical objects and other information, such as specimens, archival materials, collections, and notebooks, that help support the provenance of the data.

Data Management Plan (DMP)

A document describing the actions to be taken over the life cycle of a research dataset to ensure that it is well managed and will eventually be findable, accessible, interoperable, and reusable by others. DMPs are required in proposals to federal funding agencies.

Digital Persistent Identifier (DPI)

Identifiers that enable consistent citation and reuse of scholarly works, datasets, and funding sources, most commonly through the digital object identifier (DOI) system. Repositories assign DOIs when datasets are deposited, and datasets can then be cited in scholarly works by DOI, much like any other reference. Definition adapted from DataCite.org.
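As a rough illustration of citing a dataset by DOI, the sketch below turns a DOI into its resolver link (https://doi.org/ followed by the DOI) and assembles a simple citation string. The function names, the citation layout, and the example DOI are assumptions made for illustration; follow the citation style your publisher or repository recommends.

```python
def doi_url(doi: str) -> str:
    """Return the https://doi.org/ resolver link for a DOI.

    Accepts a bare DOI or one already prefixed with "doi:" or a doi.org URL.
    """
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return "https://doi.org/" + doi


def cite_dataset(creator: str, year: int, title: str, repository: str, doi: str) -> str:
    """Build a simple dataset citation (hypothetical layout, not an official style)."""
    return f"{creator} ({year}). {title} [Data set]. {repository}. {doi_url(doi)}"


if __name__ == "__main__":
    # 10.1234/example is a placeholder DOI, not a real dataset.
    print(cite_dataset("Researcher, A.", 2023, "Example Survey",
                       "Example Repository", "doi:10.1234/example"))
```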

FAIR Data

An increasingly common term describing efforts to make research data more Findable, Accessible, Interoperable, and Reusable (FAIR). Definition adapted from Go-Fair.org.

Metadata

Information about a research dataset that is structured (often in a machine-readable format) for purposes of search and retrieval. Metadata elements may include basic information (e.g., title, author, date created) and/or specific elements inherent to datasets (e.g., spatial coverage, time periods, provenance). Definition adapted from DataCurationNetwork.org.
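To make "structured, machine-readable" concrete, here is a minimal sketch of a metadata record using the elements named above, serialized as JSON and searched with a simple keyword match. The field names and values are hypothetical and do not follow any particular metadata schema; a repository will typically specify its own required fields.

```python
import json

# Hypothetical metadata record; field names are illustrative, not a formal schema.
record = {
    "title": "Example Lake Temperature Survey",
    "author": "A. Researcher",
    "date_created": "2023-05-01",
    "spatial_coverage": "Example County, Michigan",
    "time_period": {"start": "2021-06-01", "end": "2022-08-31"},
    "provenance": "Collected with handheld sensors; cleaned by the project team",
}


def to_machine_readable(rec: dict) -> str:
    """Serialize the record as JSON so software can index and exchange it."""
    return json.dumps(rec, indent=2, sort_keys=True)


def matches(rec: dict, term: str) -> bool:
    """Case-insensitive keyword match against every value in the record."""
    term = term.lower()
    return any(term in str(value).lower() for value in rec.values())


if __name__ == "__main__":
    print(to_machine_readable(record))
    print(matches(record, "lake"))
```

Because the record is structured, the same fields can support both search (the keyword match) and retrieval (reading the JSON back into a program).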

Open Access

A general term typically used to describe free access to datasets or publications with no restrictions on accessibility, (re)use, and redistribution (distinct from public access, below). Definition adapted from UNESCO.org.

Preservation

The series of managed activities necessary to ensure continued access to, and readability of, research datasets for as long as necessary, with adequate security and risk mitigation. Definition adapted from DPCOnline.org.

Public Access

Data are readily discoverable and accessible to other researchers and people outside of the research project in which the data were generated. Public access is distinct from open access, defined above, in that something may be publicly available but may also have some restrictions on accessibility, reuse, or redistribution. Information on federal agencies’ public access policies can be found in the proposal development section of the guidance and policies page.

Repository

A specialized database that preserves, stewards, and provides access to many types of digital datasets in a variety of formats. Data repositories may focus on a specific field (such as ICPSR for the social sciences), an institution (such as Deep Blue Data for U-M), or serve a general audience (such as Dryad). Definition adapted from CASRAI.org.

Relevant U-M Policies/Guidelines

Several current university policies address aspects of research data use and management. U-M is currently assessing what clarifications are needed for its internal policies related to research data; in the meantime, the policies below continue to apply.

Communities of Practice

Several U-M groups offer a place where like-minded individuals can share information, experiences, and ideas related to research data management and sharing.

U-M Communities of Practice
  • The Coderspaces community provides a forum and office hours to assist faculty, staff, and students with research methodology, statistics, data science applications, and computational programming for research.
  • The Data Analysis Networking Group (DANG!) is a forum for U-M post-docs, grad students, and other researchers to discuss how to analyze, present, and visualize their data.
  • The U-M Software / Data Carpentries is building a community of excellence at U-M around the area of reproducible data analysis.
  • The MIDAS Reproducibility Hub promotes reproducible data science by raising awareness, celebrating best practices, enabling the scholarly investigation of reproducible research, and developing tools that can be widely adopted.

Questions?

We are interested in hearing from you! If you have any questions or relevant information or resources to add to this site, please complete this form.