Research Data Stewardship FAQs
This seems focused primarily on STEM fields. What about the humanities or arts?
Most federal agencies, including the National Endowment for the Humanities Office of Digital Humanities, require data management plans. The National Endowment for the Arts is also embracing public access initiatives, including supporting the National Archive of Data on Arts and Culture.
What does “research data” actually mean?
“Data” can mean a lot of things depending on the research context and discipline. According to the National Institutes of Health, research data includes the recorded factual material commonly accepted in research communities as necessary to validate and replicate findings, regardless of whether the data are used to support scholarly publications. This definition would exclude preliminary analyses, completed case report forms, drafts of publications, plans for future research, peer reviews, or communications with colleagues. But in some disciplines, research data also includes physical objects, such as specimens, archival materials, collections, and notebooks. If you have questions about what constitutes data in your field or project, or wonder what types of data should be shared (e.g., raw vs. treated) in your specific context, contact your department’s librarian or the library’s Research Data Services at firstname.lastname@example.org
Are materials or other physical samples considered data? How do I make those publicly available?
The answer varies depending on the context. In some disciplines, agencies or journals will expect the permanent archiving of physical specimens and/or biological samples. U-M has a central biorepository and a number of world-class museums across the natural sciences that can assist researchers with depositing specimens into collections. Other repositories or practices may be commonplace in other disciplines. Be sure to consult funder or journal policies to confirm expectations.
What should I do if I suspect research misconduct has taken place and resulted in compromised integrity of research data?
You should report your concerns to U-M’s research integrity office in the Office of the Vice President for Research (UMOR.Research.Integrity@umich.edu). In cases when research integrity allegations arise, the University may request original research data and other research records as evidence when those allegations are being reviewed.
Do policies or guidance differ for members of the research community depending on their role or rank?
In general, stewardship of research data is a shared responsibility of all researchers, regardless of their title. Students, trainees, and staff should work with their advisor or the principal investigator (PI) on the project to determine best practices for individual projects or research groups. PIs have additional responsibilities required or implicated by applicable policy or agreements, including but not limited to: ensuring proper management and retention of research data; creating and keeping sufficient records of data and procedures; protecting and maintaining confidentiality of sensitive research data; communicating expectations to the research team; ensuring compliance with program requirements; complying with applicable federal, state, and local laws and regulations; and complying with University rules on the ownership of data, inventions, or tangible research property. If a PI chooses to delegate responsibility (e.g., within a research group), the PI will remain accountable for the proper stewardship of research data as described above.
A data management plan is a useful document for specifying a project’s or research group’s data practices, including who is responsible for which aspects of the data and its care.
Does U-M have formal policy relating to its own administrative data?
Yes, SPG 601.12 addresses “institutional data,” which are data that may be relevant to planning, managing, operating, controlling, or auditing administrative functions of an administrative or academic unit of the University, among other things.
How is U-M working with other academic institutions on topics related to research data?
U-M has been actively engaging on research data stewardship issues with member organizations like the Association of American Universities (AAU), the Association of Public and Land-grant Universities (APLU), and the Association of Research Libraries (ARL). U-M participated in several workshops and discussions facilitated by APLU and AAU on accelerating public access to research data, and also joined the Higher Education Leadership Initiative for Open Scholarship (HELIOS).
What is the difference between “open access” and “public access”?
Open access generally describes datasets or publications that are freely available to users/readers with no restrictions on accessibility, (re)use, and redistribution. Public access generally refers to data or information that is readily discoverable and accessible to other researchers and people outside of the research project, but may also have some restrictions on accessibility, reuse, or redistribution. The U-M Library has a research guide on open access publishing.
Are there new expectations for making publications freely available or publishing in open access journals too?
There is no federal or U-M policy requiring researchers to publish in open access journals. However, some federal agencies and private funders have requirements for making publications open and/or available. Depositing pre-publication (accepted or submitted) versions of publications into preprint servers or other repositories (like NIH’s PubMed Central or U-M’s Deep Blue Documents) often satisfies these requirements. The U-M Library has a research guide on open access publishing.
For researchers wishing to explore more options for publishing their work in open access journals, the U-M Library now has agreements in place with several publishers to provide discounted publishing fees to U-M researchers.
How does software and/or code fit into public access requirements?
Many journals now require software and/or code for an analysis to be made available at the time of publication, alongside data. Consult these best practices for additional guidance and resources.
Researchers should understand a journal’s public access policy prior to submission so that any intellectual property concerns related to inventions and associated code, models and frameworks can be addressed ahead of time. Contact U-M Innovation Partnerships with any related questions about releasing code or open source software.
How am I supposed to share all of my data if I also am expected to protect it for various reasons (e.g., it potentially contains sensitive or personally identifiable material)?
Researchers must balance making data publicly available with providing necessary protections or restrictions on potentially sensitive research data. These restrictions are in place to protect human participants, data technologies subject to export control, and controlled unclassified information. Whenever possible sensitivities exist, refer to any standing data use agreements, IRB protocols, or other guidance related to your specific data needs. Failure to maintain protections or restrictions could expose the researchers, institution, and participants to significant legal, reputational, and security risks.
When writing a data management or sharing plan, be sure to provide details about any protections that will be in place to secure the data during your research project, and any restrictions that may limit your ability to share the data more broadly.
Are there different expectations for researchers sharing research data internally with other researchers at U-M and sharing it externally?
Data should be made available to all members of a collaborative team or project, contingent upon necessary training or approvals. Otherwise, researchers are not expected to make their data any more or less accessible to other researchers at U-M than to researchers at other institutions.
I’m concerned about my research being “scooped” if I’m required to share data more openly. What can I do to prevent this?
Someone else claiming priority to a research idea, or reusing data for subsequent analyses or publications without credit, is regularly brought up as a reason not to share data. However, sharing data through a trusted repository provides several ways to alleviate this concern. First, journals increasingly expect that the data that underlie research findings are shared concurrently with the publication of the findings, and publishing your data at that time eliminates the possibility of being scooped by someone else. Second, even if the data need to be shared before the research findings are published, most repositories offer the option of embargoing the data for a specified amount of time. Embargoing the data demonstrates compliance by creating a record of the data while withholding direct access to the data for a reasonable amount of time. Third, depositing data into a repository provides a means to register it, by providing a timestamp indicating when the data were available, as well as identifiers to enable others to cite the data appropriately. In the rare case that a question should arise about the reuse of the data by others, the owners of the original deposited data can use this information to demonstrate the data’s provenance and their role in generating it.
Data Ownership & Retention
What should I do with my research data if I am planning to leave U-M?
Subject to any applicable restrictions (e.g., in sponsor agreements or law), ownership of the original data may be transferred from the University to a new institution with prior written approval of the PI and/or department chair, and any sponsor that requires approval. The new institution must guarantee acceptance of ongoing custodial responsibilities for the data, ensure that U-M has access to the original data for any reason, and ensure that any relevant or appropriate confidentiality restrictions are maintained. Datasets containing directly or indirectly identifiable human subjects research data may not be transferred outside of the University without IRB review and approval. For students and other trainees under the supervision of a PI, original data is preferably retained at the University by the PI.
Who owns the research data I collected while working at U-M?
The University owns research data and materials generated for or by projects conducted at the University. Based on guidance from the Association of American Universities (AAU) and the Association of Public and Land-grant Universities (APLU), the University is working to clarify and reinforce data ownership policies and expectations, including rights and responsibilities for the institution and researchers, in a new research data policy.
This is distinct from scholarly, academic, and artistic works, for which the University places copyright with the creators (see question below).
Do I hold the copyright to my research data?
No. The simplest way to think of the issue is that research data themselves are not copyrightable. Faculty own the copyright in Scholarly Works (defined in SPG 601.28) and have at least some right to use data in Scholarly Works, though the University retains ownership of research data and materials, as described in a prior FAQ.
More specifically, copyright, a form of intellectual property law, protects original works of authorship including literary, dramatic, musical, and artistic works, such as poetry, novels, movies, songs, computer software, and architecture. Copyright protects expression, and not ideas, procedures, methods, systems, processes, concepts, principles, or discoveries. To be protectable by copyright, a work must have a “modicum” of creativity, and facts by themselves are not protected by copyright. Therefore, data, as a collection of facts, is not protected by U.S. copyright law. Databases as a whole can be protected by copyright as a compilation, as can some manners of presentation of data, but only under certain conditions.
How long must I keep a copy of my data?
There is currently no institutional policy on the retention of research data. Therefore, the default expectation is that researchers retain their research data in compliance with sponsor or publisher mandates. For NIH- and NSF-sponsored research, this is typically three years past the date of the submission of the final financial report or all required reports, respectively. Retention times for some other types of studies, including health information abstracted from non-clinical sources or dental research involving human subjects, may extend up to 10 years. Please check your sponsor’s and/or publisher’s requirements to ensure that you are meeting compliance obligations.
If your project included a sponsor-approved data management plan, be sure to adhere to any timelines you included in your plan.
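For researchers who track retention dates in a script or spreadsheet export, the typical three-year period described above can be sketched as a simple date calculation. This is only an illustrative sketch: the function name and the default of three years are assumptions made for this example, not an official U-M or sponsor tool, and the actual start date and period must come from your sponsor’s or publisher’s requirements.

```python
from datetime import date


def retention_end_date(report_submitted: date, retention_years: int = 3) -> date:
    """Return the date on which a retention period of `retention_years`
    (hypothetical default: 3) has elapsed after the final report submission.

    Illustrative only: confirm the real start date and period with your sponsor.
    """
    target_year = report_submitted.year + retention_years
    try:
        return report_submitted.replace(year=target_year)
    except ValueError:
        # The report was submitted on Feb 29 and the target year is not a
        # leap year; use Mar 1 so the period is never cut short.
        return date(target_year, 3, 1)


if __name__ == "__main__":
    # Hypothetical final report submission date.
    print(retention_end_date(date(2024, 6, 30)))
    # A study with a longer (e.g., 10-year) requirement.
    print(retention_end_date(date(2024, 6, 30), retention_years=10))
```

If your data management plan commits to a longer timeline than the sponsor minimum, use the longer of the two dates.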