Research Data Management: How to select a data repository

The 4 Ws of sharing data

Why should I submit my data to a repository?

Submitting data to a repository has many benefits, including long-term preservation of data, an avenue to share null data or other data from a study that did not result in publication, increased visibility of data for reuse, and generation of a DOI for datasets to track usage and citations.

What can I submit?

  • Appropriate data to validate and replicate findings from published studies
  • Data from a study even if not directly linked to a publication
  • Null findings that do not result in a publication

For example, all the data that you traditionally shared in the supplemental files of a manuscript can be maintained in a repository.

When do I submit?

It depends. Some publishers require submission of data to a repository either before or during the manuscript submission process. Note: to comply with the 2023 NIH Data Management and Sharing Policy, data must be made available at the time of publication or at the end of the funding period, whichever comes first.
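
The "whichever comes first" rule above amounts to taking the earlier of two dates. A minimal Python sketch, assuming you know the funding end date and possibly an expected publication date; the function name `sharing_deadline` and the example dates are hypothetical and used only for illustration:

```python
from datetime import date
from typing import Optional


def sharing_deadline(publication_date: Optional[date], funding_end: date) -> date:
    """Return the date by which data should be available.

    Illustrative helper only: it takes the earlier of the publication
    date and the end of the funding period. If no publication is planned
    yet (None), the end of the funding period applies.
    """
    if publication_date is None:
        return funding_end
    return min(publication_date, funding_end)


# Hypothetical dates, for illustration only.
deadline = sharing_deadline(date(2025, 3, 1), date(2026, 6, 30))
print(deadline.isoformat())
```

Publisher requirements may set an earlier date than this, so treat the result as the latest acceptable date rather than a target.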

Where should I submit?

Selecting a repository is a personal decision and no single repository will fit all needs. Luckily, for most publishers and grant submissions, you have the freedom to select a repository that suits your needs.

  • List of NIH-supported, open, subject-specific data sharing repositories. These are open databases for submitting and accessing data.
  • DataONE: multiple member repositories for Earth and environmental data.
  • If no subject-specific repository fits your needs, check out the generalist repositories (more details below).
  • Or use this interactive repository finder to narrow down your options.

For a broader list of repositories, check out a repository database such as:

Generalist repositories

Generalist repositories broadly accept data regardless of data type, format, content, or disciplinary focus. A comparison chart can help you determine which generalist repository fits your needs.

Desirable characteristics for data repositories

This table of characteristics is based on the Supplemental Information to the NIH Policy for Data Management and Sharing: Selecting a Repository for Data Resulting from NIH-Supported Research. The policy also includes seven additional desirable characteristics for repositories that store human data.

| Characteristic | Description |
| --- | --- |
| Persistent unique identifiers | Assigns datasets a citable PID to support data discovery and reporting |
| Long-term sustainability | Long-term plan for managing data; builds on stable technical infrastructure and funding; contingency plans for unforeseen events |
| Metadata | Ensures datasets are accompanied by sufficient metadata to enable discovery, reuse, and citation |
| Curation & quality assurance | Provides expertise to improve the accuracy and integrity of datasets and metadata |
| Free and easy access | Maximizes timely open access to datasets and their metadata free of charge, consistent with legal and ethical limits, Tribal sovereignty, and protection of other sensitive data |
| Broad and measured reuse | Makes datasets and their metadata available with the broadest possible terms of reuse, and provides the ability to measure attribution, citation, and reuse of data |
| Clear use guidance | Provides accompanying documentation describing terms of dataset access and use |
| Secure | Has documented measures in place to prevent unauthorized access to, modification of, or release of data, with levels of security commensurate with the sensitivity of data |
| Confidentiality | Has documented capabilities for ensuring that administrative, technical, and physical safeguards are employed to comply with applicable confidentiality, risk management, and continuous monitoring requirements for sensitive data |
| Common format | Access to data and metadata from the repository is formatted in widely used, non-proprietary formats |
| Provenance | Records the origin, chain of custody, and any modifications to submitted datasets and metadata |
| Retention policy | Provides documentation on policies for data retention within the repository |
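
The characteristics above can be treated as a checklist when comparing candidate repositories. A minimal Python sketch, assuming you have noted which characteristics a repository documents; the function name `missing_characteristics` and the example set of documented characteristics are hypothetical and do not describe any real repository:

```python
from typing import Iterable, List

# The twelve desirable characteristics listed above.
DESIRABLE_CHARACTERISTICS = [
    "Persistent unique identifiers",
    "Long-term sustainability",
    "Metadata",
    "Curation & quality assurance",
    "Free and easy access",
    "Broad and measured reuse",
    "Clear use guidance",
    "Secure",
    "Confidentiality",
    "Common format",
    "Provenance",
    "Retention policy",
]


def missing_characteristics(documented: Iterable[str]) -> List[str]:
    """Return the characteristics a repository does not document.

    Illustrative helper only: `documented` is whatever you recorded
    while reading the repository's own policies.
    """
    documented_set = set(documented)
    return [c for c in DESIRABLE_CHARACTERISTICS if c not in documented_set]


# Hypothetical notes for an imaginary repository, for illustration only.
notes = set(DESIRABLE_CHARACTERISTICS) - {"Provenance", "Retention policy"}
for gap in missing_characteristics(notes):
    print("Not documented:", gap)
```

A gap in the output is a prompt to ask the repository or look for a better fit, not an automatic disqualification; repositories that store human data have additional characteristics to check.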