
Datasets can be made publicly available in three ways:
- In the article itself: for small datasets that can be presented in full in a table.
- In the supporting information: for medium-sized datasets that can be presented in large tables or compressed files, which can be downloaded online from the journal website.
- In a data repository: for large datasets (e.g., DNA sequences) that need large database infrastructures to store them.
Are there exceptions to these mandatory policies?
In certain cases, datasets are too large to share, or they contain human patient data that cannot be made publicly available for ethical reasons. In such cases, it is recommended that you contact the target journal to discuss possible solutions [1].
Which repository?
Many journals provide lists of recommended subject-specific repositories. A good example can be found in the PLOS ONE Data Availability Policy. Alternatively, you can search for appropriate repositories using the registry re3data.org.
Are there costs?
The cost of depositing data in a repository varies. Dryad charges $120 per dataset (<20 GB); however, it offers a waiver for countries with low-income economies [4]. Both Nature and some Royal Society journals (Biology Letters, Proceedings B and Royal Society Open Science) cover the cost of depositing data (<20 GB) in Dryad and Figshare [5], two large generalist repositories.
When it comes to data sharing, it is better to provide as much information as possible. Open, transparent sharing of data not only benefits the scientific community but also wins the favour of taxpayers.
References
1. PLOS ONE. (2016) Data Availability. PLOS ONE. Retrieved from http://journals.plos.org/plosone/s/data-availability on 15 November 2016.
2. Wozniak, M.B., Scelo, G., Muller, D.C., Mukeria, A., Zaridze, D. and Brennan, P. (2015) Circulating microRNAs as non-invasive biomarkers for early detection of non-small-cell lung cancer. PLOS ONE 10(5), p.e0125026.
3. Butler, T.M., Johnson-Camacho, K., Peto, M., Wang, N.J., Macey, T.A., Korkola, J.E., Koppie, T.M., Corless, C.L., Gray, J.W. and Spellman, P.T. (2015) Exome sequencing of cell-free DNA from metastatic cancer patients identifies clinically actionable mutations distinct from primary disease. PLOS ONE 10(8), p.e0136407.
4. Dryad. (2016) Data publishing charges. Dryad. Retrieved from http://datadryad.org/pages/payment on 15 November 2016.
5. The Royal Society. (2016) Data sharing and mining. The Royal Society. Retrieved from https://royalsociety.org/journals/ethics-policies/data-sharing-mining/ on 15 November 2016.