Services
Good data deserves to be remembered. The best way to keep data from being lost is to make sure it is properly archived. Much important data has been lost over time because storage devices became corrupted, failed, went missing, or were destroyed.
Research data repositories
Wherever possible, the best way to archive important research data is to upload it to a reputable data repository. While data is in your custody as a researcher, you often have to pay for storage and maintain regular backups to ensure files are not corrupted. Depositing in a research data repository means the repository curators take care of your dataset, so you don't have to! When you deposit data, you can also easily make it openly available or share it with restrictions. Even if your dataset is too sensitive to be published openly, it can often be de-identified and deposited under restricted access so that it is still archived.
Careful long-term storage
However, some data is so sensitive that it can never be shared publicly, even in a de-identified format. Even if you can't share or publish data openly, it's important to make sure it is carefully archived once a project is complete. You don't want to end up with a dusty hard drive full of disorganized files that are confusing when you open them up again. Hard drives and external storage devices also have limited lifetimes and will eventually fail, so they aren't suitable for long-term archiving. Our Research Data Storage Finder Tool has an overview of storage options.
Before archiving data for long-term storage, prepare your files as follows:
- Ensure that the data is stored in sustainable, open file formats (for example, CSV rather than a proprietary spreadsheet format). Open file formats help ensure access to your data over the long term, because proprietary software can disappear when companies go out of business.
- Clean up your files. Delete any files that are no longer relevant (provided you don't need to keep them for ethics purposes), give the remaining files understandable names, and organize them into clearly labeled folders to create an archive-ready package.
- Data should be well documented and easily understandable. Research data should be packaged with metadata, so-called ‘data about data’. This documentation helps your future self unpack and understand your files and how they were created. It is commonly provided in readme files, codebooks, or data dictionaries. Metadata should include information about:
  - File title, file format, language, creator, and date
  - Data variable descriptions, including data type, allowable values, and calculations used (if applicable)
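To make the documentation step more concrete, here is a minimal Python sketch that drafts a metadata record for a CSV file, covering the file-level fields and a starter data dictionary. Everything in it is an illustrative assumption rather than a required format: the function names (`infer_type`, `build_data_dictionary`, `build_metadata`), the field names, and the sample data are all hypothetical. The script can only report the values it observes in the file, so allowable values, descriptions, and any calculations still need to be filled in by someone who knows the data.

```python
import csv
import io
import json
from datetime import date


def infer_type(values):
    """Guess a simple data type from a column's non-empty string values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "empty"
    # Try the strictest type first: every value must parse for it to apply.
    for caster, name in ((int, "integer"), (float, "number")):
        try:
            for v in non_empty:
                caster(v)
            return name
        except ValueError:
            continue
    return "string"


def build_data_dictionary(csv_text, max_categories=10):
    """Return one draft entry per column (hypothetical field names)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        values = [(row[name] or "") for row in rows]
        distinct = sorted({v for v in values if v != ""})
        entries.append({
            "variable": name,
            "type": infer_type(values),
            # Observed values are only a hint at the allowable values.
            "observed_values": distinct if len(distinct) <= max_categories else None,
            "description": "TODO: describe this variable",
        })
    return entries


def build_metadata(title, creator, csv_text, language="en", created=None):
    """Combine file-level metadata with the draft data dictionary."""
    return {
        "title": title,
        "file_format": "text/csv",
        "language": language,
        "creator": creator,
        "date": (created or date.today()).isoformat(),
        "variables": build_data_dictionary(csv_text),
    }


# Made-up sample data, used only to show the shape of the output.
sample = (
    "participant_id,age,group,score\n"
    "1,34,control,7.5\n"
    "2,29,treatment,8.0\n"
    "3,,control,6.25\n"
)
record = build_metadata(
    "Example survey", "A. Researcher", sample, created=date(2024, 1, 15)
)
print(json.dumps(record, indent=2))
```

Saving the printed JSON next to the data file (or copying its contents into a readme) gives a draft to edit by hand. The type guess is deliberately simple, so review each entry before archiving.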