RDM Phase 2

Remember: Managing data in a research project is a process that runs throughout the project. Good data management is one of the foundations of reproducible research. It is essential to ensure that data can be preserved and remain accessible in the long term, so that it can be re-used and understood by future researchers. Begin thinking about how you’ll manage your data before you start collecting it.

[Figure: data life cycle]

A storage place has to be found for research data and primary materials, after appropriate consideration of location, security and adequate environmental control.

Throughout the research project, researchers will create and maintain full and accurate records of the research methods and data sources used, for example in notes, diary entries and laboratory books.

All data and materials must be afforded due care and protection. Regardless of the format of the materials/data, they must be protected from damage and handled with care. Data or materials must not be removed from the storage provided without authorization during the active phase.

Sensitive records/data will be appropriately protected from unauthorized access.

Backup

Researchers and research administrators must have a backup strategy that allows data to be recovered after a loss and/or restored to its state at a particular point in time. It may not be possible to store the data in the State and University Library; in such cases a suitable alternative location has to be found for the backups. More than one (1) backup copy should be made, backups should be performed regularly, and the copies should be housed remotely from the main data storage. The backups should be labelled and well organised to facilitate any data restoration process.
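As an illustration of a labelled and verifiable backup, the following minimal Python sketch copies a file into a backup folder under a labelled name and compares checksums to confirm that the copy is intact. The function names, the label format and the folder layout are assumptions made for this example, not part of any prescribed procedure.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path, label: str) -> Path:
    """Copy `source` into `backup_dir` under a labelled name and verify the copy.

    `label` is a hypothetical naming convention (for example a date).
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"{label}_{source.name}"
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} failed verification")
    return target
```

In practice the backup folder would sit on separate, remote storage, and the same checksum comparison can be repeated later to detect a backup that has degraded.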

Backup security requires further mention. If the data is sensitive, it should not be stored on a computer that is connected to the internet, and preferably not on one connected to any network. If the data needs to be destroyed at the end of a project, consider what level of destruction is required. Simply deleting files does not remove the data from a hard drive; the drive has to be overwritten. Current guidance generally treats a single complete overwrite pass as sufficient for modern magnetic hard drives, whereas solid-state drives need the device's own secure-erase function or physical destruction. Very high-security institutions, such as Defence, require hard disks to be physically destroyed and optical discs to be shredded.

The lifetime of backups should also be considered. Burned optical discs can have an average lifetime of as little as two years, or about five years if kept in a cool, dark place.

It is a good idea to check with your IT staff to find out how often they back up data, the maximum amount of data they can back up, and how long they keep old backups.

You may need to maintain your own backups if:

  • There are no services available to you
  • You have valuable data that you do not want to entrust to others
  • You have sensitive data that you cannot store on unsecured computers (medical records, data for defence projects, etc.)

Version Control

Throughout the research data lifecycle, multiple versions of documents or files can be created, and mechanisms must be put in place to distinguish between the different versions.

Version control management can be achieved through:

  • research data access and editing privilege control;
  • selecting one individual to handle all manual editing of data; and
  • the use of version control software such as Git (for example, with repositories hosted on GitHub).

Version control ensures maintenance of a master file which documents all versions and all changes that are made to the research data.
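The idea of a master file that documents every version can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log format, the field names and the function `record_version` are hypothetical, and a dedicated version control system would normally take the place of a hand-rolled log like this.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_version(data_file: Path, log_file: Path, author: str, note: str) -> dict:
    """Append an entry describing the current state of `data_file` to a JSON log.

    The log is a hypothetical "master file": one entry per recorded version.
    """
    entries = json.loads(log_file.read_text()) if log_file.exists() else []
    entry = {
        "version": len(entries) + 1,  # versions are numbered from 1
        "file": data_file.name,
        "sha256": hashlib.sha256(data_file.read_bytes()).hexdigest(),
        "author": author,
        "note": note,  # short description of what changed
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    entries.append(entry)
    log_file.write_text(json.dumps(entries, indent=2))
    return entry
```

Each call records who changed the file, why, and a checksum of its content, so two versions can later be told apart even if the file name stays the same.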

Researchers must also consider the longevity of the software/hardware required to create and analyse research data. It is important to include documentation relating to software/hardware requirements in the Research Data Management Plan. It may be necessary to include a copy of the software version including any related metadata together with the research data (depending on software licensing conditions).

Data Validation and Authentication

Your data will be used to obtain the results and conclusions of your research, so it is important to ensure its accuracy. Your data may also become an important dataset that is used by many others, so errors have the potential to hinder many research efforts.

It is therefore important to set up policies and practices to ensure the accuracy and authenticity of your data. 
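One simple practice of this kind is an automated range check run on the data before analysis. The sketch below is a minimal example in Python; the column name, the permitted range and the function `validate_rows` are assumptions chosen for illustration, and it assumes one record per line in the CSV text.

```python
import csv
import io


def validate_rows(csv_text: str, rules: dict) -> list:
    """Return (line_number, column, problem) tuples for values that break a rule.

    `rules` maps a column name to a (minimum, maximum) numeric range.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for column, (low, high) in rules.items():
            raw = row.get(column)
            try:
                value = float(raw)
            except (TypeError, ValueError):
                problems.append((line_number, column, "missing or not a number"))
                continue
            if not low <= value <= high:
                problems.append((line_number, column, f"{value} outside {low}..{high}"))
    return problems
```

For example, calling `validate_rows(text, {"temperature_c": (-50.0, 60.0)})` on a file with a hypothetical `temperature_c` column lists every row whose value is missing, non-numeric or outside the stated range, so that errors can be corrected at the source.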

Access Controls

Well-defined access controls help you comply with privacy and confidentiality policies and help maintain data authenticity by limiting who can modify data. The access controls may change throughout the life of the research project. Initially, all data will usually be restricted to the research group; once the results are published, the data may be made available to other researchers.
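At the level of a single file, limiting who can modify data can be as simple as removing write permission once a dataset is final. The following Python sketch shows this for POSIX-style file permissions; the function names are hypothetical, and shared or institutional storage will usually have its own permission system that should be used instead.

```python
import stat
from pathlib import Path

# All three write bits: owner, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path: Path) -> None:
    """Remove write permission for everyone, keeping the other permission bits."""
    current = stat.S_IMODE(path.stat().st_mode)
    path.chmod(current & ~WRITE_BITS)


def is_writable_by_anyone(path: Path) -> bool:
    """Report whether any write bit is still set on the file."""
    return bool(path.stat().st_mode & WRITE_BITS)
```

Note that this only guards against accidental edits by ordinary users; it does not replace proper user accounts, group membership and audit logging for sensitive data.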

Metadata and Recordkeeping

Metadata or data documentation is critical to every research project. Appropriate records must accompany all data throughout the research cycle, continuing into the inactive storage stage, and be included in the metadata.

Data documentation explains how the data was created or digitised, what the data means, what its content and structure are, and any manipulations that may have taken place. This ensures that the data can be understood during the research project, that researchers continue to understand it in the longer term, and that re-users are able to interpret it. Good documentation is also vital for successful data preservation.

Metadata is typically used for resource discovery, providing searchable information that helps users to find existing data, as a bibliographic record for citation, or for online data browsing. Researchers and research administrators must plan a process for documenting data and record it in the Data Management Plan before data collection begins. This will make data documentation easier and reduce the likelihood that aspects of the data are forgotten later in the research project.
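A documentation process is easier to follow when the expected fields are fixed in advance. The Python sketch below shows one possible way to capture a small metadata record and store it as JSON next to the data. The class `DatasetRecord` and its field names are assumptions for illustration only; they do not follow any particular metadata standard, and a real project should use the schema required by its repository or funder.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical minimal metadata record for one dataset."""
    title: str
    creator: str
    description: str        # what the data means
    collection_method: str  # how the data was created or digitised
    file_format: str        # content and structure
    keywords: list = field(default_factory=list)          # for resource discovery
    processing_steps: list = field(default_factory=list)  # manipulations applied


def to_json(record: DatasetRecord) -> str:
    """Serialise the record as JSON with a stable key order."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Filling in such a record at the moment data is collected, rather than at the end of the project, is what keeps details from being forgotten.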

Destruction

Research data that is scheduled to be destroyed must be reviewed and authorized for destruction by the data owner. Data must not be destroyed without written authorization and documentation of the data and the destruction processes used. 


Links

Organising files and folders

Data Documentation