During the project

 
During the research project, data will be collected, and the methods used to collect it must be documented.

Active data

 How are collections and methods organised? (Hint: Organizing files and folders) 
 How much storage space will the data consume? 
 Which data formats are used (ARC, WARC, CDX, PDF, CSV, ...)? 

Version Control

 Which files are version controlled (sensitive data must not be placed under version control)? 
 Which version control system is used? 
 Which version of the version control system is used? 

Durable formats

Endorsed and published by standards agencies (http://www.digitalpreservation.gov/formats/fdd/descriptions.shtml, http://www.digitalpreservation.gov/formats/) 
 Publicly documented, i.e. complete authoritative specifications are available 
 Widely used and accepted as best practice 

Data validation and authentication

 Current data volume (total size in MB/GB/TB) and likely rate of growth 
 Number of files and folders, and how they are organised 
 Platform: Mac/Windows/Linux 
 Applications used to access and work with your data 
 Frequency of update, e.g. working data that changes daily, or data from a project that needs to be retained but will rarely be used 
 Data type(s): spreadsheets, databases, documents, images, datasets, etc. 
 Any special security needs, e.g. personal data, commercial potential 
 Access control: Who needs access to which areas? Do they have access to Netarkivet? If not, where are they from and who are they, e.g. journalists, lawyers, journals etc. 
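
The questions about data volume and the number of files and folders can be answered by scanning the project directory. The following is a minimal Python sketch, not a prescribed tool; the function names `inventory` and `human_size` are made up for illustration, and sizes are reported in decimal units (1 MB = 1,000,000 bytes) as an assumption.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Summarise a data directory: files, folders, total bytes, files per extension."""
    root = Path(root)
    n_files = 0
    n_folders = 0
    total_bytes = 0
    by_extension = Counter()
    for path in root.rglob("*"):  # walks all sub-folders recursively
        if path.is_dir():
            n_folders += 1
        elif path.is_file():
            n_files += 1
            total_bytes += path.stat().st_size
            # Group files without an extension under "(none)"
            by_extension[path.suffix.lower() or "(none)"] += 1
    return {
        "files": n_files,
        "folders": n_folders,
        "total_bytes": total_bytes,
        "by_extension": dict(by_extension),
    }


def human_size(n_bytes):
    """Format a byte count using decimal units (assumption: 1 kB = 1000 B)."""
    size = float(n_bytes)
    for unit in ("B", "kB", "MB", "GB"):
        if size < 1000:
            return f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} TB"
```

Running `inventory` at regular intervals and recording the result gives a rough estimate of the rate of growth. The per-extension counts also give a first overview of the data formats in use.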

Backup

 Is there a backup strategy? 
 How many copies are there? 
 Are they stored in a different location from the main data storage? 
 Are they stored securely (for instance, if they contain sensitive data)? 
 On which devices are they stored? 
 How prone is the device to write errors? 
 Is there a plan to periodically 'refresh' the data (i.e. copy it to a new disk, USB stick, or portable drive)? 
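
One way to check that a backup copy is intact, for example after a 'refresh' onto a new device, is to compare checksums of the original files and their copies. The sketch below assumes that the backup mirrors the folder structure of the main storage and uses SHA-256; the function names `sha256_of` and `compare_backup` are hypothetical, and this is an illustration rather than a complete backup tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_backup(primary_dir, backup_dir):
    """Compare each file under primary_dir with the same relative path in backup_dir."""
    primary_dir, backup_dir = Path(primary_dir), Path(backup_dir)
    missing, mismatched = [], []
    for source in sorted(primary_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(primary_dir)
        copy = backup_dir / relative
        if not copy.is_file():
            missing.append(relative.as_posix())     # no copy exists in the backup
        elif sha256_of(source) != sha256_of(copy):
            mismatched.append(relative.as_posix())  # copy differs from the original
    return {"missing": missing, "mismatched": mismatched}
```

An empty `missing` list and an empty `mismatched` list indicate that every file in the main storage has a byte-identical copy in the backup. The check does not report extra files that exist only in the backup.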

Organizing and documenting data

 
 
You should create and maintain sufficient documentation or metadata (i.e. structured information about the data) to enable research data to be identified, discovered, associated with its owners and creators, linked to other related data or publications, contextualised in time and space, and to have the quality of the data assessed and research results validated.
If you document your data poorly, it will be difficult (or impossible) to find and manage it in the longer term. Even if you (or others, in the future) can find the data, its value will be diminished if it is hard to interpret. You should always ensure that documentation protocols are agreed early in the project and applied consistently by all researchers. 

Metadata standard

What metadata standard is chosen?

Metadata and Recordkeeping

Metadata or data documentation is critical to every research project. Appropriate records must accompany all data throughout the research cycle, continuing into the inactive storage stage, and be included in the metadata.

Data documentation explains how the data was created or digitized, what the data means, what its content and structure are, and any manipulations that may have taken place. This ensures that the data can be understood during the research project, that researchers continue to understand it in the longer term, and that re-users are able to interpret it. Good documentation is also vital for successful data preservation.

...

File naming

Digital file names can be important for identifying and finding digital files. You should develop file naming conventions early in a research project, and agree on these with colleagues and collaborators before data is created. (Hint: Organising data: file naming) 
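
Once a naming convention has been agreed, it can be checked automatically. The sketch below assumes a purely hypothetical convention, `<project>_<YYYY-MM-DD>_<description>_v<NN>.<ext>` in lower case; the pattern and the function name `parse_name` are illustrative only and should be replaced by whatever the project actually agrees on.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<YYYY-MM-DD>_<description>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"(?P<project>[a-z0-9]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<description>[a-z0-9-]+)_"
    r"v(?P<version>\d{2})"
    r"\.(?P<extension>[a-z0-9]+)"
)


def parse_name(file_name):
    """Return the parts of a conforming file name, or None if it breaks the convention."""
    match = NAME_PATTERN.fullmatch(file_name)
    if match is None:
        return None
    parts = match.groupdict()
    try:
        date.fromisoformat(parts["date"])  # rejects impossible dates such as 2024-13-40
    except ValueError:
        return None
    parts["version"] = int(parts["version"])
    return parts
```

For example, `parse_name("webcrawl_2024-03-01_seed-list_v02.csv")` returns the project, date, description, version and extension as a dictionary, while a name such as `Final version (2).csv` returns `None`. This particular pattern does not accept double extensions such as `.warc.gz`.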

Controlled vocabularies

What vocabulary is used?

A vocabulary sets out the common language a discipline has agreed to use to refer to concepts of interest in that discipline. It models the concepts in a discipline by applying labels to the concepts and relating the concepts to each other in a formal structure. Vocabularies take many forms. They include glossaries, dictionaries, gazetteers, code lists, taxonomies, subject headings, thesauri, semantic networks and ontologies. Wherever possible, you should use an existing controlled vocabulary. Even if you need to adapt or customise an existing standard, this is preferable to creating something from scratch. (Vocabularies and research data)