Research Data Management

Documentation and metadata

Data documentation means describing research data: what the datasets consist of, how they were created, and how they can be used. This descriptive information is called metadata. Documentation helps us understand, interpret and use data during and after the research process. High-quality documentation is part of the responsible conduct of research. It makes the data findable and makes it possible to cite it.

Data documentation helps you describe, both for yourself and for others, what your data is about. Plan data documentation as carefully as possible and start it early in the research process. Carefully planned and implemented documentation facilitates research work, whereas documenting data retrospectively is difficult, if not impossible.

The accuracy, extent and implementation of the documentation depend on the quantity and nature of the research data. During the research process, documentation focuses on content and facts that are relevant to conducting the research, such as variables and data collection methods. After the research process, a description of the data is drafted for the publication stage. This includes, for example, details on where the data is located and the rights to use it. The metadata describing the research data can be published even if the datasets themselves are not.

Best practices of documentation

  • Naming files and folders: create consistent file- and folder-naming conventions and follow them throughout the research process. Discuss with your research group and agree on consistent naming practices.
  • Folder structure: design a logical folder structure that is suitable for the research data.
  • README files: create a text file that contains the documentation.
  • Version control.
  • Metadata standards: Especially if the quantity of research data is large or the documentation is extensive, we recommend using a metadata standard suitable for the data. If you know where the data will be published, check the requirements of the archive in question.
  • Database and data management software: use existing software, which often also produces metadata automatically.
  • Glossaries: create a glossary in which e.g. the variables, terminology and abbreviations are explained.

Metadata standards

Metadata standards are models for describing research data. Many fields of research have their own metadata standards, and archives that receive research datasets often use a particular standard. For further information on different metadata standards, please consult the following resources:

README files

Below you will find general instructions on what the descriptions of the research project and the data should include, regardless of discipline. Attach this information to the research datasets as a readme.txt or a comparable file.

  • TITLE: Name of the dataset or research project that produced it.
  • CREATOR: Names and addresses of the organization or people who created the data.
  • DESCRIPTION of the data package and folder overview.
  • LOCATION: Where the data relates to a physical location, record information about its spatial coverage.
  • METHODOLOGY: How the data was generated, including the equipment or software used, the experimental protocol, and anything else you would record in a field notebook.
  • INFORMATION ABOUT DATA FILES
    • IDENTIFIER: Number used to identify the data files, even if it is just an internal project reference number.
    • LOCATION: Where to find data files and additional information such as the data dictionary explaining variables used.
    • DATES: Key dates associated with the data, including project start and end dates, data modification date, data release date, and the time period covered by the data.
    • SUBJECT: Keywords or phrases describing the subject or content of the data.
    • FILE FORMATS: What file formats have been used.
  • FUNDERS: Organizations or agencies who funded the research.
  • RIGHTS: Any known intellectual property rights held for the data.
  • LANGUAGE: Language(s) of the intellectual content of the resource, when applicable.
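
The field list above can be turned into a reusable template. The following Python sketch is one possible way to do that, not a required format: the function names build_readme and write_readme, the "TODO" placeholder for missing values, and the two-space indentation of the data file fields are all assumptions made for this illustration.

```python
from pathlib import Path

# Field names are taken from the list above; everything else is illustrative.
TOP_FIELDS = ["TITLE", "CREATOR", "DESCRIPTION", "LOCATION", "METHODOLOGY"]
FILE_FIELDS = ["IDENTIFIER", "LOCATION", "DATES", "SUBJECT", "FILE FORMATS"]
CLOSING_FIELDS = ["FUNDERS", "RIGHTS", "LANGUAGE"]


def _field_lines(fields, values, indent=""):
    """Format one 'FIELD: value' line per field; missing values become TODO."""
    return [f"{indent}{name}: {values.get(name, 'TODO')}" for name in fields]


def build_readme(general, data_files):
    """Return the README text with all fields in a fixed order.

    general    -- values for the project-level fields
    data_files -- values for the fields under INFORMATION ABOUT DATA FILES
    """
    lines = _field_lines(TOP_FIELDS, general)
    lines.append("INFORMATION ABOUT DATA FILES")
    lines += _field_lines(FILE_FIELDS, data_files, indent="  ")
    lines += _field_lines(CLOSING_FIELDS, general)
    return "\n".join(lines) + "\n"


def write_readme(folder, general, data_files):
    """Write readme.txt into the given dataset folder and return its path."""
    path = Path(folder) / "readme.txt"
    path.write_text(build_readme(general, data_files), encoding="utf-8")
    return path
```

Calling write_readme("my_dataset", {"TITLE": "Example survey"}, {"FILE FORMATS": "CSV"}) would create my_dataset/readme.txt (the folder must already exist) with every unfilled field marked TODO, which makes gaps in the documentation easy to spot.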

Naming files and folders

Decide on the file- and folder-naming practices early in the research project. The plan should be accurate and extensive enough to cover the needs of the entire research process. The purposes of the naming plan are:

  • to create human readable filenames that indicate file content and are easy to understand for you
  • to create human readable filenames that help other users of the files to easily interpret what information each file includes
  • to ensure that filenames are computer readable
  • to keep all of your research data organised and in a logical order.

Make sure that you are not using personal data or any other sensitive data in naming files or folders.
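
Once a naming convention has been agreed, it can be checked automatically. The Python sketch below assumes a hypothetical convention of the form project_content_YYYYMMDD_vNN.ext (lowercase, no spaces); this pattern and the function name check_filename are examples only, so substitute the rules your own research group has agreed on. Note that the check cannot detect personal or sensitive data in a name; that still requires human review.

```python
import re
from datetime import datetime

# Characters that are generally safe across operating systems and tools.
SAFE_CHARS = re.compile(r"^[A-Za-z0-9._-]+$")

# Hypothetical convention: project_content_YYYYMMDD_vNN.ext
CONVENTION = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<content>[a-z0-9-]+)"
    r"_(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def check_filename(name):
    """Return a list of problems found in a filename (empty if none)."""
    problems = []
    if not SAFE_CHARS.match(name):
        problems.append("contains spaces or special characters")
    match = CONVENTION.match(name)
    if match is None:
        problems.append("does not follow project_content_YYYYMMDD_vNN.ext")
    else:
        try:
            # Reject strings such as 20240230 that look like dates but are not.
            datetime.strptime(match["date"], "%Y%m%d")
        except ValueError:
            problems.append("date is not a valid YYYYMMDD date")
    return problems
```

For example, check_filename("soil_ph-measurements_20240315_v02.csv") returns an empty list, while check_filename("final data (new).xlsx") reports both a character problem and a convention problem.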

Guides and tools