Create metadata for your research project

How to create metadata for your research projects.

Create basic metadata

In all likelihood, you are already capturing basic metadata about your research. Your lab notebooks and research files hold much, if not all, of this information, such as:

  • Researcher name
  • Date
  • Project
  • Details of the experiment/analysis being run, including the purpose and methods used
  • Sources of other data used in the experiment/analysis

The key is to collect all the necessary information (metadata) as you work and then link that metadata to the data files themselves.

If you are the only person using these data, the metadata may not need to be highly structured in order to be useful. However, the metadata should be fairly complete. This will help you later in referring back to these files. It will also make the future structuring of your metadata into a formalized standard easier and less painful.

Track metadata 

Consider one or more of these methods for tracking metadata and data files:

  • Keep a notebook with information about your projects, noting the locations and names of digital files associated with individual experiments. Include actual links to the files if your notebook is digital.
  • Include a note in each data file that indicates the location of metadata.
  • In each folder on your computer that contains research data, include a text file that describes the contents of the files in that folder, including explanations of abbreviations and column headers in the files. You might also want to include references to publications that describe the data.
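The third method above, a descriptive text file in each data folder, can be sketched in a few lines of Python. This is only an illustration: the function name `write_folder_readme`, the file name `README.txt`, and the section headings are assumptions, not a prescribed format.

```python
from pathlib import Path


def write_folder_readme(folder, descriptions, abbreviations=None, references=None):
    """Write README.txt in `folder` and return its path.

    descriptions:  dict mapping file name -> one-line description
    abbreviations: optional dict mapping abbreviation or column header -> meaning
    references:    optional list of publication citations (plain strings)
    """
    folder = Path(folder)
    lines = ["Contents of this folder", ""]
    for name in sorted(descriptions):
        # Flag entries that describe a file that is not actually in the folder.
        status = "" if (folder / name).exists() else " (file not found)"
        lines.append(f"{name}: {descriptions[name]}{status}")
    if abbreviations:
        lines += ["", "Abbreviations and column headers", ""]
        lines += [f"{abbr} = {meaning}" for abbr, meaning in abbreviations.items()]
    if references:
        lines += ["", "Related publications", ""]
        lines += list(references)
    readme = folder / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Calling `write_folder_readme("my_data_folder", {"results.csv": "Band intensities per lane"})` (both names hypothetical) would create `my_data_folder/README.txt`. Keeping the descriptions in a plain text file means they stay readable without any special software.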

You may not need your metadata to be very structured in order to understand the contents of your files right now. However, including as much structure as you can may help you better or more quickly understand the data in the future. It will also help others understand your data without requiring assistance or explanations directly from you.

You can add structure to your metadata by creating fields that meet your immediate needs and storing the metadata in a text file. Below is one example of some basic structured information about a particular experiment. This tabular structure could easily be stored as a .csv file.

Field                     Value
Researcher                Jane Doe
Project                   Analysis of the ubiquitin-mediated degradation of S. cerevisiae SIC1p
Experiment                Western blot to verify expression of GAL-SIC1p-HA fragments following galactose induction
Date                      March 15, 1997
Induction Time            3 hours
Induction Temperature     30°C
Background Strain         EY957
Background Strain Source  John Smith
Antibody                  3F10
Antibody Source           Boehringer-Mannheim
Detector                  Molecular Dynamics Storm 860 phosphorimager
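As a minimal sketch of storing this information as a .csv file, the example below uses Python's standard `csv` module. The column headers `Field` and `Value` and the two helper function names are assumptions made for illustration; the field names and values are taken from the example above.

```python
import csv

# Field/value pairs copied from the example experiment above.
EXPERIMENT_METADATA = [
    ("Researcher", "Jane Doe"),
    ("Project", "Analysis of the ubiquitin-mediated degradation of S. cerevisiae SIC1p"),
    ("Experiment", "Western blot to verify expression of GAL-SIC1p-HA fragments "
                   "following galactose induction"),
    ("Date", "March 15, 1997"),
    ("Induction Time", "3 hours"),
    ("Induction Temperature", "30°C"),
    ("Background Strain", "EY957"),
    ("Background Strain Source", "John Smith"),
    ("Antibody", "3F10"),
    ("Antibody Source", "Boehringer-Mannheim"),
    ("Detector", "Molecular Dynamics Storm 860 phosphorimager"),
]


def write_metadata_csv(path, rows):
    """Write (field, value) pairs to a two-column CSV file."""
    # newline="" lets the csv module control line endings, as its docs recommend.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Field", "Value"])  # assumed header names
        writer.writerows(rows)


def read_metadata_csv(path):
    """Read the two-column CSV back into a dict of field -> value."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row["Field"]: row["Value"] for row in csv.DictReader(f)}
```

Note that the date value contains a comma; the `csv` writer quotes it automatically, which is one reason to use a CSV library rather than joining strings by hand.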

Case study

This case study uses weather station data collected at Hopkins Marine Station. Two weather stations, like the one shown in the photo below, record wind, temperature, and precipitation values every ten minutes, 365 days a year. These data are stored in tabular files with about 20 columns.

Hopkins weather station

The Marine Life Observatory formerly provided a brief, one-paragraph description of this project on its website to help readers understand its purpose and scope. This description served as a basic form of metadata about the project, but it left out many critical details.

By taking these descriptions a step further, researchers at Hopkins have created a more formalized set of metadata, shown below, for the project. Sixteen specific fields provide more details about the project than were included in the web page. This project metadata form can be used as a template to describe other projects as well, since all of these fields are fairly general in nature.

Hopkins weather station project metadata

For the data files themselves to be usable by other researchers, or even by those who collected the data when they return to it later, the contents of each column in the data file need an exact description. The document shown below contains file metadata describing each column, including the frequency of measurements and the units of measure.

File metadata
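Column-level file metadata of this kind can also be kept in a machine-readable form and checked against a data file's header. The sketch below is hypothetical: the column names, descriptions, and units are invented for illustration and are not taken from the Hopkins files, which have about 20 columns.

```python
# Hypothetical data dictionary: one entry per column in the data file.
COLUMN_METADATA = [
    {"column": "timestamp", "description": "Date and time of the reading",
     "units": "local time", "frequency": "every 10 minutes"},
    {"column": "wind_speed", "description": "Average wind speed",
     "units": "m/s", "frequency": "every 10 minutes"},
    {"column": "air_temp", "description": "Air temperature",
     "units": "degrees C", "frequency": "every 10 minutes"},
    {"column": "precip", "description": "Precipitation since previous reading",
     "units": "mm", "frequency": "every 10 minutes"},
]


def undocumented_columns(header, column_metadata):
    """Return columns that appear in the data file but have no description."""
    documented = {entry["column"] for entry in column_metadata}
    return [name for name in header if name not in documented]


def missing_columns(header, column_metadata):
    """Return columns that are described but absent from the data file."""
    present = set(header)
    return [entry["column"] for entry in column_metadata
            if entry["column"] not in present]
```

Passing the first row of a data file as `header` shows which columns still need a description and which descriptions no longer match any column, so the file metadata does not drift out of step with the data.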

You can assemble all of these forms of metadata yourself. They will make your data much more understandable to you in coming years, to members of your research group who want to see what you did, and to other researchers who want to reference or reuse your data. None of the methods discussed requires a metadata schema or standard, but any of them could easily incorporate controlled vocabularies or ontologies.