In all likelihood, you are already capturing basic metadata about your research. Your lab notebooks and research files hold much, if not all, of this information, such as:
The key is to collect all the necessary information (metadata) as you work and then link that metadata to the data files themselves.
If you are the only person using these data, the metadata may not need to be highly structured in order to be useful. However, the metadata should be fairly complete. This will help you when you refer back to these files later. It will also make it easier to restructure your metadata into a formalized standard in the future.
Consider one or more of these methods for tracking metadata and data files:
You may not need your metadata to be very structured in order to understand the contents of your files right now. However, including as much structure as you can may help you better or more quickly understand the data in the future. It will also help others understand your data without requiring assistance or explanations directly from you.
You can add structure to your metadata by creating fields that meet your immediate needs and storing the metadata in a text file. Below is one example of some basic structured information about a particular experiment. This tabular structure could easily be stored as a .csv file.
Researcher | Jane Doe |
Project | Analysis of the ubiquitin-mediated degradation of S. cerevisiae SIC1p |
Experiment | Western blot to verify expression of GAL-SIC1p-HA fragments following galactose induction |
Date | March 15, 1997 |
Induction Time | 3 hours |
Induction Temperature | 30°C |
Background Strain | EY957 |
Background Strain Source | John Smith |
Antibody | 3F10 |
Antibody Source | Boehringer-Mannheim |
Detector | Molecular Dynamics Storm 860 phosphorimager |
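As a minimal sketch of storing the table above as a .csv file, the Python below writes the field/value pairs to CSV text and reads them back. The column headers `field` and `value` and the function names are assumptions made for illustration; the values are copied from the example table.

```python
import csv
import io

# Field/value pairs copied from the example table above.
experiment_metadata = [
    ("Researcher", "Jane Doe"),
    ("Project", "Analysis of the ubiquitin-mediated degradation of S. cerevisiae SIC1p"),
    ("Experiment", "Western blot to verify expression of GAL-SIC1p-HA fragments following galactose induction"),
    ("Date", "March 15, 1997"),
    ("Induction Time", "3 hours"),
    ("Induction Temperature", "30°C"),
    ("Background Strain", "EY957"),
    ("Background Strain Source", "John Smith"),
    ("Antibody", "3F10"),
    ("Antibody Source", "Boehringer-Mannheim"),
    ("Detector", "Molecular Dynamics Storm 860 phosphorimager"),
]


def metadata_to_csv(rows):
    """Serialize (field, value) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    # "field" and "value" are assumed header names, not part of the original table.
    writer.writerow(["field", "value"])
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_metadata(text):
    """Parse CSV text produced by metadata_to_csv back into a dict."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["field"]: row["value"] for row in reader}


csv_text = metadata_to_csv(experiment_metadata)
restored = csv_to_metadata(csv_text)
print(restored["Background Strain"])
```

Using the `csv` module rather than joining strings by hand matters here because the date value contains a comma; the writer quotes that field so it survives the round trip. To save to disk instead of an in-memory buffer, the same writer can be given a file opened with `newline=""` and an explicit encoding such as UTF-8.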
This case study uses a set of weather station data collected at Hopkins Marine Station as an example. Two weather stations, like the one shown in the photo below, record wind, temperature, and precipitation values every ten minutes, 365 days a year. These data are stored in tabular files with 20 or so columns.
The Marine Life Observatory's website used to provide a brief paragraph describing this project to help readers understand its purpose and scope. This description served as a basic form of metadata about the project, but did not include many critical details.
By taking these descriptions a step further, researchers at Hopkins have created a more formalized set of metadata, shown below, for the project. Sixteen specific fields provide more details about the project than were included in the web page. This project metadata form can be used as a template to describe other projects as well, since all of these fields are fairly general in nature.
But for the data files themselves to be usable by other researchers, or even by those who collected the data when they return to it in the future, the contents of each column in the data file must be described exactly. The document shown below contains file metadata describing the contents of each column, including the frequency of measurements and the units of measure.
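One way to keep column-level metadata of this kind usable is to store it in a machine-readable form and check data files against it. The sketch below is only illustrative: the column names, descriptions, and units are hypothetical and are not taken from the Hopkins file metadata. Only the ten-minute measurement interval comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSpec:
    """Description of one column in a tabular data file."""
    name: str
    description: str
    units: str


# From the text: values are recorded every ten minutes.
MEASUREMENT_INTERVAL_MINUTES = 10

# Hypothetical columns and units, invented for illustration only.
COLUMN_SPECS = [
    ColumnSpec("timestamp", "Time the reading was recorded", "ISO 8601 date-time"),
    ColumnSpec("wind_speed", "Wind speed", "m/s"),
    ColumnSpec("air_temperature", "Air temperature", "degrees C"),
    ColumnSpec("precipitation", "Precipitation since the previous reading", "mm"),
]


def compare_header(header, specs):
    """Report columns that lack a description and described columns that are absent."""
    documented = {spec.name for spec in specs}
    present = set(header)
    return {
        "undocumented": sorted(present - documented),
        "missing": sorted(documented - present),
    }


# A made-up file header with one undocumented column and one missing column.
sample_header = ["timestamp", "wind_speed", "air_temperature", "humidity"]
report = compare_header(sample_header, COLUMN_SPECS)
print(report)
```

A check like this flags columns that were added to a data file without a matching description, which is the situation that makes a file hard to reuse later.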
You can assemble all of these forms of metadata yourself. They will make your data much more understandable to you in coming years, to members of your research group who want to see what you did, and to other researchers who want to reference or reuse your data. None of the methods discussed requires a metadata schema or standard, though any of them could easily incorporate controlled vocabularies or ontologies.