To submit your research to a data repository, you may be required to format your metadata according to a metadata standard. Consult the repository you will be using to determine its metadata requirements.
Metadata structures are often referred to as "schemas." A schema defines a set of elements for describing the data. The completed metadata are often recorded in a machine-readable format such as JSON or XML.
If you are not using a standard metadata schema whose details are widely known and easily accessible to other researchers, be sure that you preserve the schema itself and its documentation, along with the data and metadata. By doing so, you will help ensure that you and others are able to fully understand and reuse your data in the future.
|Element|Definition|
|Contributor|An entity responsible for making contributions to the resource.|
|Coverage|The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant.|
|Creator|An entity primarily responsible for making the resource.|
|Date|A point or period of time associated with an event in the life cycle of the resource.|
|Description|An account of the resource.|
|Format|The file format, physical medium, or dimensions of the resource.|
|Identifier|An unambiguous reference to the resource within a given context.|
|Language|A language of the resource.|
|Publisher|An entity responsible for making the resource available.|
|Relation|A related resource.|
|Rights|Information about rights held in and over the resource.|
|Source|A related resource from which the described resource is derived.|
|Subject|The topic of the resource.|
|Title|A name given to the resource.|
|Type|The nature or genre of the resource.|
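To make the idea of machine-readable metadata concrete, here is a minimal Python sketch that writes a handful of the elements listed above as both JSON and XML. The record values and the `metadata` wrapper element are invented for illustration; the namespace URI is the one published for the Dublin Core Metadata Element Set, version 1.1. A real repository will have its own required layout, so treat this as a sketch rather than a submission format.

```python
import json
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# A made-up record for illustration; every value here is hypothetical.
record = {
    "title": "Stream temperature measurements, 2019",
    "creator": "Example, Researcher",
    "date": "2019-08-15",
    "format": "text/csv",
    "language": "en",
    "subject": "Water temperature",
}


def to_json(rec):
    """Serialize the record as a JSON object, one key per element."""
    return json.dumps(rec, indent=2, sort_keys=True)


def to_xml(rec):
    """Serialize the record as XML, one namespaced child per element."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in rec.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


json_text = to_json(record)
xml_text = to_xml(record)
print(json_text)
print(xml_text)
```

Both outputs carry the same information; which serialization you need depends on what the repository accepts.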
The following are several well-known and frequently used metadata standards.
Ontologies are shared vocabularies that are used to describe components of a particular discipline and the relationships among these components. By using ontologies, you make it easier for others (or even the future you) to understand your data. Controlled vocabularies, on the other hand, are merely lists of predefined, authorized terms.
In addition to using a metadata standard, you may wish (or be required) to use ontologies or controlled vocabularies to create your metadata. For example, if you use Dublin Core as your metadata schema, it is recommended that you use the list of Internet Media Types (MIME types), a controlled vocabulary, to fill in the "Format" element. It is also recommended that you use a controlled vocabulary for the subject terms, but it is up to you to choose which vocabulary to use.
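As a rough illustration of what using a controlled vocabulary means in practice, the Python sketch below checks a record's "Format" value against a list of authorized terms. The `ALLOWED_FORMATS` set is a small hand-picked subset of media types chosen for this example, and the `check_format` helper and record layout are hypothetical; a real workflow would check against the full vocabulary your repository specifies.

```python
# A tiny, hand-picked subset of Internet Media Types (assumption: a real
# check would use the complete list required by the repository).
ALLOWED_FORMATS = {
    "text/csv",
    "text/plain",
    "application/json",
    "application/pdf",
    "image/png",
}


def check_format(record):
    """Return a list of problems found in the record's 'format' value."""
    value = record.get("format")
    if value is None:
        return ["format is missing"]
    if value not in ALLOWED_FORMATS:
        return [f"format {value!r} is not in the controlled vocabulary"]
    return []


# Free-text entries such as "CSV file" are rejected; authorized terms pass.
print(check_format({"format": "text/csv"}))
print(check_format({"format": "CSV file"}))
```

Because every record uses the same authorized term for the same thing, values can later be compared and searched reliably.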
Here are some examples of ontologies and controlled vocabularies currently in use in a variety of disciplines:
Domain-specific repositories, such as the Protein Data Bank (PDB), often require the submission of highly structured metadata along with data files. This is what enables users to perform specialized searches within these data repositories. For example, in PDB you can search for all the ligases from mice that were determined by X-ray crystallography at a resolution of 2.5 Angstroms or better. If everyone submitted data in whatever format they wanted, this kind of searching would not be possible.
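The Python sketch below shows why structured metadata makes this kind of search possible. It does not use the PDB's actual search interface or data model: the `Entry` fields, the entry identifiers, and the sample values are all invented for illustration. The point is only that a query like the one described above becomes a simple filter once every record stores the same fields in the same format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A toy metadata record; field names are hypothetical."""
    entry_id: str
    enzyme_class: str
    organism: str
    method: str
    resolution: float  # in angstroms; a smaller number is better


# Invented sample records, not real repository entries.
ENTRIES = [
    Entry("ENTRY-A", "ligase", "Mus musculus", "X-ray diffraction", 2.1),
    Entry("ENTRY-B", "ligase", "Mus musculus", "X-ray diffraction", 3.0),
    Entry("ENTRY-C", "ligase", "Homo sapiens", "X-ray diffraction", 1.8),
    Entry("ENTRY-D", "ligase", "Mus musculus", "Solution NMR", 2.0),
    Entry("ENTRY-E", "hydrolase", "Mus musculus", "X-ray diffraction", 1.5),
]


def search(entries, enzyme_class, organism, method, max_resolution):
    """Return the IDs of entries matching every criterion."""
    return [
        e.entry_id
        for e in entries
        if e.enzyme_class == enzyme_class
        and e.organism == organism
        and e.method == method
        and e.resolution <= max_resolution
    ]


# Ligases from mice, determined by X-ray crystallography at 2.5 angstroms or better.
hits = search(ENTRIES, "ligase", "Mus musculus", "X-ray diffraction", 2.5)
print(hits)
```

If depositors had typed "mouse", "mice", and "M. musculus" interchangeably, or recorded resolution as free text, the same filter would silently miss matching entries.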
The image below shows a very small part of the metadata file for the crystal structure shown above. Some of these metadata files contain over 20,000 lines, many of which contain structure information generated during the experimental data capture. You can see that the metadata file includes specific categories that are filled in with specific data in defined formats.