Harvesting DataCite DOI Metadata
The DataCite Metadata Store is the collaborative effort of hundreds of organisations across the world. Metadata is created by DataCite Members using DataCite's Metadata Schema. This guide is for anyone who wants to harvest metadata from the DataCite Metadata Store.
What do we mean by “harvesting”?
Harvesting metadata refers to the automated process of collecting and aggregating metadata records from the DataCite Metadata Store. This can be done using protocols such as OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) or APIs.
This process allows external services, research institutions, and data discovery platforms to access structured information about research outputs, including datasets, software, and many other resource types. By harvesting metadata, organisations can integrate DataCite metadata into global research infrastructures.
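As an illustration of what an automated harvest looks like at the protocol level, here is a minimal Python sketch of an OAI-PMH `ListRecords` loop. The `ListRecords` verb, the `metadataPrefix` argument, and the `resumptionToken` mechanism come from the OAI-PMH 2.0 specification. The base URL `https://oai.datacite.org/oai` and the `oai_dc` default prefix are assumptions to confirm against the DataCite OAI-PMH documentation. The function names are hypothetical.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: base URL of the DataCite OAI-PMH service; verify before use.
OAI_BASE_URL = "https://oai.datacite.org/oai"
# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_page(xml_data):
    """Return (record identifiers, next resumption token or None) for one page."""
    root = ET.fromstring(xml_data)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty resumptionToken element marks the last page.
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return identifiers, token or None


def harvest_identifiers(base_url=OAI_BASE_URL, metadata_prefix="oai_dc", max_pages=2):
    """Yield record identifiers, following resumption tokens (needs network access)."""
    token = None
    for _ in range(max_pages):
        url = list_records_url(base_url, metadata_prefix, token)
        with urllib.request.urlopen(url, timeout=60) as response:
            identifiers, token = parse_page(response.read())
        yield from identifiers
        if token is None:
            break
```

The `max_pages` cap keeps a trial run small. A real harvest would remove the cap, store each page as it arrives, and pause between requests.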
Who can harvest DataCite metadata?
All DataCite metadata is openly available under a CC0 license, meaning anyone can use it. The DataCite Metadata Store includes all DOIs and deposited metadata in our database. To the extent possible under law, DataCite e.V. has waived all copyright and related or neighboring rights to this metadata. However, harvesters should be familiar with the DataCite Data File Use Policy.
How can I harvest DataCite metadata?
Below are the main DataCite services for harvesting DataCite DOI metadata:
REST API: The DataCite REST API is DataCite's primary API and enables retrieval, creation, and update of DataCite DOI metadata records and account information. REST API requests accept filters and parameters that limit the results to specific criteria.
OAI-PMH: This DataCite service exposes metadata stored in the DataCite Metadata Store (MDS) using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
Public Data File: The Public Data File contains metadata for all publicly available DataCite DOIs. An updated version of the Public Data File is released annually. Each annual file contains all DOIs that were registered up to the end of the year.
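For the REST API option, a filtered harvest could be sketched in Python as follows. This is only a sketch under stated assumptions. The endpoint `https://api.datacite.org/dois`, the `page[cursor]` and `page[size]` parameters, the `query` filter, and the response shape (a `data` list whose items carry an `id`, plus a `links.next` URL for the following page) should all be checked against the REST API reference. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public DOI listing endpoint of the DataCite REST API.
DOIS_ENDPOINT = "https://api.datacite.org/dois"


def first_page_url(endpoint=DOIS_ENDPOINT, page_size=100, filters=None):
    """Build the URL of the first page of a cursor-paginated DOI listing."""
    # Assumption: "page[cursor]=1" starts cursor-based pagination.
    params = {"page[cursor]": "1", "page[size]": str(page_size)}
    params.update(filters or {})
    return endpoint + "?" + urllib.parse.urlencode(params)


def parse_page(payload):
    """Return (DOI names, URL of the next page or None) for one decoded page."""
    dois = [item["id"] for item in payload.get("data", [])]
    next_url = payload.get("links", {}).get("next")
    return dois, next_url


def harvest_dois(start_url, max_pages=2):
    """Yield DOI names page by page, following links.next (needs network access)."""
    url = start_url
    for _ in range(max_pages):
        if not url:
            break
        with urllib.request.urlopen(url, timeout=60) as response:
            payload = json.load(response)
        dois, url = parse_page(payload)
        yield from dois
```

A small trial run might look like `list(harvest_dois(first_page_url(page_size=10, filters={"query": "climate"}), max_pages=1))`. Following the server-provided next link, rather than computing page numbers, keeps the loop independent of how the cursor is encoded.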
Related resources:
How can I detect removed or retracted records with the REST API?
How can I harvest metadata in XML format?
How can I make my DOI list query more efficient when using the REST API?
Keep up to date with the latest information about harvesting metadata by joining the Harvesters Interest Group.
Please get in touch if you have any questions at [email protected]