DataCite Event Data
Event Data is a joint service by Crossref and DataCite to collect and expose links to Crossref and DataCite DOIs. For DataCite DOIs, Event Data provides links to other DOIs from DataCite, Crossref, or other DOI registration agencies, as well as usage statistics sent to DataCite as usage reports.
The DataCite REST API provides a query API for the Event Data service, and allows users to retrieve events filtered by DOI or DOI prefix, source of the event, relation type of the event, and/or year and month the event occurred. See section Query Filters for details.
DataCite services that contain citation data rely on an external service, Crossref Event Data. Because of this dependency, citation data is not available in the DataCite test environments: doi.test.datacite.org (Fabrica test) and api.test.datacite.org (API test).
Linking Events
Linking events are relations between two DOIs, or a DOI and a URL. For DataCite DOIs these are described in DataCite metadata using the <relatedIdentifier> property, and there is a controlled list of relation types that can be used (see section relation-type-id for details). Linking events describe a large number of relations, including:
- citations
- versioning
- granularity (is part of / has part)
There is no single relation type to describe citations, and relation types relevant for citations are sometimes used differently across organizations. The most commonly used relation types for citations are references, documents, cites, and is-supplemented-by.
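Because several relation types can indicate a citation, a client typically checks against a set of values rather than a single one. The following is a minimal sketch, assuming each event is available as a dict with a relation-type-id key; that record shape and the helper name is_citation are illustrative assumptions, not part of the API description above.

```python
# Relation types the text names as most commonly used for citations.
CITATION_RELATION_TYPES = {"references", "documents", "cites", "is-supplemented-by"}


def is_citation(event):
    """Return True if the event's relation type is one commonly used for citations.

    Assumes `event` is a dict with a "relation-type-id" key (hypothetical shape).
    """
    return event.get("relation-type-id") in CITATION_RELATION_TYPES


# Made-up sample events, for illustration only.
events = [
    {"relation-type-id": "cites"},
    {"relation-type-id": "has-part"},
    {"relation-type-id": "is-supplemented-by"},
]
citations = [e for e in events if is_citation(e)]
```

Since organizations use these relation types differently, the set may need adjusting for a particular analysis.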
Usage Events
Usage events (views and downloads of the content associated with a DOI) are provided by the datacite-usage source, and are generated from usage reports in the standard SUSHI format sent to DataCite. Usage reports for datasets are generated using the Code of Practice for Research Data Usage Metrics and the SUSHI specification for research data usage metrics.
The usage reports summarize all usage events for a given month, and break them down into three categories:
- total vs. unique: for unique usage events, accesses are counted only once per content item within a unique user session.
- access method: tracks content usage by machines. The access method can be regular or machine.
- metric type: activities where content was retrieved (requests) or information about content (e.g. metadata) was examined (investigations).
With these three categories there are 8 (2 x 2 x 2) relation types for usage events (see relation-type-id below).
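The 2 x 2 x 2 combination can be enumerated directly. This sketch assumes the naming pattern seen in the relation-type-id list (count, then dataset, then metric type, then access method); the variable names are illustrative.

```python
from itertools import product

counts = ("total", "unique")
metric_types = ("investigations", "requests")
access_methods = ("regular", "machine")

# One relation type per combination of the three categories.
usage_relation_types = [
    f"{count}-dataset-{metric}-{access}"
    for count, metric, access in product(counts, metric_types, access_methods)
]
```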
Query Filters
The following filters are available in the Event Data Query API:
query
Query for any event information.
subj-id
The identifier for the event subject, expressed as URL. For example https://doi.org/10.7272/q6qn64nk.
obj-id
The identifier for the event object, expressed as URL. For example https://doi.org/10.7272/q6qn64nk.
doi
The subj-id or obj-id of the event, expressed as DOI. For example 10.7272/q6qn64nk.
prefix
The DOI prefix of the subj-id or obj-id of the event. For example 10.7272.
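The subj-id, obj-id, doi, and prefix filters take the same identifier in different forms. A small sketch of converting between them, assuming DOIs are expressed as URLs under https://doi.org/ as in the examples above (the helper names are hypothetical):

```python
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Express a DOI as a URL, the form used by subj-id and obj-id."""
    return DOI_RESOLVER + doi


def url_to_doi(url):
    """Strip the resolver from a DOI URL, giving the form used by the doi filter."""
    if not url.startswith(DOI_RESOLVER):
        raise ValueError(f"not a DOI URL: {url}")
    return url[len(DOI_RESOLVER):]


def doi_prefix(doi):
    """Return the part of the DOI before the first slash, as used by the prefix filter."""
    return doi.split("/", 1)[0]
```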
year-month
The year and month in which the event occurred, in the format YYYY-MM. For example 2018-08.
source-id
source-id | description | provided by |
---|---|---|
datacite-usage | Usage Events for DataCite DOIs | Usage Reports submitted to DataCite |
datacite-related | DataCite DOI as related identifier from DataCite metadata | DataCite |
datacite-crossref | Crossref DOI as related identifier in DataCite metadata | DataCite |
datacite-kisti | KISTI DOI as related identifier in DataCite metadata | DataCite |
datacite-op | OP DOI as related identifier in DataCite metadata | DataCite |
datacite-medra | mEDRA DOI as related identifier in DataCite metadata | DataCite |
datacite-istic | ISTIC DOI as related identifier in DataCite metadata | DataCite |
datacite-funder | Crossref Funder ID as funder identifier in DataCite metadata | DataCite |
datacite-url | URL as related identifier in DataCite metadata | DataCite |
crossref | DataCite DOI in Crossref Metadata | Crossref |
relation-type-id
datacite-usage events use one of these relation types:
- total-dataset-investigations-regular
- unique-dataset-investigations-regular
- total-dataset-requests-regular
- unique-dataset-requests-regular
- total-dataset-investigations-machine
- unique-dataset-investigations-machine
- total-dataset-requests-machine
- unique-dataset-requests-machine
For events generated from DataCite metadata, the relationType from the DataCite Metadata Schema is used:
- is-cited-by
- cites
- is-supplement-to
- is-supplemented-by
- is-continued-by
- continues
- describes
- is-described-by
- has-metadata
- is-metadata-for
- has-version
- is-version-of
- is-new-version-of
- is-previous-version-of
- is-part-of
- has-part
- is-referenced-by
- references
- is-documented-by
- documents
- is-compiled-by
- compiles
- is-variant-form-of
- is-original-form-of
- is-identical-to
- is-reviewed-by
- reviews
- is-derived-from
- is-source-of
- is-required-by
- requires
Events from the datacite-funder source use the relation type is-funded-by, and events from the crossref source use the relation type references.
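The filters above are combined as query parameters on a single request. The sketch below builds such a URL. The endpoint https://api.datacite.org/events is an assumption (this section does not state the query URL), and build_events_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint; the text above does not give the Event Data query URL.
EVENTS_URL = "https://api.datacite.org/events"

# Filter names listed in the Query Filters section.
ALLOWED_FILTERS = {
    "query", "subj-id", "obj-id", "doi", "prefix",
    "year-month", "source-id", "relation-type-id",
}


def build_events_url(filters, base_url=EVENTS_URL):
    """Return a query URL for the given filters, rejecting unknown filter names."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unknown filter(s): {sorted(unknown)}")
    return f"{base_url}?{urlencode(filters)}"


# Usage events for one prefix in August 2018.
url = build_events_url({
    "prefix": "10.7272",
    "source-id": "datacite-usage",
    "year-month": "2018-08",
})
```

Filters are passed as a dict because the names contain hyphens and so cannot be Python keyword arguments.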
Pagination
The DataCite Event Data Query API returns 1,000 events per query by default. This number can be adjusted with the page[size] query parameter, and must be between 0 and 1,000. For page[size]=0, only the meta object is returned.
To paginate through up to 10,000 results, the page[number] query parameter can be used.
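The limits on page[size] and page[number] can be checked on the client before a request is sent. This is a sketch under two assumptions not stated above: pages are numbered starting at 1, and the 10,000-result limit applies to page number multiplied by page size. The helper name page_params is hypothetical.

```python
MAX_PAGE_SIZE = 1000       # upper bound for page[size]
MAX_PAGED_RESULTS = 10000  # results reachable with page[number]


def page_params(number=1, size=MAX_PAGE_SIZE):
    """Return page query parameters, or raise ValueError if they exceed the limits."""
    if not 0 <= size <= MAX_PAGE_SIZE:
        raise ValueError("page[size] must be between 0 and 1,000")
    if number < 1:
        raise ValueError("page[number] is assumed to start at 1")
    if number * size > MAX_PAGED_RESULTS:
        raise ValueError("beyond 10,000 results; use cursor-based pagination")
    return {"page[number]": number, "page[size]": size}
```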
For query results with more than 1,000 events, e.g. to harvest all events from a particular source or for a particular prefix, cursor-based pagination should be used. The DataCite Event Data Query API returns a links JSON object; use the URL in its next property to fetch the next page. This URL includes a page[cursor] query parameter. With cursor-based pagination all events are retrieved in chronological order (using the timestamp property), oldest events first. The cursor used is the UNIX epoch time, i.e. the number of seconds passed since 1 January 1970.
Sorting
By default all events are sorted in ascending chronological order (using the last updated timestamp). Other sort criteria are:
- relevance: the relevance score of the query
- obj-id: the obj-id of each event
- total: the total count of each event. This is greater than 1 only for usage events.
- created: using the timestamp when the event was created in the DataCite Event Data Query API
With the exception of relevance, events can be sorted in descending order by prefixing the sort parameter with a minus sign, e.g. -total. When using a sort parameter, only the first 10,000 events can be retrieved, as pagination based on page number is used.
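The sorting rules can be captured in a small helper that builds the sort value. This is an illustrative sketch; sort_param is a hypothetical name, and only the sort criteria listed above are accepted.

```python
SORT_FIELDS = {"relevance", "obj-id", "total", "created"}


def sort_param(field, descending=False):
    """Return the sort value, prefixed with a minus sign for descending order."""
    if field not in SORT_FIELDS:
        raise ValueError(f"unknown sort field: {field}")
    if descending and field == "relevance":
        raise ValueError("relevance cannot be sorted in descending order")
    return f"-{field}" if descending else field
```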
Statistics
The DataCite Event Data Query API returns statistics in a meta JSON object, with the following properties:
- total: the total number of events found for this query (the API only shows 1,000 events at a time)
- total-pages: the number of API calls needed to return all results (fetching 1,000 events at a time)
- sources: the sources for the events found in this query, and the number of events per source
- prefixes: the DOI prefixes for the events found in this query, and the number of events per prefix, for up to 50 prefixes
- relation types: the relation types for the events found in this query, and the number of events per relation type, further broken down by year and month
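As a worked example of how total and total-pages relate, the number of calls needed is the total divided by the page size, rounded up. This sketch assumes total-pages follows that plain arithmetic; the API may cap or compute the value differently, and expected_total_pages is a hypothetical helper.

```python
import math


def expected_total_pages(total, page_size=1000):
    """Number of calls needed to fetch `total` events, `page_size` at a time."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return math.ceil(total / page_size)
```

For instance, 2,500 events fetched 1,000 at a time need three calls.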