DataCite Support

DataCite GraphQL API Guide

Introduction

The DataCite GraphQL API supports queries of the DataCite API using the GraphQL query language. The pre-release version of the API was launched in May 2019, with an official release of the DataCite GraphQL API expected before the end of the year. The API endpoint is https://api.datacite.org/graphql.

What is GraphQL?

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.

Source: GraphQL community website

GraphQL is not REST, so features common to many REST APIs are not supported, in particular HTTP verbs and resource paths. All GraphQL API calls are POST requests to https://api.datacite.org/graphql, and the query is defined in the body of the POST request.
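As a minimal sketch of what such a call looks like, the Python below wraps a query in a POST request using only the standard library. It assumes the JSON body carries the query under a "query" key, which is the convention most GraphQL servers accept; the helper name build_graphql_request is hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request

GRAPHQL_ENDPOINT = "https://api.datacite.org/graphql"  # endpoint named in this guide


def build_graphql_request(query: str) -> urllib.request.Request:
    """Wrap a GraphQL query string in a POST request (nothing is sent here)."""
    # Assumption: the server reads the query from a JSON body with a "query" key.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(
    '{ dataset(id: "https://doi.org/10.7910/dvn/nfzli3/cynkam") { publisher } }'
)
# To send it, pass the request to urllib.request.urlopen() and decode the JSON reply.
```

Whatever the query asks for, the URL and the HTTP verb stay the same; only the body changes.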

The GraphQL API is an alternative to the DataCite REST API. It currently supports only queries, not mutations (create or update actions, e.g. registering a new DOI).

DataCite GraphQL API Pre-Release

The GraphQL API launched in May 2019 is a pre-release version. This means that the functionality will change in the coming months, including support for additional fields, performance improvements, and bug fixes. Please post a message in the PID Graph category of the PID Forum if you want to report a bug or suggest a feature.

The PID Graph

DataCite DOIs describe resources such as datasets, samples, software and publications with rich metadata. An important part of this metadata is the description of connections between resources that use persistent identifiers (PIDs) provided by DataCite and others (Crossref, ORCID, ROR, ISNI, IGSN, etc.). Together these resources and their connections form a graph, the PID Graph.

Using GraphQL to query the PID Graph

The REST APIs that most PID service providers, including DataCite, use to expose metadata about PIDs are a good fit to describe a single resource, e.g. a dataset, and show the connections to other resources (e.g. the authors of a dataset) by including the PIDs for those linked resources. REST APIs are not a good fit for complex queries of the PID Graph, and GraphQL is the better fit for these kinds of queries. GraphQL has these important features:

  • Queries specify the fields and connections to include in the result, including nested connections that traverse the PID Graph
  • Queries can reach external resources, e.g. information from other PID providers
  • A schema describes and enforces the queries that are possible
  • A rich set of developer tools and supporting libraries is available

GraphQL clients

Because the GraphQL query interfaces are standardized and described in a schema, any GraphQL client application can automatically work with any GraphQL API. Clients also offer built-in documentation and auto-complete functionality when constructing queries.

GraphiQL is a popular GraphQL client, available as a library to include in other applications or as a desktop application. Web-hosted GraphQL clients are also available.

GraphQL uses a special query language that resembles JSON. For example:

{
  funder(id: "https://doi.org/10.13039/501100000780") {
    name
    alternateName
    datasets(first: 10, after: "Mg") {
      edges {
        relationType
        source
        cursor
        node {
          id
          titles {
            title
          }
          relatedIdentifiers {
            relatedIdentifier
            relationType
          }
          fundingReferences {
            awardTitle
            awardNumber
          }
        }
      }
    }
  }
}

The built-in documentation in a GraphQL client shows the available fields, which helps when constructing a query, and the client validates the input against the schema. For this reason, we aren't listing the available fields in the support documentation.

The API response is normal JSON, following exactly the structure of the query.
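To illustrate how the response mirrors the query, here is a hedged Python sketch for a query that asks only for a funder's name. It relies on the GraphQL convention that results arrive under a top-level "data" key and failures under "errors". The helper name extract_data is hypothetical, and the sample payload is invented placeholder data; only its shape matters.

```python
import json


def extract_data(payload: dict) -> dict:
    """Return the "data" member of a GraphQL response, raising if errors were reported."""
    if payload.get("errors"):
        messages = "; ".join(
            error.get("message", "unknown error") for error in payload["errors"]
        )
        raise RuntimeError(f"GraphQL query failed: {messages}")
    return payload["data"]


# Placeholder response for a query of the form { funder(id: "...") { name } }.
# The value is invented for illustration and is not real API output.
sample = json.loads('{"data": {"funder": {"name": "Placeholder Funder"}}}')

# The path into the response follows the nesting of the query: funder, then name.
funder = extract_data(sample)["funder"]
print(funder["name"])
```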

Resources available in the DataCite GraphQL API

You can query the GraphQL API for the following resources:

  • Providers
  • Clients
  • Prefixes
  • DOIs
  • Researchers (using the ORCID API)
  • Funders (using the Crossref Funder ID API)
  • Organizations (using the ROR API)

When querying DOIs you must specify the resourceTypeGeneral in the request, rather than querying all DOIs at large. For example, the query below specifies that it is looking for a dataset:

{
  dataset(id: "https://doi.org/10.7910/dvn/nfzli3/cynkam") {
    titles {
      title
    }
    publicationYear
    publisher
  }
}

There is one exception: for a resourceTypeGeneral of text, use publication.

You can either fetch information about a single resource using the PID, or do a query for multiple resources. The dataset query example above is fetching information about a single resource, as is the researcher query example below.

{
  researcher(id: "https://orcid.org/0000-0003-1419-2405") {
    id
    name
  }
}
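The single-resource pattern can be sketched in Python as a small query builder. Both helper names are hypothetical. The mapping from resourceTypeGeneral to a query field is an assumption apart from the two cases this guide confirms (dataset, and publication for text), so check the schema documentation in a GraphQL client before relying on it.

```python
import json


def field_for_resource_type(resource_type_general: str) -> str:
    """Map a resourceTypeGeneral value to the query field used to fetch it."""
    if resource_type_general.lower() == "text":
        return "publication"  # the exception described above
    # Assumption: other types use the value with a lowercase first letter,
    # e.g. "Dataset" -> "dataset". Confirm against the schema documentation.
    return resource_type_general[0].lower() + resource_type_general[1:]


def single_resource_query(field: str, pid: str, subfields: list[str]) -> str:
    """Build a query that fetches one resource by its PID."""
    # json.dumps produces a double-quoted, escaped string, which also works
    # as a GraphQL string literal for ordinary PIDs.
    return f"{{ {field}(id: {json.dumps(pid)}) {{ {' '.join(subfields)} }} }}"


query = single_resource_query(
    field_for_resource_type("Dataset"),
    "https://doi.org/10.7910/dvn/nfzli3/cynkam",
    ["id", "publicationYear", "publisher"],
)
print(query)
```

Quoting the PID with json.dumps rather than plain string concatenation keeps a stray quote character in the identifier from breaking the query.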

The publications query below demonstrates how to do a query for multiple resources. In this case, we are fetching information about all items with a resourceTypeGeneral of text (because we're specifying publications) that contain the word "climate".

{
  publications(query: "climate") {
    totalCount
    nodes {
      id
      titles {
        title
      }
      descriptions {
        description
      }
      creators {
        name
        familyName
      }
      fundingReferences {
        funderIdentifier
        funderName
        awardTitle
        awardNumber
      }
    }
  }
}

Queries for researchers by anything other than id are not yet supported, because the ORCID API used for that query currently returns only the ORCID iD in the query results.

Queries support the totalCount field and return results under the nodes field (see example above). For queries using the Event Data service, results are returned under an edges field, optionally returning the meta information contained in Event Data (e.g. source or relationType), and then the related resources (datasets in the example below) under a node field.

{
  funder(id: "https://doi.org/10.13039/501100000780") {
    name
    alternateName
    datasets(first: 10, after: "Mg") {
      edges {
        relationType
        source
        cursor
        node {
          id
          titles {
            title
          }
          relatedIdentifiers {
            relatedIdentifier
            relationType
          }
          fundingReferences {
            awardTitle
            awardNumber
          }
        }
      }
    }
  }
}
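Because results can arrive either under nodes or under edges, client code may want to handle both shapes uniformly. The following Python is a sketch under stated assumptions: the helper names are hypothetical, the sample data is invented placeholder content shaped like the funder query above, and passing the last cursor back as the after argument to fetch the next page is assumed from common cursor-pagination practice rather than stated in this guide.

```python
from typing import Iterator, Optional


def iter_resources(connection: dict) -> Iterator[tuple[dict, dict]]:
    """Yield (meta, resource) pairs from a result returned as nodes or as edges."""
    if "edges" in connection:
        for edge in connection["edges"]:
            # Everything on the edge except the nested node is link metadata.
            meta = {key: value for key, value in edge.items() if key != "node"}
            yield meta, edge["node"]
    else:
        for node in connection.get("nodes", []):
            yield {}, node


def last_cursor(connection: dict) -> Optional[str]:
    """Return the cursor of the final edge, or None if there are no edges."""
    edges = connection.get("edges") or []
    return edges[-1].get("cursor") if edges else None


# Placeholder data shaped like the funder query above; every value is invented.
datasets = {
    "edges": [
        {
            "relationType": "placeholder-relation",
            "source": "placeholder-source",
            "cursor": "placeholder-cursor",
            "node": {
                "id": "https://doi.org/10.0000/placeholder",
                "titles": [{"title": "Placeholder title"}],
            },
        }
    ]
}

for meta, resource in iter_resources(datasets):
    print(meta["source"], resource["id"])
```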

Developing applications using the DataCite GraphQL API

Using a GraphQL client to explore what queries are supported in the DataCite GraphQL API, as described above, is a good starting point. To then develop an application using the DataCite GraphQL API we recommend picking a GraphQL library for the language you will be using, starting from this list.

Please post a message to the PID Graph category of the PID Forum if you have any questions regarding the DataCite GraphQL API.
