How do I expose my datasets to Google Dataset Search?

Google Dataset Search relies on crawlable structured data that is exposed through schema.org markup and uses the schema.org Dataset class.

While we do our best to enable indexing of DOIs for datasets, DataCite has no control over how, or how quickly, Google indexes "Dataset" items in Google Dataset Search. To ensure your repository's datasets are included, we recommend following the guidelines below to embed schema.org metadata in your landing pages and to make sure that Google can find those landing pages.

For more information on exposing your datasets to Google Dataset Search, see Google's help page on the Dataset content type.

👍

Your repository's datasets should appear in Google Dataset Search if:

  1. Landing pages include schema.org markup and use the Dataset class.
  2. Landing pages are reachable through navigation or through a sitemap file.

Structured Data

For datasets to show up in Google Dataset Search, a repository must include structured data on each landing page by implementing schema.org markup with the Dataset class.
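
As a rough illustration of what such markup can look like, the following Python sketch builds a JSON-LD block for a landing page template. The function name, the choice of properties, and the example values are assumptions made for illustration; consult Google's Dataset guidelines for the full list of required and recommended properties.

```python
import json


def dataset_jsonld_script(name, description, doi_url, landing_url):
    """Return a <script> element carrying schema.org Dataset markup as JSON-LD.

    Hypothetical helper: the property selection here is a minimal assumption,
    not a complete or authoritative Dataset description.
    """
    metadata = {
        "@context": "https://schema.org",
        "@type": "Dataset",  # the class Google Dataset Search looks for
        "name": name,
        "description": description,
        "identifier": doi_url,
        "url": landing_url,
    }
    # Escape "</" so a value in the metadata cannot close the script element early.
    payload = json.dumps(metadata, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

The returned string can be placed in the head of the landing page, for example `dataset_jsonld_script("Example dataset", "A short description.", "https://doi.org/10.1234/example", "https://repository.example.org/datasets/1")`, where the DOI and URL are placeholders.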

📘

DataCite Content Negotiation for schema.org metadata

DataCite Content Negotiation can be used to retrieve schema.org metadata to embed in landing pages.

For example, you can include this JavaScript file, which retrieves schema.org metadata dynamically through DataCite Content Negotiation. Add a <script> tag that references the file to your landing page template. Whenever a landing page is requested, the script appends the corresponding schema.org markup to the page.
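
If you would rather embed the metadata when the page is generated on the server, a sketch along the following lines could retrieve it. It assumes that the DOI resolver at https://doi.org honours the `application/vnd.schemaorg.ld+json` media type for DataCite DOIs; the function names are hypothetical, and error handling is omitted.

```python
import json
import urllib.parse
import urllib.request

# Assumed media type for schema.org JSON-LD in DataCite Content Negotiation.
SCHEMA_ORG_JSONLD = "application/vnd.schemaorg.ld+json"


def build_schemaorg_request(doi):
    """Build a content negotiation request for the schema.org form of a DOI."""
    url = "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": SCHEMA_ORG_JSONLD})


def fetch_schemaorg_metadata(doi, timeout=10):
    """Send the request and return the parsed JSON-LD as a dictionary."""
    request = build_schemaorg_request(doi)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_schemaorg_metadata("10.1234/example")` with a real DOI in place of the placeholder would return a dictionary that can be serialized into a JSON-LD script element on the landing page.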

To confirm whether landing pages contain the appropriate structured data, use Google’s Structured Data Testing Tool.
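
For a quick local check before using Google's tool, a sketch like the one below scans a page's HTML for JSON-LD blocks that declare the Dataset class. The class and function names are illustrative assumptions. It only inspects the HTML it is given, so markup that a script appends in the browser at request time will not be detected this way.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def has_dataset_markup(html):
    """Return True if any JSON-LD block in the HTML has "@type": "Dataset"."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the whole check
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Dataset" in types:
                return True
    return False
```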

🚧

Landing pages must use the Dataset class to be included in Google Dataset Search. When using Content Negotiation to generate schema.org markup, DOIs must:

  1. Have the Findable state (which is what makes them indexable).
  2. Use Dataset as the resourceTypeGeneral in the metadata you have registered with DataCite. Text items, for example, won't appear in Google Dataset Search.
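
The two conditions above can be checked programmatically. The sketch below assumes a DOI record shaped like the JSON returned by the DataCite REST API, with `state` and `types.resourceTypeGeneral` under `data.attributes`; the function name is hypothetical.

```python
def is_eligible_for_dataset_search(record):
    """Check the two DOI conditions: Findable state and Dataset resource type.

    `record` is assumed to be a dictionary shaped like a DataCite REST API
    response for a single DOI.
    """
    attributes = (record.get("data") or {}).get("attributes") or {}
    state = attributes.get("state")
    resource_type = (attributes.get("types") or {}).get("resourceTypeGeneral")
    return state == "findable" and resource_type == "Dataset"
```

A record with `"state": "findable"` and `"resourceTypeGeneral": "Text"` would return False, matching the note that Text items won't appear in Google Dataset Search.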

Sitemaps

We recommend using a sitemap file to help Google find your landing page URLs. Together, sitemap files and sameAs markup help document how dataset descriptions are published throughout your site. For more information, see https://developers.google.com/search/docs/data-types/dataset
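
As a minimal sketch of generating such a file, the following Python function writes a sitemap in the sitemaps.org XML format from a list of landing page URLs. The function name and the example URLs are assumptions; a production sitemap may also need optional elements such as lastmod, and large sites may need to split the list across several files.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(landing_page_urls):
    """Return a sitemap document (UTF-8 bytes) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for landing_page_url in landing_page_urls:
        url_element = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(url_element, "loc").text = landing_page_url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

The result of `build_sitemap(["https://repository.example.org/datasets/1"])` can be written to a file such as sitemap.xml at the root of the site, using binary mode because the function returns bytes.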