Landing pages and Google Dataset Search

Google Dataset Search is a search engine specifically for datasets. It relies on crawlable structured data exposed on landing pages through schema.org markup that uses the Dataset class.

To ensure your repository's datasets are included in Google Dataset Search, we recommend following the guidelines below to embed metadata in your landing pages and to make sure that Google can find those pages.


DataCite has no control over how, or how quickly, Google indexes Dataset items in Google Dataset Search.

For more information on exposing your datasets to Google Dataset Search, see Google's help page on the Dataset content type.


Your repository's datasets should appear in Google Dataset Search if:

  1. Landing pages include schema.org markup that uses the Dataset class.
  2. Landing pages are reachable through navigation or through a sitemap file.
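
As an illustration of the second condition, the sketch below builds a minimal sitemap file that lists landing page URLs. It is only an example under stated assumptions: the function name `build_sitemap` and the `repository.example.org` URLs are hypothetical, and a real repository would normally generate the file from its own catalogue.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(landing_page_urls):
    """Return a sitemap XML string with one <url><loc> entry per landing page.

    Hypothetical helper for illustration; save the result as UTF-8.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in landing_page_urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs, not real landing pages.
    print(build_sitemap([
        "https://repository.example.org/datasets/1",
        "https://repository.example.org/datasets/2",
    ]))
```

The resulting file can then be placed on the repository's site and submitted to Google in the usual way for sitemaps.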

Structured Data

For datasets to show up in Google Dataset Search, a repository must include structured data on each landing page by implementing schema.org markup with the Dataset class.
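
One common way to embed such structured data is a JSON-LD script element in the page's HTML. The following is a minimal sketch, not a complete metadata mapping: the function name, the property selection, and the example DOI and URL are assumptions made for illustration. Google's guidance should be consulted for the properties it requires or recommends.

```python
import json


def dataset_jsonld_script(name, description, doi_url, landing_page_url):
    """Return a <script> element carrying schema.org Dataset markup as JSON-LD.

    Hypothetical helper; the property set is deliberately small.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",  # the class Google Dataset Search looks for
        "@id": doi_url,
        "name": name,
        "description": description,
        "url": landing_page_url,
        "identifier": doi_url,
    }
    # Escape "</" so text in the metadata cannot close the script element early.
    payload = json.dumps(record, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Placeholder values, not a real DOI or landing page.
    print(dataset_jsonld_script(
        "Example dataset",
        "A short description of the example dataset.",
        "https://doi.org/10.1234/example",
        "https://repository.example.org/datasets/1",
    ))
```

The returned string would be placed in the landing page's HTML, typically inside the head element.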

For more information about implementing schema.org markup on repository landing pages, please review the documentation on schema.org for repository landing pages.

To confirm whether landing pages contain the appropriate structured data, use Google’s Structured Data Testing Tool.
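
For a quick local check alongside Google's tool, a script can look for Dataset markup in a page's HTML. The sketch below is a rough approximation only: it inspects JSON-LD script elements, ignores Microdata, RDFa, and `@graph` structures, and does not validate individual properties. The names `JsonLdExtractor` and `has_dataset_markup` are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def has_dataset_markup(html_text):
    """Return True if any top-level JSON-LD item declares @type Dataset."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD instead of failing
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Dataset" in types:
                return True
    return False
```

A passing result here does not guarantee indexing; it only shows that the page declares a Dataset in JSON-LD.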


Landing pages must use the Dataset class to be included in Google Dataset Search. When Content Negotiation is used to generate the markup, the DOIs must:

  1. Have the Findable state (which is what makes them indexable).
  2. Use Dataset as the resourceTypeGeneral in the metadata you have registered with DataCite. Text items, for example, won't appear in Google Dataset Search.
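
The two requirements above can be checked against a DOI's registered metadata. The sketch below rests on assumptions that should be verified against the current DataCite documentation: that a DOI record retrieved from the DataCite REST API exposes `state` and `types.resourceTypeGeneral` under `data.attributes`, and that schema.org JSON-LD can be requested through Content Negotiation with the `application/vnd.schemaorg.ld+json` media type. The function names and the example DOI are hypothetical, and no network request is sent.

```python
import urllib.request

# Assumed media type for schema.org JSON-LD via Content Negotiation.
SCHEMA_ORG_JSONLD = "application/vnd.schemaorg.ld+json"


def schema_org_request(doi):
    """Build (but do not send) a Content Negotiation request for a DOI."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": SCHEMA_ORG_JSONLD},
    )


def eligibility_problems(doi_record):
    """List reasons a DOI record would not qualify under the two rules above.

    `doi_record` is assumed to be the parsed JSON of a single DOI record.
    An empty list means both conditions are met.
    """
    attrs = (doi_record.get("data") or {}).get("attributes") or {}
    problems = []
    if attrs.get("state") != "findable":
        problems.append("DOI is not in the Findable state")
    resource_type = (attrs.get("types") or {}).get("resourceTypeGeneral")
    if resource_type != "Dataset":
        problems.append(f"resourceTypeGeneral is {resource_type!r}, not 'Dataset'")
    return problems


if __name__ == "__main__":
    # Hand-written sample record, not real API output.
    sample = {"data": {"attributes": {
        "state": "findable",
        "types": {"resourceTypeGeneral": "Text"},
    }}}
    print(eligibility_problems(sample))
```

Returning a list of problems, instead of a single boolean, makes it easier to report why a given DOI is not expected to appear.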


Using a sitemap file is recommended to help Google find your URLs. Using sitemap files and sameAs markup helps document how dataset descriptions are published throughout your site. More info: