Landing pages and Google Dataset Search
Google Dataset Search is a search engine specifically for datasets. It relies on crawlable structured data that landing pages expose through schema.org markup using the schema.org Dataset class.
To ensure your repository's datasets are included in Google Dataset Search, we recommend following the guidelines below to embed schema.org metadata in your landing pages and to ensure that Google can find those pages.
DataCite has no control over how, or how quickly, Google indexes "Dataset" items in Google Dataset Search.
For more information on exposing your datasets to Google Dataset Search, see Google's help page on the Dataset content type.
Your repository's datasets should appear in Google Dataset Search if:
- Landing pages include schema.org markup and use the Dataset class.
- Landing pages are reachable through navigation or through a sitemap file.
For datasets to show up in Google Dataset Search, a repository must include structured data on each landing page by implementing schema.org markup with the Dataset class.
For more information about how to implement schema.org markup on repository landing pages, please review the schema.org for repository landing pages documentation.
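As an illustration of the markup described above, the following minimal sketch builds a schema.org Dataset description and wraps it in a JSON-LD script element that could be placed in a landing page. The function names, the DOI, and the URLs are hypothetical placeholders, and the property set is a small subset chosen for the example (name and description are the properties Google lists as required for the Dataset type); it is not the markup DataCite itself generates.

```python
import json


def dataset_jsonld(name, description, doi_url, landing_url):
    """Return a minimal schema.org Dataset description as a dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": doi_url,
        "name": name,
        "description": description,
        "url": landing_url,
        "identifier": doi_url,
    }


def jsonld_script_tag(metadata):
    """Serialize the metadata into a JSON-LD script element."""
    payload = json.dumps(metadata, indent=2)
    # Escape "</" so the JSON cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical values, for illustration only.
example = dataset_jsonld(
    name="Example ocean temperature measurements",
    description="Hypothetical dataset used to illustrate the markup.",
    doi_url="https://doi.org/10.1234/example",
    landing_url="https://repository.example.org/datasets/example",
)
print(jsonld_script_tag(example))
```

The resulting element would typically go in the head of the landing page's HTML.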
To confirm whether landing pages contain the appropriate structured data, use Google’s Structured Data Testing Tool.
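Before submitting pages to an external testing tool, a quick local check can catch pages with no Dataset markup at all. The sketch below uses only the Python standard library and assumes the markup is embedded as JSON-LD in a script element; it does not handle Microdata, RDFa, or `@graph` containers, and it is not a substitute for Google's own validation. The class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def has_dataset_markup(html):
    """Return True if any JSON-LD block declares @type Dataset."""
    parser = JsonLdExtractor()
    parser.feed(html)
    for raw in parser.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # Skip malformed blocks rather than failing the check.
        items = doc if isinstance(doc, list) else [doc]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Dataset" in types:
                return True
    return False
```

A page that passes this check can still fail Google's validation, for example if required properties are missing.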
Landing pages must use the Dataset class to be included in Google Dataset Search. When using Content Negotiation to generate schema.org markup, DOIs must:
- Have the Findable state (which is what makes them indexable).
- Use Dataset as the resourceTypeGeneral in the metadata you have registered with DataCite. Text items, for example, won't appear in Google Dataset Search.
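The two conditions above can be checked programmatically. The sketch below assumes the DOI record is a dict shaped like a JSON response from the DataCite REST API, with the state under `data.attributes.state` and the resource type under `data.attributes.types.resourceTypeGeneral`; treat that layout as an assumption and adjust it to the metadata format you actually work with. The function name and the sample records are hypothetical.

```python
def eligible_for_dataset_search(doi_record):
    """Check the two DOI conditions: Findable state and Dataset resource type.

    Assumes a record shaped like {"data": {"attributes": {...}}}.
    """
    attributes = doi_record.get("data", {}).get("attributes", {})
    state = attributes.get("state")
    resource_type = (attributes.get("types") or {}).get("resourceTypeGeneral")
    return state == "findable" and resource_type == "Dataset"


# Hypothetical records, for illustration only.
findable_dataset = {
    "data": {
        "attributes": {
            "state": "findable",
            "types": {"resourceTypeGeneral": "Dataset"},
        }
    }
}
findable_text = {
    "data": {
        "attributes": {
            "state": "findable",
            "types": {"resourceTypeGeneral": "Text"},
        }
    }
}

print(eligible_for_dataset_search(findable_dataset))
print(eligible_for_dataset_search(findable_text))
```

Meeting both conditions is necessary for inclusion but does not guarantee it, since indexing is ultimately up to Google.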
Using a sitemap file is recommended to help Google find your URLs. Using sitemap files and sameAs markup helps document how dataset descriptions are published throughout your site. More info: https://developers.google.com/search/docs/data-types/dataset
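A sitemap file can be generated from the list of landing page URLs. The following is a minimal sketch using the Python standard library and the sitemaps.org 0.9 namespace; the function name and the example URLs are hypothetical, and optional elements such as `lastmod` are omitted for brevity.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(landing_page_urls):
    """Return a sitemap XML document listing the given landing pages."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in landing_page_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical landing page URLs, for illustration only.
sitemap_xml = build_sitemap([
    "https://repository.example.org/datasets/1",
    "https://repository.example.org/datasets/2",
])
print(sitemap_xml)
```

The generated file is usually served from the site root and referenced from robots.txt or submitted through Google Search Console.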