
General Research Data

Metadata standards

CERIF - Common European Research Information Format

CERIF is the standard that the EU recommends to its member states for recording information about research activity.

Data Package

The Data Package specification is a generic wrapper format for exchanging data, consisting of a folder containing data files and a descriptor file.
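
As a rough illustration of that layout, the sketch below writes a folder holding one data file and a `datapackage.json` descriptor, then checks that every resource named in the descriptor exists on disk. The package name, file name and data values are invented for the example, and the descriptor shows only a small subset of the properties the specification allows.

```python
import json
import tempfile
from pathlib import Path


def write_data_package(folder: Path) -> Path:
    """Create one data file plus a datapackage.json descriptor in `folder`."""
    folder.mkdir(parents=True, exist_ok=True)

    # The data file itself; name and contents are illustrative only.
    (folder / "readings.csv").write_text("site,value\nA,1.5\nB,2.0\n", encoding="utf-8")

    # The descriptor lists each data file as a resource, with a path
    # given relative to the descriptor file.
    descriptor = {
        "name": "example-readings",
        "title": "Example readings",
        "resources": [{"name": "readings", "path": "readings.csv", "format": "csv"}],
    }
    descriptor_path = folder / "datapackage.json"
    descriptor_path.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return descriptor_path


def missing_resources(descriptor_path: Path) -> list:
    """Return resource paths named in the descriptor that are absent on disk."""
    descriptor = json.loads(descriptor_path.read_text(encoding="utf-8"))
    root = descriptor_path.parent
    return [
        resource["path"]
        for resource in descriptor.get("resources", [])
        if not (root / resource["path"]).exists()
    ]


with tempfile.TemporaryDirectory() as tmp:
    path = write_data_package(Path(tmp) / "example-readings")
    print(path.name, missing_resources(path))
```

This is not a substitute for a full validator; it only checks that the folder and the descriptor agree about which files exist.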

DataCite Metadata Schema

A domain-agnostic list of core metadata properties chosen for the accurate and consistent identification of data for citation and retrieval purposes.

DCAT - Data Catalog Vocabulary

DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web.
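
To make the idea concrete, here is a minimal sketch of one catalogue entry expressed as JSON-LD, built with plain Python dictionaries. The dataset IRI, title and download URL are hypothetical; the class and property names (`dcat:Dataset`, `dcat:Distribution`, `dcat:distribution`, `dcat:downloadURL`, `dct:title`) come from the DCAT and Dublin Core Terms vocabularies. A real catalogue would normally record many more properties.

```python
import json

# Namespace IRIs for DCAT and Dublin Core Terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def dataset_entry(iri, title, download_url):
    """Build a JSON-LD description of one dataset with one distribution."""
    return {
        "@context": CONTEXT,
        "@id": iri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:downloadURL": {"@id": download_url},
            }
        ],
    }


# Hypothetical identifiers, used only to show the shape of the record.
entry = dataset_entry(
    "https://catalog.example.org/dataset/air-quality",
    "Air quality readings",
    "https://catalog.example.org/files/air-quality.csv",
)
print(json.dumps(entry, indent=2))
```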

Dublin Core

A basic, domain-agnostic standard which can be easily understood and implemented, and as such is one of the best known and most widely used metadata standards.
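
The sketch below suggests how little is needed to produce a simple Dublin Core description. It serializes a few elements in the Dublin Core element set namespace using Python's standard library. The wrapper element `record` and all field values are assumptions made for the example; in practice the container is defined by whatever protocol or repository carries the record.

```python
import xml.etree.ElementTree as ET

# Namespace of the fifteen-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize {element_name: value} pairs as Dublin Core XML elements."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
xml_text = dublin_core_record(
    {
        "title": "Example survey data",
        "creator": "Example Author",
        "date": "2014-05-01",
        "type": "Dataset",
    }
)
print(xml_text)
```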

OAI-ORE - Open Archives Initiative Object Reuse and Exchange

Defines standards for the description and exchange of aggregations of Web resources.

Observations and Measurements

This standard specifies an XML implementation for the OGC and ISO Observations and Measurements (O&M) conceptual model, including a schema for Sampling Features.

PREMIS

The PREMIS (Preservation Metadata: Implementation Strategies) Data Dictionary defines a set of metadata that most repositories of digital objects would need to record and use in order to preserve those objects over the long term.

PROV

A specification that provides a vocabulary to interchange provenance information.
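
As a small illustration, the following sketch records that a cleaned dataset was generated by a cleaning activity that used a raw dataset, and writes the statements out as Turtle. The `ex:` namespace and the resource names are hypothetical; the classes and properties (`prov:Entity`, `prov:Activity`, `prov:used`, `prov:wasGeneratedBy`, `prov:wasDerivedFrom`) are PROV-O terms. The serializer is deliberately naive and handles only prefixed names.

```python
PREFIXES = {
    "prov": "http://www.w3.org/ns/prov#",
    "ex": "http://example.org/",  # hypothetical namespace for the example
}

# (subject, predicate, object) statements using PROV-O terms.
TRIPLES = [
    ("ex:raw", "a", "prov:Entity"),
    ("ex:cleaned", "a", "prov:Entity"),
    ("ex:cleaning", "a", "prov:Activity"),
    ("ex:cleaning", "prov:used", "ex:raw"),
    ("ex:cleaned", "prov:wasGeneratedBy", "ex:cleaning"),
    ("ex:cleaned", "prov:wasDerivedFrom", "ex:raw"),
]


def to_turtle(prefixes, triples):
    """Render prefixed-name triples as a Turtle document."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in prefixes.items()]
    lines.append("")
    lines += [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(lines)


turtle_text = to_turtle(PREFIXES, TRIPLES)
print(turtle_text)
```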

RDF Data Cube Vocabulary

The Data Cube vocabulary is a core foundation which supports extension vocabularies to enable publication of other aspects of statistical data flows or other multi-dimensional data sets.

Repository-Developed Metadata Schemas

Some repositories have decided that current standards do not fit their metadata needs, and so have created their own requirements.

Extensions

AGLS Metadata Profile

An application of Dublin Core designed to improve visibility and availability of online resources, originally adapted from the Australian Government Locator Service metadata standard for use in government agencies.

Asset Description Metadata Schema (ADMS)

Used to describe semantic assets, defined as highly reusable metadata (for example: XML schemata, generic data models) and reference data (for example: code lists, taxonomies, dictionaries, vocabularies) that are used for eGovernment system development.

Dryad Metadata Application Profile

An application profile based on the Dublin Core Metadata Initiative Abstract Model, used to describe multi-disciplinary data underlying peer-reviewed scientific and medical literature.

GSIM - Generic Statistical Information Model

A reference framework that provides a common terminology across and between statistical organisations; aligns with DDI and SDMX.

OpenAIRE Guidelines for publication repositories, data archives and CRIS systems

The OpenAIRE Guidelines are a suite of application profiles designed to allow research institutions to make their scholarly outputs visible through the OpenAIRE infrastructure. The profiles are based on established standards and designed to be used in conjunction with the OAI-PMH metadata harvesting protocol:

  • The OpenAIRE Guidelines for Literature Repositories are based on Dublin Core;
  • The OpenAIRE Guidelines for Data Archives are based on the DataCite Metadata Schema;
  • The OpenAIRE Guidelines for CRIS Managers are based on CERIF.

While the focus of each profile is different, they allow for interlinking and the contextualization of research artefacts.
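
Since the profiles are meant to be harvested over OAI-PMH, a brief sketch of how a harvester might form its requests may help. The function below builds a `ListRecords` request URL. The base URL, set name and token are hypothetical; `verb`, `metadataPrefix`, `set` and `resumptionToken` are standard OAI-PMH request arguments, and `oai_dc` is the Dublin Core format every OAI-PMH repository is required to support.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # A resumption token is an exclusive argument: it replaces
        # metadataPrefix and set when continuing a partial list.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint.
BASE = "https://repository.example.org/oai"
print(list_records_url(BASE))
print(list_records_url(BASE, set_spec="example_set"))
print(list_records_url(BASE, resumption_token="abc123"))
```

Which metadata prefix and set a given repository must expose is defined by the relevant OpenAIRE profile, so the defaults here should be treated as placeholders.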

Tabular Data Package

A profile of the Data Package specification, intended for exchanging tabular data in CSV (comma-separated values) format.
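
What the tabular profile adds is, in essence, a schema describing the columns of each CSV file. The sketch below shows a descriptor fragment with such a schema and a simple check that a CSV header matches the declared field names. The package, file and field names are invented, and only the `name` and `type` of each field are shown; the schema format permits further properties.

```python
import csv
import io

# Illustrative CSV content.
CSV_TEXT = "site,value\nA,1.5\nB,2.0\n"

# Descriptor fragment: each resource carries a schema listing its columns.
DESCRIPTOR = {
    "name": "example-readings",
    "resources": [
        {
            "name": "readings",
            "path": "readings.csv",
            "schema": {
                "fields": [
                    {"name": "site", "type": "string"},
                    {"name": "value", "type": "number"},
                ]
            },
        }
    ],
}


def header_matches_schema(csv_text, resource):
    """Check that the CSV header row equals the schema's field names, in order."""
    header = next(csv.reader(io.StringIO(csv_text)))
    expected = [field["name"] for field in resource["schema"]["fields"]]
    return header == expected


print(header_matches_schema(CSV_TEXT, DESCRIPTOR["resources"][0]))
```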

Tools

CKAN

A data management system that uses the DCAT standard. CKAN makes data accessible by providing tools to streamline publishing, sharing, finding and using data.

CKAN is aimed at data publishers (national and regional governments, companies and organizations) wanting to make their data open and available. Portals that use CKAN include http://data.gov.uk and http://open-data.europa.eu. The United States http://data.gov uses a version of CKAN wrapped up as the Open Government Platform.
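
For readers who want to query such a portal programmatically, the sketch below builds a dataset search request against CKAN's Action API and extracts titles from a response. The portal URL and the sample response are hypothetical; the `package_search` action, its `q` and `rows` parameters, and the `success` / `result` / `results` layout of the response follow the CKAN API. No network request is made here.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal_url, query, rows=5):
    """Build a CKAN Action API dataset search URL."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    # Fall back to the dataset's name when no title is present.
    return [d.get("title", d.get("name")) for d in payload["result"]["results"]]


# Hand-written response in the shape CKAN returns; not real portal data.
SAMPLE_RESPONSE = json.dumps(
    {
        "success": True,
        "result": {
            "count": 1,
            "results": [{"name": "air-quality", "title": "Air quality readings"}],
        },
    }
)

print(package_search_url("https://portal.example.org/", "air quality", rows=2))
print(dataset_titles(SAMPLE_RESPONSE))
```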

Converis

Current research information system implementing the CERIF standard. Originally developed by Avedas but now a product of Thomson Reuters.

Data Package libraries

A collection of libraries for working with Data Packages in various programming languages, and scripts for importing them into databases.

Data Package Validator

The Data Package Validator takes the URL of a Data Package and checks whether it conforms to the Data Package specification.

Data Package Viewer

The Data Package Viewer takes the URL of a Data Package and provides a human-friendly view of it.

Data Packagist

The Data Packagist is a Web-based tool for writing a Data Package descriptor file (datapackage.json).

DataCite Metadata Store API

RESTful API for registering datasets with the DataCite organization. The interface uses the DataCite Metadata Schema.

DCMI Tools and Software

The DCMI Tools Community list of tools and software implementing Dublin Core.

DdiEditor

DdiEditor is a DDI-Lifecycle Editing Framework developed by the DDA - Danish Data Archive.

Esri Geoportal Server

Geoportal Server is a standards-based, open source product that enables discovery and use of geospatial resources including data and services.

geometa

Geometa is an R package that offers facilities to handle reading and writing of geographic metadata defined with OGC/ISO 19115, 19119 and 19110 geographic information metadata standards, and encoded using the ISO 19139 XML standard. It also includes a facility to check the validity of ISO 19139 XML encoded metadata. The package can be used in integrated (meta)data management flows to generate business metadata compliant with ISO/OGC standards. Metadata generated with geometa can then be published to standard web metadata catalogues by means of related R packages such as ows4R (R interface to OGC Web-Services) or geonapi (R Interface to GeoNetwork API).

Linked Data Cubes Explorer

The Linked Data Cubes Explorer allows for the analysis of statistical datasets using the RDF Data Cube Vocabulary.

OpenAIRE Validator

This service validates OAI-PMH metadata records against the OpenAIRE Guidelines for publication repositories, data archives and current research information systems.

Pure

Current research information system developed by Elsevier that implements the CERIF standard.

SOS - Sensor Observation Service

This tool uses the Observations and Measurements standard to define a Web service interface which allows querying observations, sensor metadata, as well as representations of observed features.

Symplectic Elements

Current research information system implementing the CERIF standard.

Use Cases

3TU.Datacentrum

A multidisciplinary data repository for a consortium of universities in the Netherlands, using a metadata structure based on the Dublin Core Metadata Initiative.

BAV - Biblioteca Apostolica Vaticana

The Vatican Library uses FITS (Flexible Image Transport System) as the digital image format for the digitization of its manuscript collection.

Data Packaged Core Datasets

A collection of commonly used and example data sets packaged using the Data Package specification.

Edinburgh DataShare

An online digital repository of multi-disciplinary research datasets produced at the University of Edinburgh, using a modified Dublin Core metadata catalogue.

ePrints Soton

The University of Southampton's multi-disciplinary Institutional Research Repository, using a profile of Dublin Core and administrative ePrints metadata.

List of RDF Data Cube Vocabulary Implementations

W3C Government Linked Data list of implementations of the RDF Data Cube Vocabulary.

National Science Digital Library Data Repository

An online portal for education and research on learning in Science, Technology, Engineering, and Mathematics, using a profile of the Dublin Core Metadata Elements for resource and collections metadata.

Open Archives Initiative

Develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.

OpenAIRE

A European Scholarly Communication Infrastructure that aggregates bibliographic metadata from a network of publication repositories, data archives and CRIS following the OpenAIRE Guidelines. Together with additional authoritative information, the objects and their relationships described by the metadata form an information space graph which can be traversed by users and accessed via APIs by other services. The metadata primarily support discovery and monitoring services.

PROV Implementation Report

A list of the implementations and usage of the PROV specifications.