Copyright © 2019-2020 the Contributors to the Dataset API Discovery 0.3 Specification, published by the OpenActive Community Group under the W3C Community Contributor License Agreement (CLA). A human-readable summary is available.
This document specifies a dataset site and embedded JSON-LD document that together describe an open data dataset and define related APIs that are available to manipulate it.
This specification was published by the OpenActive Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.
Contributions to this document are not only welcomed but are actively solicited and should be made via GitHub Issues and pull requests. The source code is available on GitHub.
This document represents an early Editor's Draft of the planned API design. It is likely to change between now and its final version. These early drafts are intended to help developers provide feedback by developing proof-of-concept implementations. We encourage developers to explore this API and contribute to the development of the specification.
If you wish to make comments regarding this document, please send them to [email protected] (subscribe, archives).
This section is non-normative.
The document is an output of the OpenActive Community Group. As part of the OpenActive initiative, the community group is developing standards that will promote the publication and use of open opportunity data in helping people to become more physically active.
This specification aims to build on existing work by the WebAPI Discovery Community Group, the W3C's DCAT standard, and the Schema.org discussion, providing a profile and guidance specifically both for open data publishers and implementers of Web APIs that manipulate openly available datasets. It also aims to provide conformance rules to ensure that implementers include the details necessary to allow a Data Consumer to reliably integrate with standards-compliant services without human intervention.
The specification defines both the requirements of a Dataset Site provided by a Data Publisher (server) for use by a Data Consumer (client), and of any Data Catalog designed to enable discovery of such sites. In addition, it includes high level requirements for human-readable content, and detailed requirements and conformance rules for machine-readable content.
Dataset Sites that conform to this specification will be:
WebAPI
Search.
Dataset Sites will also provide the following information about the implementation of the APIs they describe:
Data Catalogs published in accordance with this standard will be
Note that although this specification is a product of the OpenActive Community Group, it is designed to apply to any open dataset where an API is available to manipulate it.
By design this specification will not define some types of functionality.
These have been declared as permanently out of scope because they are adequately covered by existing specifications:
The document is primarily intended for the following audiences:
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MUST, OPTIONAL, RECOMMENDED, and REQUIRED in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
This specification makes use of the compact IRI Syntax; please refer to the Compact IRIs from [JSON-LD].
The following typographic conventions are used in this specification:
markup
Notes are in light green boxes with a green left border and with a "Note" header in green. Notes are normative or informative depending on whether they are in a normative or informative section, respectively.
Examples are in light khaki boxes, with khaki left border, and with a numbered "Example" header in khaki. Examples are always informative. The content of the example is in monospace font and may be syntax colored.
A Dataset Site is a human and machine readable web page ("Dataset Page") that describes a dataset and the APIs available to interact with it, with associated functionality that allows for feedback to be provided about the dataset.
A Data Catalog is a JSON structure that supports and enables the discoverability of Dataset Sites. It does so by providing metadata and links, either to Dataset Sites directly or to other Data Catalogs.
The purpose of a Dataset Site is to provide:
The purpose of a Data Catalog is to provide:
Data Catalogs where appropriate
With the exception of licensing information, there are no strong requirements for the human-readable content of dataset pages, and implementers may provide whatever information they see fit here. For the convenience of end-users, however, it is normally expected (and is RECOMMENDED) that a Dataset Page will provide at least the following information and markup:
SessionSeries, Slots)
Note that, of the list above, only licensing information is REQUIRED to be available in human-readable form, and this license MUST be a Creative Commons Attribution 4.0 International License (often abbreviated as 'cc-by').
Note further that, in the event that you are republishing OpenActive data from another source, the original publisher must be credited as per the terms of this license.
Dataset Sites must be machine-readable via embedded JSON-LD.
Property | Status | Type | Notes |
---|---|---|---|
@context | REQUIRED | Array of URL values | Note that, in conformity with RFC3986, trailing slashes MUST be supplied. |
@type | REQUIRED | Text | Dataset |
@id | REQUIRED | URL | A URL uniquely identifying the dataset site resource. May be the URL of the Dataset Site itself. |
schema:url | REQUIRED | URL | Typically the URL of the dataset site itself. |
schema:name | REQUIRED | Text | The name of the collection of datasets referenced by the site. Often this will simply be the name of the publishing organisation. |
schema:description | RECOMMENDED | Text | A human-readable description of the datasets referenced by the site. |
schema:keywords | OPTIONAL | Array of Text | Short descriptive metadata tags for the dataset collection. |
schema:license | REQUIRED | URL | A URL reference to the license under which the dataset site is published. For OpenActive dataset sites this should be https://creativecommons.org/licenses/by/4.0/. |
schema:distribution | REQUIRED | Array of dcat:Distribution object | See below, Describing Individual Feeds |
schema:discussionUrl | RECOMMENDED | URL | A link to a resource for discussing and raising issues with the published datasets. Typically, although not necessarily, this will be a link to a GitHub repository. |
schema:documentation | RECOMMENDED | Array of URL | Link(s) to further resources concerning the dataset site and its referenced datasets, e.g. GitHub READMEs or status summaries. |
schema:inLanguage | RECOMMENDED | String | The language of the dataset. Should be expressed as an ISO 639-2 language code. |
schema:publisher | REQUIRED | schema:Organization | The organization responsible for publishing the collection of datasets linked to by the dataset site. For further information, see below, Describing Organizations. |
schema:datePublished | REQUIRED | schema:Date | The date the dataset site was published. |
schema:schemaVersion | REQUIRED | URL | The version of the dataset site specification to which the site conforms. |
The MIME-type of this JSON object MUST be defined on the enclosing HTML script tag as application/ld+json.
It is common practice to reference https://schema.org without a trailing / within @context. However, to be consistent with the OpenActive Modelling Opportunity Data specification, which uses the full URI of https://openactive.io/ (including a path, as per RFC 3986), this specification requires the schema.org context to be referenced with a trailing slash, i.e. https://schema.org/.
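As an illustration of how a Data Consumer might read the embedded markup, the following is a minimal Python sketch that pulls JSON-LD out of a Dataset Page and flags context URLs that lack the trailing slash. The class and function names are hypothetical and not part of this specification, and the check only covers the two root contexts named above.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed body of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return a list of JSON-LD objects embedded in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


def contexts_missing_trailing_slash(document):
    """Return root context URLs that were given without a trailing slash."""
    context = document.get("@context", [])
    if isinstance(context, str):
        context = [context]
    # Assumption: only these two bare roots are checked in this sketch.
    bare_roots = ("https://schema.org", "https://openactive.io")
    return [value for value in context if value in bare_roots]
```

A consumer could call `extract_jsonld` on the fetched page body and reject any document for which `contexts_missing_trailing_slash` returns a non-empty list.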
dcat:Distribution objects)
Property | Status | Type | Notes |
---|---|---|---|
@type | REQUIRED | Text | DataDownload |
schema:name | REQUIRED | Text | A human-readable name for the dataset. |
schema:additionalType | RECOMMENDED | URL | A link to a definition of the type of the feed, e.g. of ScheduledSessions or CourseInstances |
schema:encodingFormat | RECOMMENDED | Text or URL | The MIME-type of the data accessible via the contentUrl |
schema:contentUrl | REQUIRED | URL | The URL of the feed containing the dataset. |
schema:totalItems | RECOMMENDED | Integer | The total number of items (whether updated or deleted) that are available from the beginning of the feed. Note that this number will often be approximate only, given the rapidity with which updates may be made to backend datastores. |
schema:WebAPI)
In addition to the above markup for discoverability, dataset sites that support Open Booking API functionality MUST indicate this with markup enabling discovery and use of the relevant API endpoints.
Property | Status | Type | Notes |
---|---|---|---|
@type | REQUIRED | Text | WebAPI |
schema:name | RECOMMENDED | Text | A human-readable name for the API. |
schema:description | OPTIONAL | Text | A human-readable description of the API |
schema:documentation | RECOMMENDED | URL or schema:CreativeWork | Human-readable API documentation. See Describing API Endpoints, below. |
schema:termsOfService | REQUIRED | Text or URL | Human-readable terms of service documentation. |
schema:provider | REQUIRED | schema:Organization | The Organization providing the API endpoint. |
schema:endpointUrl | REQUIRED | URL | The root location or primary endpoint of the API. |
schema:conformsTo | RECOMMENDED | URL | The URL reference of an established standard to which the described API conforms. |
schema:license | REQUIRED | URL | A URL reference to the license under which the dataset site is published. For OpenActive dataset sites this should be https://creativecommons.org/licenses/by/4.0/. |
schema:endpointDescription | RECOMMENDED | schema:EntryPoint | A machine-readable description of the API. See Describing API Endpoints, below |
schema:bookingService | RECOMMENDED | schema:SoftwareApplication | The software system responsible for handling booking over the Open Booking API. |
oa:authenticationAuthority | | schema:URL | The location of the OpenID Provider or other relevant authentication authority that must be used to access the API, e.g. https://auth.bookingsystem.com |
schema:EntryPoint)
Supporting documentation is crucial for the successful uptake and use of APIs. Ideally, both human-readable freetext and machine-readable structured data are made available.
The schema.org objects for human- and machine-readable documents are largely identical in terms of content and structure. However, the MIME-type associated with each will normally differ.
Property | Status | Type | Notes |
---|---|---|---|
@type | REQUIRED | Text | EntryPoint |
schema:url | REQUIRED | URL | A URL pointing to supporting documentation for the API. |
schema:encodingFormat | RECOMMENDED | Text | The MIME-type delivered by the url. For human-readable documentation (schema:documentation) this will normally be text/html; for machine-readable documentation (schema:endpointDescription), application/json or a more specific subtype of this. |
schema:SoftwareApplication)
Property | Status | Type | Notes |
---|---|---|---|
@type | REQUIRED | Text | SoftwareApplication |
schema:name | REQUIRED | Text | The name of the software application |
schema:url | OPTIONAL | schema:URL | The URL of a human-readable web-page providing further information about the software. |
schema:featureList | RECOMMENDED | schema:URL | A URL pointing to a machine-readable description of the Open Booking API features implemented by the system, e.g. as generated by the OpenActive Test Suite. |
schema:softwareVersion | RECOMMENDED | Text | Version of the software instance. |
The schema:WebAPI specification has been assigned Pending status by the schema.org organisation, and is scheduled for release in schema version 10.0. While schema:WebAPI is relatively stable, points of detail are still subject to review and this specification may change at short notice.
The example below illustrates a Dataset Site pointing to feeds consisting of ScheduledSessions, SessionSeries, and Events. As the presence of the accessService property (of type WebAPI) indicates, data items from these feeds are bookable.
<script type="application/ld+json">
{
  "@context": [
    "https://schema.org/",
    "https://openactive.io/",
    "https://openactive.io/ns-beta"
  ],
  "@type": "Dataset",
  "@id": "https://data.example.com/",
  "name": "Example Sessions and Events",
  "description": "Near real-time availability and rich descriptions relating to sessions and events available from Example.com",
  "url": "https://data.example.com/",
  "dateModified": "2019-08-25T11:23:27+00:00",
  "keywords": [
    "Courses",
    "Sessions",
    "Events",
    "Activities",
    "Sports",
    "Physical Activity",
    "OpenActive"
  ],
  "schemaVersion": "https://www.openactive.io/modelling-opportunity-data/2.0/",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "publisher": {
    "@type": "Organization",
    "name": "Example.com",
    "description": "Example.com makes it easy to get active!",
    "url": "https://example.com/home",
    "legalName": "Example Ltd",
    "logo": {
      "@type": "ImageObject",
      "url": "https://cdn.example.com/assets/logo.png"
    },
    "email": "[email protected]"
  },
  "discussionUrl": "https://github.com/example/repo/issues",
  "datePublished": "2019-07-11T00:00:00+00:00",
  "inLanguage": [
    "en-GB"
  ],
  "distribution": [
    {
      "@type": "DataDownload",
      "name": "ScheduledSession",
      "additionalType": "https://openactive.io/ScheduledSession",
      "encodingFormat": "application/vnd.openactive.rpde+json; version=1",
      "contentUrl": "https://example.com/api/openactive/scheduledsessions",
      "totalItems": 1852
    },
    {
      "@type": "DataDownload",
      "name": "SessionSeries",
      "additionalType": "https://openactive.io/SessionSeries",
      "encodingFormat": "application/vnd.openactive.rpde+json; version=1",
      "contentUrl": "https://example.com/api/openactive/sessionseries",
      "totalItems": 361
    },
    {
      "@type": "DataDownload",
      "name": "Event",
      "additionalType": "https://schema.org/Event",
      "encodingFormat": "application/vnd.openactive.rpde+json; version=1",
      "contentUrl": "https://example.com/api/openactive/events",
      "totalItems": 1906
    }
  ],
  "backgroundImage": {
    "@type": "ImageObject",
    "url": "https://cdn.example.com/images/background.jpg"
  },
  "documentation": "https://developer.openactive.io/",
  "accessService": {
    "@type": "WebAPI",
    "name": "Open Booking API",
    "description": "The Open Booking API lets you book OpenActive Opportunities. The API uses standard schema.org types and is compliant with the JSON-LD specification.",
    "documentation": "https://openactive.io/open-booking-api/EditorsDraft",
    "termsOfService": "https://example.com/api/booking/documentation/terms-of-service",
    "provider": {
      "@type": "Organization",
      "name": "examplebooking.com",
      "description": "examplebooking.com makes it easy to get booking!",
      "url": "https://examplebooking.com/home",
      "email": "[email protected]"
    },
    "endpointUrl": "https://example.com/api/booking/",
    "conformsTo": [
      "https://www.openactive.io/open-booking-api/2.0/"
    ],
    "endpointDescription": "https://www.openactive.io/open-booking-api/2.0/swagger.json",
    "bookingService": {
      "@type": "SoftwareApplication",
      "name": "myExampleBookingPlatform",
      "softwareVersion": "1.2",
      "url": "https://www.example.com/myExampleBookingPlatform",
      "featureList": "https://www.example.com"
    }
  }
}
</script>
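The REQUIRED rows of the tables above lend themselves to a simple automated check. The following Python sketch is one possible approach, not a normative conformance test: the function name is hypothetical, and it assumes property names appear without the schema: prefix, as they do in the example above.

```python
# Properties marked REQUIRED for the Dataset object in this specification,
# written without the "schema:" prefix (an assumption based on the example).
REQUIRED_DATASET_PROPERTIES = (
    "@context", "@type", "@id", "url", "name", "license",
    "distribution", "publisher", "datePublished", "schemaVersion",
)

# Properties marked REQUIRED for each entry of "distribution".
REQUIRED_DISTRIBUTION_PROPERTIES = ("@type", "name", "contentUrl")


def missing_required(dataset):
    """Return the names of REQUIRED properties absent from a Dataset object."""
    problems = [p for p in REQUIRED_DATASET_PROPERTIES if p not in dataset]
    for index, feed in enumerate(dataset.get("distribution", [])):
        for prop in REQUIRED_DISTRIBUTION_PROPERTIES:
            if prop not in feed:
                problems.append(f"distribution[{index}].{prop}")
    return problems
```

An empty list means every REQUIRED property checked by this sketch is present; it does not validate types or values.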
schema:DataCatalog)
Data Catalogs will normally be published as JSON-LD objects accessible via a URL.
Property | Status | Type | Notes |
---|---|---|---|
@context | REQUIRED | Array of URL values | Will normally consist only of the value https://schema.org/. Note that, in conformity with RFC3986, trailing slashes MUST be supplied where appropriate. |
@type | REQUIRED | String | DataCatalog |
@id | RECOMMENDED | URL | A unique identifier for the DataCatalog, often identical to the URL at which the DataCatalog is found. |
schema:datePublished | RECOMMENDED | schema:Date | The date the DataCatalog was published. |
schema:publisher | RECOMMENDED | schema:Organization | The Organization responsible for publishing the DataCatalog. |
schema:license | REQUIRED | URL | A URL reference to the license under which the dataset site is published. For OpenActive dataset sites this should be https://creativecommons.org/licenses/by/4.0/. |
schema:dataset | REQUIRED if hasPart is absent, OPTIONAL otherwise. | Array of URL | One or more URLs pointing to OpenActive Dataset Sites. |
schema:hasPart | REQUIRED if dataset is absent, OPTIONAL otherwise. | Array of URL | One or more URLs pointing to other OpenActive DataCatalogs. |
It is common practice to reference https://schema.org without a trailing / within @context. However, to be consistent with the OpenActive Modelling Opportunity Data specification, which uses the full URI of https://openactive.io/ (including a path, as per RFC 3986), this specification requires the schema.org context to be referenced with a trailing slash, i.e. https://schema.org/.
The W3C DCAT 2.0 standard is widely used to publish Data Catalogs. In order to make the semantics of OpenActive Data Catalogs clear, and to assist developers and organisations more familiar with DCAT, a mapping from DCAT 2.0 to OpenActive schema.org-based Data Catalog elements is provided here.
DCAT 2.0 Element | schema.org target element |
---|---|
dcat:issued | schema:datePublished |
dcat:publisher | schema:publisher |
dcat:license | schema:license |
dcat:dataset | schema:dataset |
dcat:hasPart | schema:hasPart |
The below is an example of a DataCatalog JSON object.
{
  "@context": "https://schema.org/",
  "@type": "DataCatalog",
  "@id": "https://opendata.example.live/api/datacatalog",
  "dataset": [
    "https://api.example.org.uk/OpenActive/",
    "https://booking.example.co.uk/OpenActive/",
    "https://active.example.net/OpenActive/",
    "https://camp.example.net/OpenActive/"
  ],
  "datePublished": "2020-10-21T12:28:09.7981681+00:00",
  "publisher": {
    "@type": "Organization",
    "name": "Example.com",
    "url": "https://www.example.com/systems"
  },
  "license": "https://creativecommons.org/licenses/by/4.0/"
}
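Because a Data Catalog may link to Dataset Sites directly (dataset) or to further Data Catalogs (hasPart), discovery amounts to a recursive traversal. The following Python sketch shows one way a consumer might do this; the function name and the fetch callable are hypothetical, and fetch is assumed to return the parsed JSON object for a catalog URL. A real implementation would also need HTTP error handling.

```python
def collect_dataset_sites(catalog_url, fetch, seen=None):
    """Return Dataset Site URLs reachable from a Data Catalog.

    fetch is a caller-supplied function mapping a catalog URL to its parsed
    JSON object (an assumption; this specification does not define it).
    """
    seen = set() if seen is None else seen
    if catalog_url in seen:
        # Guard against catalogs that link back to one another.
        return []
    seen.add(catalog_url)

    catalog = fetch(catalog_url)
    if "dataset" not in catalog and "hasPart" not in catalog:
        # dataset is REQUIRED if hasPart is absent, and vice versa.
        raise ValueError(f"{catalog_url}: one of dataset or hasPart is required")

    sites = list(catalog.get("dataset", []))
    for child_url in catalog.get("hasPart", []):
        sites.extend(collect_dataset_sites(child_url, fetch, seen))
    return sites
```

Passing fetch in as a parameter keeps the traversal logic independent of any particular HTTP client and makes it easy to exercise with in-memory data.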
schema:Organization)
Property | Status | Type | Notes |
---|---|---|---|
@type | REQUIRED | Text | Organization |
schema:name | RECOMMENDED | Text | The name of the Organization publishing the datasets. |
schema:logo | OPTIONAL | URL | A link to the publishing Organization's logo. |
schema:url | RECOMMENDED | URL | A link to the publishing Organization's website. |
In the event that a feed is to be removed permanently, publishers MUST:
In the event that all data feeds are to be removed permanently and the publisher is ceasing to publish OpenActive feeds entirely, the dataset site as a whole should be removed and its URL should return a 404.
In the event that a consuming application receives a 404 response from a previously-harvested feed URL, all records associated with that feed MUST be purged from its datastore. This is to ensure data privacy and compliance with related legislation, such as the General Data Protection Regulation (GDPR).
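The purge requirement on consumers can be sketched as follows. This is an illustrative Python fragment only: the function name is hypothetical, and the datastore is assumed to be a plain dictionary mapping each feed URL to the list of records harvested from it, which a real consumer would replace with its own storage layer.

```python
def handle_feed_response(feed_url, status_code, datastore):
    """Purge all records for a feed when its URL responds with 404.

    datastore maps feed URL -> list of harvested records (an assumed shape).
    Returns the number of records removed.
    """
    if status_code == 404:
        # Remove every record associated with the feed, if any were stored.
        removed = datastore.pop(feed_url, [])
        return len(removed)
    # Any other status leaves the stored records untouched in this sketch.
    return 0
```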
Future iterations of the specification will be shaped by the OpenActive community, and we encourage you to get involved.
This section is non-normative.
The editors thank all members of the OpenActive Community Group for their contributions.