
Publishing a Vocabulary as a SISSvoc service

Role: SISSVoc30Manager

Pre-requisites

  1. A vocabulary whose content is clearly structured and for which the maintenance authority is clear
  2. The domain for URIs to identify the vocabulary content
  3. The means to configure an HTTP server to redirect requests from that domain to another service following a regular-expression pattern
  4. An HTTP service to host a document (file) representation of the vocabulary
  5. An RDF repository that can expose public SPARQL end-points
  6. An LDA service visible at a public http URI

Steps

1. Prepare the vocabulary

This section is divided into two parts: one describes creating a new vocabulary and the other describes editing an existing vocabulary. You only need to follow the subsection relevant to your needs.

As described in the SISSVoc development guide, it is recommended that a vocabulary be persisted as an RDF/XML document. An RDF/XML document can be expected to change incrementally over time as concepts are added to, modified in, or removed from the vocabulary. An RDF document alone is not enough to keep track of recent changes or of what previous versions looked like; for this you need a versioning strategy. A version control system, such as Subversion or Git, is more than capable of providing a record of all changes made to an RDF document. The following sections describe the best practices for managing an RDF document using a version control system.

1.1 Creating a new Vocabulary

  1. Liaise with a vocabulary owner to acquire the contents of a vocabulary.
  2. Define a URI pattern for the elements of the vocabulary, preferably in a domain which you either own or whose owner you have some influence over.
  3. Formalize the vocabulary contents as an RDF document (RDF/XML, Turtle, N3, N-Triples or similar) containing SKOS resources and properties
  4. Set the 'Ontology URI' for the document (usually the same as the Base URI) to a URL that can be configured to return the document
    • e.g. http://def.seegrid.csiro.au/ontology/geotime/isc-2009
    • N.B. The Ontology URI identifies the Ontology Document. This is not necessarily the same as the URI for a skos:ConceptScheme.
      • more than one concept scheme can be contained in a single ontology document
      • you may want a request for the concept scheme to return a summary of the concept scheme rather than the whole ontology document.
  5. Include annotations and dependencies for the vocabulary as properties of the ontology, as appropriate, for example:
    <http://def.seegrid.csiro.au/isotc211/iso19156/2011/observation>
          a       owl:Ontology ;
          dc:creator "Simon Jonathan David COX, CSIRO"^^xsd:string ;
          dc:description "An OWL representation of the Observation Schema described in clause 6 of ISO 19156:2011 Geographic Information - Observations and Measurements"^^xsd:string ;
          dc:source "ISO 19156:2011"^^xsd:string ;
          dct:created "2011-07-07"^^xsd:date ;
          dct:modified "2012-07-24"^^xsd:date ;
          owl:imports <http://def.seegrid.csiro.au/isotc211/iso19115/2003/metadata> , <http://def.seegrid.csiro.au/isotc211/iso19123/2005/coverage> , <http://def.seegrid.csiro.au/isotc211/iso19108/2006/temporalobject> ;
          owl:priorVersion <https://www.seegrid.csiro.au/subversion/xmml/ontologies/tags/201205-hash-namespaces/ISOTC211/observation.rdf> ;
          owl:versionIRI <https://www.seegrid.csiro.au/subversion/xmml/ontologies/tags/201207-Toulouse/ISOTC211/observation.rdf> .
  6. Commit the RDF document to your version control system. Continue making changes or proceed to the next step.
  7. Create a tag/branch in the version control system and label it to match the new owl:versionIRI property.
  8. [Optional] Configure a web server so that an HTTP GET request directed at the newly created version IRI returns the RDF document.
Note how owl:versionIRI links to the source document in a version control system, and owl:priorVersion allows users to trace the development of the ontology and refer to earlier versions if necessary.
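Steps 2 and 3 can be sketched in Python. This is only an illustration using the standard library, not part of the SISSvoc tooling: the base URI, the helper names and the labels are hypothetical, and the sketch assumes labels contain no quote characters. A real vocabulary would normally be assembled with an RDF editor or library.

```python
import re

# Hypothetical base URI for the vocabulary elements (an assumption for illustration).
BASE_URI = "http://example.org/def/voc/rock-type/"

def mint_uri(label, base=BASE_URI):
    """Derive a concept URI from a label, following one fixed URI pattern."""
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return base + slug

def concept_to_turtle(label, scheme_uri):
    """Render a single skos:Concept as Turtle text."""
    return (
        f"<{mint_uri(label)}>\n"
        f"      a       skos:Concept ;\n"
        f'      skos:prefLabel "{label}"@en ;\n'
        f"      skos:inScheme <{scheme_uri}> .\n"
    )

def vocabulary_to_turtle(labels, scheme_uri):
    """Render a concept scheme and its concepts as one Turtle document."""
    prefixes = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"
    scheme = f"<{scheme_uri}>\n      a       skos:ConceptScheme .\n\n"
    concepts = "\n".join(concept_to_turtle(label, scheme_uri) for label in labels)
    return prefixes + scheme + concepts

if __name__ == "__main__":
    print(vocabulary_to_turtle(["Igneous Rock", "Sedimentary Rock"], BASE_URI + "scheme"))
```

Keeping the URI pattern in one function means every concept is identified consistently, which matters later when requests for those URIs are redirected by regular expression.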

1.2 Updating an existing Vocabulary

The process for updating a Vocabulary RDF document is relatively straightforward and can be broken down into the following steps:
  1. Make the changes locally and run any relevant tests/reviews. See the tools and documentation section for more information on how to make these changes. Types of tests/reviews are outside the scope of this document.
  2. Commit the changes to the version control system. Continue making changes or proceed to the next step.
  3. Once all changes have been made, choose a new version name and update the document's owl:versionIRI property accordingly. Commit this change to the version control system too.
  4. Create a tag/branch in the version control system and label it to match the new owl:versionIRI property.
  5. [Optional] Configure a web server so that an HTTP GET request directed at the newly created version IRI returns the RDF document.
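The bookkeeping in steps 3 and 4 can be sketched as follows. This is a minimal illustration, assuming the version IRI is formed from a tags base URL, a tag name and the document path, as in the observation.rdf example earlier on this page; the function and parameter names are hypothetical.

```python
def next_version(header, tags_base, tag, document):
    """Return a copy of the ontology header carrying a new owl:versionIRI.

    The previous owl:versionIRI (if any) is kept as owl:priorVersion so that
    users can trace the development of the ontology.
    """
    updated = dict(header)
    if "owl:versionIRI" in header:
        updated["owl:priorVersion"] = header["owl:versionIRI"]
    updated["owl:versionIRI"] = f"{tags_base.rstrip('/')}/{tag}/{document}"
    return updated

if __name__ == "__main__":
    tags = "https://www.seegrid.csiro.au/subversion/xmml/ontologies/tags"
    old = {"owl:versionIRI": f"{tags}/201205-hash-namespaces/ISOTC211/observation.rdf"}
    new = next_version(old, tags, "201207-Toulouse", "ISOTC211/observation.rdf")
    for prop, iri in sorted(new.items()):
        print(f"{prop} <{iri}>")
```

The tag created in the version control system should use the same name that was passed as `tag`, so that the version IRI resolves to the tagged copy of the document.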

2. Publish the vocabulary as a single document

  1. Load the vocabulary onto an HTTP server so it can be delivered as a single RDF/XML file
    • e.g. http://def.seegrid.csiro.au/ontology/geotime/isc-2009.rdf
      • N.B. RDF/XML is the only syntax that all software supporting OWL is required to read and write. Other syntaxes are available and may be more suitable for some purposes (e.g. Turtle is more immediately readable and is suitable for text editors), but they are not necessarily supported by all software.
  2. Verify that a request to the Ontology URI returns the file, or configure the HTTP server so that it does
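The verification in step 2 can be sketched with the Python standard library. The function names are hypothetical; the sketch assumes the server either serves the file directly at the Ontology URI or redirects to it (urllib follows redirects automatically).

```python
import urllib.request

def build_rdf_request(ontology_uri):
    """Build a GET request that asks for the RDF/XML representation."""
    return urllib.request.Request(
        ontology_uri,
        headers={"Accept": "application/rdf+xml"},
        method="GET",
    )

def fetch_ontology(ontology_uri, timeout=10):
    """Fetch the document; return (status, content type, final URL, body)."""
    with urllib.request.urlopen(build_rdf_request(ontology_uri), timeout=timeout) as response:
        return (
            response.status,
            response.headers.get_content_type(),
            response.geturl(),
            response.read(),
        )

if __name__ == "__main__":
    # Ontology URI taken from the example earlier on this page.
    status, content_type, final_url, body = fetch_ontology(
        "http://def.seegrid.csiro.au/ontology/geotime/isc-2009"
    )
    print(status, content_type, final_url, len(body))
```

A successful check would show a 200 status, an RDF/XML content type, and a final URL pointing at the hosted file.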

3. Publish the vocabulary at a SPARQL endpoint

  1. Load the vocabulary from the file URL into a triple-store that provides a public SPARQL end-point for queries
  2. Verify that the SPARQL endpoint URI works. A proxy-redirect may be used to access the actual SPARQL endpoint URI via a Cool/Persistent URI
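Both steps can be sketched in Python. This is a hedged illustration: the endpoint URL is hypothetical, the load step assumes the triple-store accepts a SPARQL 1.1 Update LOAD operation (many stores offer their own upload tools instead), and the check assumes the end-point accepts the query as a `query` parameter on a GET request, as the SPARQL protocol describes.

```python
import urllib.parse

SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"

# A simple check query: how many SKOS concepts does the store hold?
COUNT_CONCEPTS = f"SELECT (COUNT(?c) AS ?n) WHERE {{ ?c a <{SKOS_CONCEPT}> }}"

def load_statement(file_url):
    """SPARQL Update text asking the store to load the vocabulary file."""
    return f"LOAD <{file_url}>"

def sparql_query_url(endpoint, query):
    """URL for sending a query to the end-point via HTTP GET."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})

if __name__ == "__main__":
    print(load_statement("http://def.seegrid.csiro.au/ontology/geotime/isc-2009.rdf"))
    # Hypothetical persistent end-point URI.
    print(sparql_query_url("http://example.org/sparql/isc2009", COUNT_CONCEPTS))
```

Opening the printed URL in a browser or HTTP client and getting a non-zero count back is a quick sign that the load succeeded and the end-point is publicly reachable.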

4. Publish the vocabulary through a SISSvoc service

  1. Create an LDA configuration for a SISSvoc API for this repository. This is usually done by cloning and modifying a set of templates.
  2. Get your SISSVoc30Deployer to assist with loading the LDA configuration into Elda
  3. Verify that the vocabulary API is deployed at a suitable cool URI
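The clone-and-modify approach in step 1 can be sketched as a simple template substitution. The template text below is a placeholder written for this illustration, not one of the actual SISSvoc templates; the API name, end-point URL and function name are also hypothetical. It uses only a few terms from the Linked Data API vocabulary and omits everything a real SISSvoc configuration would add.

```python
from string import Template

# Placeholder template, NOT a real SISSvoc template (assumption for illustration).
LDA_TEMPLATE = Template("""\
@prefix api:  <http://purl.org/linked-data/api/vocab#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<#$api_name>
      a       api:API ;
      api:sparqlEndpoint <$sparql_endpoint> ;
      api:endpoint <#$api_name-concepts> .

<#$api_name-concepts>
      a       api:ListEndpoint ;
      api:uriTemplate "/$api_name/concept" ;
      api:selector [ api:where "?item a skos:Concept" ] .
""")

def render_config(api_name, sparql_endpoint):
    """Fill in the vocabulary-specific values; raises KeyError if one is missing."""
    return LDA_TEMPLATE.substitute(api_name=api_name, sparql_endpoint=sparql_endpoint)

if __name__ == "__main__":
    print(render_config("isc2009", "http://example.org/sparql/isc2009"))
```

Generating the configuration from a template keeps the per-vocabulary differences (name and SPARQL end-point) in one place, which makes it easier to hand a consistent file to the SISSVoc30Deployer.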

5. Configure the concept URI server to get concept descriptions from SISSvoc

  1. For each domain that provides HTTP URIs for concepts in the vocabulary, configure the server so that HTTP requests for these URIs are redirected (HTTP 303) to a ‘get by URI’ SISSvoc query on a suitable service
At this point the URI domain owner may choose from alternative services if available, for example:
  1. alternative services from the same host might represent multiple versions of a vocabulary
  2. or services from different hosts might provide alternative versions of the same vocabulary
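The redirect rule can be sketched as a small Python function. In practice this logic lives in the HTTP server's rewrite configuration; the sketch only shows the mapping. The path pattern, the service URL and the `resource?uri=` form of the ‘get by URI’ request are assumptions for illustration and should be checked against the deployed SISSvoc service.

```python
import re
import urllib.parse

# Hypothetical pattern for concept URI paths in this domain.
CONCEPT_PATTERN = re.compile(r"^/classifier/.+$")

# Hypothetical persistent URI of the chosen SISSvoc service.
SISSVOC_SERVICE = "http://example.org/sissvoc/myvocab"

def redirect_for(host, path, service=SISSVOC_SERVICE):
    """Return (303, Location) for a concept URI request, or None if the path
    does not match the concept pattern."""
    if not CONCEPT_PATTERN.match(path):
        return None
    concept_uri = f"http://{host}{path}"
    query = urllib.parse.urlencode({"uri": concept_uri})
    return 303, f"{service}/resource?{query}"

if __name__ == "__main__":
    print(redirect_for("resource.example.org", "/classifier/ics/Cambrian"))
```

Because the target service is a parameter, the domain owner can point the same concept URIs at a different service (another version or another host) without changing the URIs themselves.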

Outcome

Four distinct HTTP services:
  1. HTTP service(s) for the actual URIs used in the vocabulary
  2. SISSvoc API exposing the set of SISSvoc requests as LDA end-points
  3. SPARQL end-point for the vocabulary
  4. The vocabulary as a single RDF file from the ontology URI
There is a lot of proxying and redirection. This is because the public identifiers for the various services should be stable and persistent. Local deployments commonly include technology-specific elements in the path (e.g. "openrdf-sesame", "elda"), which are convenient and appropriate from the internal point of view but should be hidden behind persistent "cool" identifiers for external consumption.
-- SimonCox - 10 Feb 2012
Topic revision: r36 - 26 Mar 2013, JoshVote
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).