
CGI Identifiers


Introduction

Web links and identifiers

A web-based service-architecture depends on being able to link or otherwise make reference to resources. Identifiers for resources are, therefore, a primary requirement.

Resources of interest include:
  1. service end-points: http, ftp services, etc
  2. resources that define service behaviour: specifications, definitions, etc
  3. resources available from the end points: data objects, etc

URI's are the primary form of identifier used within the web context.

Links embedded in GML-encoded datasets must be expressed as URI's. Service end-points are usually identified by a network location, expressed as an IP address or URL. A variety of options are commonly used to identify content.

For interoperability of services operating under CGI governance arrangements, a URI scheme will be defined for identifying resources concerned with the system description (schemas, vocabularies, feature-types, etc). The URI scheme is optionally available to identify some other resources (e.g. data).

URN vs URI

URN

URN's are primarily suitable for identifying stable resources that are concerned with system definition, which are created and assigned manually.

The primary reasons for selecting URN are
  1. URN's are URI's, so a URN may appear as the value of an @xlink:href in a GML-conformant document
  2. URN's are persistent
  3. A URN may identify offline- as well as online-resources
  4. Using a URN does not imply immediate resolvability, which may be a Good Thing in some circumstances
  5. URN structure is "faceted". This enables more flexible rules for identifier governance than UUID's, Handles and DOI's, which have only one or two significant fields, or standard URL structure, which is hierarchical.
  6. URN's are composed of colon-delimited alphanumeric fields. This allows some explicit semantics to be visible, which may imply resource-type, ownership, even value. This is often useful during system development.
  7. OGC has adopted URN to identify artefacts in its service architecture

Special characters

The lexical constraints on URNs are described in RFC 2141 2.2 and 2.3. In particular the following must be hex-encoded:

space character
  (space)  %20
reserved characters
  #  %23
  %  %25
  /  %2F
  ?  %3F
excluded characters
  "  %22
  &  %26
  <  %3C
  >  %3E
  [  %5B
  \  %5C
  ]  %5D
  ^  %5E
  `  %60
  {  %7B
  |  %7C
  }  %7D
  ~  %7E
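
As an illustration of the hex-encoding rule, the following is a minimal Python sketch, not part of the CGI specification. The helper name encode_urn_field is hypothetical, and the sketch assumes ASCII input: any other character passes through unchanged, whereas a full implementation would first convert it to UTF-8.

```python
# Characters listed above as needing hex-encoding inside a URN:
# the space, the reserved characters and the excluded characters.
MUST_ENCODE = set(' #%/?"&<>[\\]^`{|}~')

def encode_urn_field(text: str) -> str:
    """Hex-encode the listed characters in one URN field (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in MUST_ENCODE:
            # Two upper-case hex digits, e.g. "/" becomes "%2F".
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)

print(encode_urn_field("text/html"))            # text%2Fhtml
print(encode_urn_field("Melbourne Formation"))  # Melbourne%20Formation
```

The first call reproduces the "text%2Fhtml" field used in one of the document examples on this page.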

When appearing as elements within URLs (e.g. as the argument to a resolver request) additional characters must be escaped according to the restrictions on URIs as described in RFC 3986:

encoder
  %  %25
gen-delims
  : (colon)  %3A
  /  %2F
  ?  %3F
  #  %23
  [  %5B
  ]  %5D
  @  %40
sub-delims
  !  %21
  $  %24
  &  %26
  ' (apostrophe)  %27
  (  %28
  )  %29
  *  %2A
  +  %2B
  , (comma)  %2C
  ; (semi-colon)  %3B
  =  %3D
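
A short Python sketch of this second level of escaping follows. The resolver address and the "uri" parameter name are placeholders invented for illustration; the request syntax of the actual CGI resolver is not assumed here. The standard-library function urllib.parse.quote with safe="" escapes every character listed above.

```python
from urllib.parse import quote

# Hypothetical resolver endpoint and parameter name, for illustration only.
RESOLVER = "http://resolver.example.org/resolve"

def resolver_request(urn: str) -> str:
    """Build a request URL carrying a URN as a query argument."""
    # safe="" makes quote() escape ":" and "/" as well; letters, digits
    # and "-", ".", "_", "~" are left as they are.
    return RESOLVER + "?uri=" + quote(urn, safe="")

print(resolver_request("urn:cgi:classifier:ICS:StratChart:2008:Silurian"))
# A URN that already contains a hex-encoded character has its "%" escaped
# again as "%25", so "text%2Fhtml" is sent as "text%252Fhtml".
print(resolver_request("urn:cgi:document:CGI:CGIIdentifierScheme:text%2Fhtml"))
```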

Resolution

A limitation of URN is that there is no on-line resolution service. In the context of CGI services this is acceptable since:
  • Most uses of URN references are as "classifiers" and are not expected to trigger a "get" operation. Rather, they will trigger a comparison operation where the identifier acts as a proxy for the resource ("Do I know this identifier?", "Is this identifier the same as that one?")
  • A CGIIdentifierResolver service shall be provided, as an interface to a register of CGI URN's.

See CGIResourceClassRegister#Resolution for further discussion of the limitations of the CGI resolver.
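
The comparison use described above needs no resolver at all. The sketch below shows one way it might be done in Python, following the lexical equivalence rules of RFC 2141 section 5: the "urn" prefix and the namespace identifier compare case-insensitively, hex digits in %-escapes compare case-insensitively, and everything else is case-sensitive. The function names and the contents of the "known" set are illustrative assumptions.

```python
import re

def normalize_urn(urn: str) -> str:
    """Normalise a URN so that equivalent URNs become identical strings."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    # Upper-case the hex digits of %-escapes; the rest stays case-sensitive.
    nss = re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).upper(), nss)
    return "urn:" + nid.lower() + ":" + nss

def same_urn(a: str, b: str) -> bool:
    """'Is this identifier the same as that one?'"""
    return normalize_urn(a) == normalize_urn(b)

# 'Do I know this identifier?' becomes a set lookup on normalised forms.
known = {normalize_urn("urn:cgi:classifier:ICS:StratChart:2008:Silurian")}

print(same_urn("URN:CGI:classifier:CGI:sandstone",
               "urn:cgi:classifier:CGI:sandstone"))   # True
print(normalize_urn("urn:cgi:classifier:CGI:sandstone") in known)  # False
```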

Identifiers and governance of resources

Requesting or assigning a persistent identifier implies either
  1. agreeing to provide a resource in a persistent and predictable manner, or
  2. asserting that a resource of interest is governed in an orderly manner by someone else.

Hence, assignment of names has implications for the governance of artefacts, and vice versa.

More information

For a general discussion of web identifiers, see ResourceIdentifiers.

For an outline of the OGC URN scheme, see OgcURNScheme.

For details of a (proposed) CGI URN resolver, see CGIIdentifierResolver.

The CGI URN scheme

Note that at the Rome IWG meetings (30 Aug to 3 Sept 2010) it was decided to change the URI convention to use http URI's, as described in CGI http URI scheme. Existing URN registries will be maintained, but the URN's will be mapped to http URI's. Dereferencing is to be implemented by the resource.geosciml.org host (for GeoSciML v3 services, 2011?). See meeting notes.

-- SteveRichard - 2010-09-12

CGI Namespace

RFC 5138 establishes the CGI URN namespace, which appears in the IANA register. The CGI URN provides for a set of identifiers of the form

urn:cgi:{CGIResource}:{ResourceSpecificString}

The value of the {CGIResource} element shall be taken from the register of Resource Classes - see CGIResourceClassRegister.

The structure of the {ResourceSpecificString} for each Resource Class is defined in the register of Resource Classes - see CGIResourceClassRegister.
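
To make the form concrete, here is a small Python sketch that splits a CGI URN into its {CGIResource} field and the colon-separated fields of the {ResourceSpecificString}. The type and function names are hypothetical. The sketch does not check the resource class against the register, and it does not interpret the remaining fields, since their structure is defined per resource class.

```python
from typing import List, NamedTuple

class CgiUrn(NamedTuple):
    resource_class: str          # the {CGIResource} field
    specific_fields: List[str]   # {ResourceSpecificString}, split on ":"

def parse_cgi_urn(urn: str) -> CgiUrn:
    """Split a CGI URN into its resource class and remaining fields."""
    parts = urn.split(":")
    # Expect at least urn:cgi:{CGIResource}:{something}
    if len(parts) < 4 or parts[0].lower() != "urn" or parts[1].lower() != "cgi":
        raise ValueError("not a CGI URN: " + urn)
    return CgiUrn(parts[2], parts[3:])

parsed = parse_cgi_urn("urn:cgi:featureType:CGI:GeoSciML:2.0:Contact")
print(parsed.resource_class)    # featureType
print(parsed.specific_fields)   # ['CGI', 'GeoSciML', '2.0', 'Contact']
```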

CGI Naming Authority - CGINA

The allocation of URN's in the CGI namespace is controlled by the CGI Naming Authority (CGINA). CGINA is directly responsible for
  1. maintaining the register of CGIResource classes, which includes
    • defining the structure of the ResourceSpecificString per resource type
    • delegation of authority for specific elements within the ResourceSpecificString
  2. assigning URN's for resources governed by CGI, in which the value of the {owner} field is fixed as "CGI".

Resources governed by CGI

Certain resources are provided by CGI in support of interoperable geoscience data and services. These shall be identified by a URN in the CGI namespace in which the value of the {owner} field is fixed as "CGI".

Resources provided by other organizations

Certain resources that are provided by other organizations require identifiers suitable for use in the context of CGI-conformant services. These include
  • terms and concepts in vocabularies or ontologies whose governance is delegated to that organization
  • data instances that require a persistent identifier to enable time-independent referencing and cross-referencing

These may be identified by a URN in the CGI namespace in which the value of the {owner} field is taken from the CGIAuthorityRegister and assigned according to the delegations defined for each resource class in the CGIResourceClassRegister.

Examples

urn:cgi:register:CGI:resourceClass
identifies the register of resource classes for which identifiers from the CGI scheme may be provided.
urn:cgi:register:CGI:register
identifies the register of registers owned by CGI.
urn:cgi:registerItem:CGI:resourceOwner:APAT
identifies the item "APAT" within the CGI register of resource owners.
urn:cgi:document:CGI:CGIIdentifierScheme
identifies the document that describes the CGI Identifier Scheme.
urn:cgi:document:CGI:CGIIdentifierScheme:text%2Fhtml
identifies the document that describes the CGI Identifier Scheme expressed in its HTML form.

urn:cgi:xmlns:CGI:GeoSciML:2.0
is the XML namespace for version 2.0 of GeoSciML which is owned by CGI.
urn:cgi:schema:GGIC:MineralOccurences:1.0:XMI
identifies the XMI representation of version 1.0 of the Mineral Occurrences information model owned by the Australasian Government Geologists Information Committee.
urn:cgi:featureType:CGI:GeoSciML:2.0:Contact
identifies the "Contact" feature-type from version 2.0 of the GeoSciML schema that is owned by CGI.
urn:ogc:serviceType:CGI:GSML-FS:1.0
identifies version 1.0 of the "GSML-FS" service-type owned by CGI.
urn:cgi:classifierScheme:ICS:StratChart:2008
identifies the Stratigraphic Chart published by the International Commission for Stratigraphy.
urn:cgi:classifier:ICS:StratChart:2008:Silurian
identifies the geologic period designated "Silurian" within the scheme designated "StratChart" owned by the International Commission for Stratigraphy.
urn:cgi:classifier:GSV:Stratindex:2007:MelbourneFormation
identifies the concept designated "MelbourneFormation" within the scheme designated "Stratindex" by Geoscience Victoria (Australia).
urn:cgi:classifier:CGI:sandstone
identifies the concept designated "sandstone" by CGI.
urn:cgi:feature:USGS:2feb49bc-6755-11dc-8314-0800200c9a66
identifies a feature instance designated "2feb49bc-6755-11dc-8314-0800200c9a66" by US Geological Survey.
urn:cgi:feature:CGI:EarthNaturalSurface
identifies the natural surface of the Earth. See EarthNaturalSurface.
urn:cgi:object:SGU:552cb080-6755-11dc-8314-0800200c9a66
identifies an object designated "552cb080-6755-11dc-8314-0800200c9a66" by Sveriges geologiska undersökning (Sweden).
urn:cgi:party:CSIRO:cox075
identifies a party designated "cox075" by CSIRO.

-- SimonCox - 20 Sep 2007/07 Nov 2007

Usage

Object and Feature identifiers

Attach an identifier to a GML-encoded feature or object with gml:identifier (GML 3.2 and later) or gml:name (GML 3.1 and earlier), following this pattern:

<gml:identifier codeSpace="http://www.cgi-iugs.org/uri">urn:cgi:feature:USGS:2feb49bc-6755-11dc-8314-0800200c9a66</gml:identifier>

<gml:name codeSpace="http://www.cgi-iugs.org/uri">urn:cgi:party:GA:LesleyWyborn</gml:name>

http://www.cgi-iugs.org/uri is the address of the CGI URN resolver, as specified in RFC 5138.
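
For anyone generating such elements programmatically, the following Python sketch builds a gml:identifier with the standard library. It assumes the GML 3.2 namespace URI http://www.opengis.net/gml/3.2; the helper name gml_identifier is hypothetical.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"    # GML 3.2 namespace (assumed version)
CODE_SPACE = "http://www.cgi-iugs.org/uri"   # resolver address given above

# Ask ElementTree to serialise the namespace with the usual "gml" prefix.
ET.register_namespace("gml", GML_NS)

def gml_identifier(urn: str) -> ET.Element:
    """Build a gml:identifier element carrying a CGI URN."""
    elem = ET.Element("{%s}identifier" % GML_NS, {"codeSpace": CODE_SPACE})
    elem.text = urn
    return elem

elem = gml_identifier("urn:cgi:feature:USGS:2feb49bc-6755-11dc-8314-0800200c9a66")
print(ET.tostring(elem, encoding="unicode"))
```

For GML 3.1 and earlier the same approach would apply with gml:name and the corresponding older namespace.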

Descriptive terms

The CgiValue scheme provides for a "term value" whose value is a scoped name. If a term has been assigned a URN in the CGI namespace, then it may be used within an XML encoded instance using the following pattern:

<gsml:CGI_TermValue qualifier="equalTo">
    <gsml:value codeSpace="http://www.cgi-iugs.org/uri">urn:cgi:classifier:ICS:StratChart:2008:Silurian</gsml:value>
</gsml:CGI_TermValue>

Again, CGI_TermValue/value/@codeSpace denotes the basic URN scheme. The URN itself reveals both the value and the classification scheme to which it belongs.
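
A Python sketch of how a client might recover the value and the scheme from such a term follows. Two things are assumptions made for illustration: the gsml prefix is bound to urn:cgi:xmlns:CGI:GeoSciML:2.0 (taken from the examples on this page), and the scheme URN is derived by following the pattern of the classifierScheme and classifier examples, which is not a stated rule of the scheme.

```python
import xml.etree.ElementTree as ET

GSML_NS = "urn:cgi:xmlns:CGI:GeoSciML:2.0"  # assumed binding of the gsml prefix

DOC = """
<gsml:CGI_TermValue xmlns:gsml="urn:cgi:xmlns:CGI:GeoSciML:2.0" qualifier="equalTo">
    <gsml:value codeSpace="http://www.cgi-iugs.org/uri">urn:cgi:classifier:ICS:StratChart:2008:Silurian</gsml:value>
</gsml:CGI_TermValue>
"""

def term_and_scheme(xml_text: str):
    """Return (term, scheme URN) for a CGI_TermValue holding a classifier URN."""
    root = ET.fromstring(xml_text.strip())
    urn = root.find("{%s}value" % GSML_NS).text.strip()
    # Assumption: the last field is the term; the fields before it
    # (owner, scheme name, version) identify the classifier scheme.
    *scheme_fields, term = urn.split(":")[3:]
    scheme_urn = "urn:cgi:classifierScheme:" + ":".join(scheme_fields)
    return term, scheme_urn

print(term_and_scheme(DOC))
# ('Silurian', 'urn:cgi:classifierScheme:ICS:StratChart:2008')
```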

-- SimonCox - 17 Oct 2007

This ought to work but doesn't: CGI URI Resolver
Topic revision: r57 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).