GML Identifiers Discussion board

URN vs. URL email discussion track

This part reports a discussion that took place via email about the use of http-URIs (aka URLs) versus URNs. The discussion was first centred on identifiers for concepts in ontologies.

Friday 20/11/2009

Another 'heads-up' - under pressure from various quarters, both INSPIRE and OGC are likely to switch from URNs to http-URIs (formerly known as URLs, and commonly referred to as just URIs by the semantic web mafia) for persistent identifiers.

The arguments pro and con are legion, but the battle is largely over.

The key thing to realise is that, even though moving to http URIs removes the resolver problem, it does not remove the governance issue. In fact it pushes back on the identifier-creator an implicit requirement to maintain a http service at the address used for as long as the identifier is expected to be valid. In the case of system specifications (schemas, feature-types, classifiers) this is effectively 'in perpetuity'.

So make sure that the identifier you choose cannot be hijacked. If you can't insulate yourself from the whims of some higher power (e.g. a government department; choosing a domain name that does not depend on this year's departmental acronym helps), make sure you get a rock-solid agreement from them that you can keep the domain alive forever.

SimonCox

Thanks for the heads up Simon. I too have felt the attraction of http as a quick fix for this problem...

How much retooling of our existing URNs is necessary? Can we just keep them and add an http://something.appropriate.xxx/ prefix?

Maybe CGI can set up a registry for identifiers. Can we make it permanent enough....? Is anyone exploring modifying DNS software for this kind of function?

SteveRichards

Saturday 21/11/2009

  • 1. strip urn:cgi:
  • 2. replace with http://xxx.yyy.zzz/
  • [3. cast to lower-case]
  • 4. substitute : with /

SimonCox
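Editorial sketch: the four steps above can be written as a small function. This is only an illustration of the recipe; the function name `urn_to_http_uri` and its parameters are made up here, `http://xxx.yyy.zzz/` is the placeholder host from the list, and lower-casing is left optional because step 3 is bracketed.

```python
def urn_to_http_uri(urn, base="http://xxx.yyy.zzz/", lowercase=False):
    """Rewrite a urn:cgi: identifier as an http-URI following the four steps.

    `base` is the placeholder host from the thread, not a real service.
    """
    prefix = "urn:cgi:"
    if not urn.lower().startswith(prefix):
        raise ValueError("not a urn:cgi: identifier: %r" % urn)
    rest = urn[len(prefix):]          # 1. strip urn:cgi:
    if lowercase:
        rest = rest.lower()           # 3. optionally cast to lower-case
    rest = rest.replace(":", "/")     # 4. substitute : with /
    return base + rest                # 2. replace the prefix with the http base


print(urn_to_http_uri("urn:cgi:classifier:GA:commodity:Au"))
```

With the default arguments the case of 'GA' and 'Au' is preserved; passing `lowercase=True` applies the optional step 3.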

Monday 23/11/2009

For a mapping table: urn owl:sameAs url. For software: it's essentially just Apache configuration.

Key point is para 3 sentence 1:

though moving to http URIs removes the resolver problem, it does not remove the governance issue.

SimonCox

Simon, thanks for the INSPIRE/OGC updates and your recommendations.

In case we want to generate mapping tables for the RDF/SKOS files we have already generated, I think we should use urn skos:exactMatch URL instead of owl:sameAs. Otherwise, for ontologies that are not 'in use' yet (should be fine for the AuScope and GeoSciML ones), it seems pretty easy to replace the URNs with URLs in the main file.

GuillaumeDuclaux

The semantics are different: skos:exactMatch says 'these two concepts defined in different vocabularies are describing the same thing'; owl:sameAs says 'these two URIs identify the same resource'.

For a mapping table, we want the second, I think.

There was a big thread on the SKOS mailing list recently, from which I extracted these nuggets.

I agree that it makes some sense to just replace the URIs in the triples, but you may want to add the owl:sameAs statements for traceability, particularly if any of the URN sets are already in use.

SimonCox
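Editorial sketch: a URN-to-URL mapping table of owl:sameAs statements could be emitted as N-Triples along these lines. The function name `same_as_triples` is invented for illustration, and the example target URL is one of the hypothetical mappings used elsewhere in this thread, not an agreed address.

```python
# Full IRI of the owl:sameAs property.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triples(mapping):
    """Return one N-Triples line per (URN, http-URI) pair, sorted by URN."""
    return [
        "<%s> <%s> <%s> ." % (urn, OWL_SAME_AS, url)
        for urn, url in sorted(mapping.items())
    ]


# Hypothetical mapping, for traceability of an already-published URN.
table = {
    "urn:cgi:classifier:GA:commodity:Au":
        "http://classifier.cgi-iugs.org/GA/commodity/Au",
}
for line in same_as_triples(table):
    print(line)
```

Swapping the predicate IRI for skos:exactMatch would give the alternative mapping Guillaume suggested; which one is appropriate is the semantic question discussed above.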

Tuesday 24/11/2009

Simon, Care to post your comments to the GeoSciML mailing list? I'm assuming the INSPIRE/OGC change to http-URIs will require GeoSciML to be modified, and probably invalidates all of Eric's work. Personally I'd rather see the use of URNs to identify the name of the object/feature and URLs to identify the location of the object/feature.

But the change is consistent with the Australian NCRIS use of 'handles'.

BruceSimons

Hi Gilly, I don't believe we can change until the GeoSciML community agrees to do so. At this stage this is the first I've heard that the OGC are moving this way. Eric currently has a discussion document debating URN vs URL that he has circulated to some of the GeoSciML team, so we (AuScope) have to wait for the GeoSciML Technical Task Group to make the decision.

My previous email to Simon was an attempt to get this raised to that level. Unfortunately this has all been happening on a variety of email threads instead of being discussed on TWiki.

BruceSimons

I took care of listing pros and cons, especially regarding URN vs URL debate. So more argument won't invalidate all my work if issues can be resolved.

I raised a couple of concerns regarding using URLs for identity, because it binds the resolution mechanism to the identity of the vocabulary in cases where the location of the vocabulary is not the main concern.

> In fact it pushes back on the identifier-creator an implicit requirement to maintain a http service at the address use for as long as the identifier is expected to be valid.

Oh, there's more to it: URLs do not have a rigid syntax.

Anyway, I was not part of the debate, maybe I'm just missing convincing arguments.

EricBoisvert

Hi Gilly (and Simon - this is for your info/edification)

I'm setting up the development EarthResourceML service using http-URIs as I promised (threatened?) this morning. It has been easy to set up the URIs for the GSV features (technically) but ...

One outstanding issue I have is URIs for the commodityName ScopedName (see the er:commodityName line in the sample er:Commodity below). I assume AuScope's implementation of the GEMET API on its vocab services means you have these. I'd be grateful if you could give them to me, or the rules for converting a URN to a http-URI.

By the end of tomorrow we should have a working WFS running and also have Apache configured to convert the URIs identifying features into WFS GetFeature requests that return a fragment (not the full WFS feature collection) containing the feature. Watch this space.

<er:Commodity gml:id="earthresourceml.1.1.commodity.370000.au" 
              xmlns:er="urn:cgi:xmlns:GGIC:EarthResource:1.1" xmlns:gml="http://www.opengis.net/gml" 
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" 
              xsi:schemaLocation="urn:cgi:xmlns:GGIC:EarthResource:1.1 http://www.earthresourceml.org/earthresourceml/1.1/xsd/mineralOccurrence.xsd"> 
        <gml:name codeSpace="http://www.cgi-iugs.org/uri">http://some.server/some/feature/id</gml:name> 
        <er:commodityName codeSpace="some:uri">http://some.server/some/concept/id</er:commodityName>
        <er:commodityImportance>major</er:commodityImportance> 
        <er:source xlink:href="http://some.server/some/feature/id"/> 
</er:Commodity> 

AlistairRitchie

Alistair,

I'm sitting with Jackie and we had a look at your URL request.

Would this work for you? You've got 2 options for retrieving concepts:

by label, e.g. Gold --> http://auscope-services-test.arrc.csiro.au/vocab-service/testGilly/getConceptByLabel?label=Gold

or by URI --> something like: http://auscope-services-test.arrc.csiro.au/vocab-service/testGilly/getConceptByURI?concept_uri=urn:cgi:classifier:GA:commodity:Au (This doesn't work yet but will very soon!)

In the second case, you should be able to pass a URL instead of a URN.

Jackie just needs to fix a tiny bug with special-character encoding in the GEMET API formatting, then you'll be able to query the commodity_vocab repo.

Hope that helps and that we answered your question...

GuillaumeDuclaux

Not quite. My understanding of the switch from URNs to http-URIs is that the concepts will need to be identified with http-URIs, so urn:cgi:classifier:GA:commodity:Au as a concept ID is out. The concept id, and references to the concept, will have the same value, say:

http://auscope-services.arrc.csiro.au/vocab-service/concept/commodity/au*

I thought maybe the GEMET RESTful interface would provide the structure for this URL. Making a few mental jumps too many. Sorry.

* Obviously there's a lot wrong with this URL as a persistent ID, but that's another discussion.

AlistairRitchie

The GEMET API is not the whole RESTful story. The bit that is missing (because it doesn't really need to be stated explicitly) is that if the concept identifier is a URL, then 'getConceptByID' is merely a http GET with the URL as the argument.

i.e. http://auscope-services-test.arrc.csiro.au/vocab-service/testGilly/getConceptByURI?concept_uri=urn:cgi:classifier:GA:commodity:Au

becomes just

http://classifier.cgi-iugs.org/GA/commodity/Au or http://www.cgi-iugs.org/classifier/GA/commodity/Au

(the URN->URL mappings are hypothetical - but note that it should still contain all the elements that were deemed necessary in the URN scheme, i.e. including 'GA').

SimonCox
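Editorial sketch of the point above: when the concept identifier is a URN it has to be passed to a lookup service as an argument, whereas an http-URI identifier is itself the address to GET. The function name `concept_request_url` is invented here; the service base is the test address quoted in this thread, and the `getConceptByURI`/`concept_uri` names are taken from the example URL rather than from any API documentation.

```python
from urllib.parse import urlencode


def concept_request_url(concept_id, service_base):
    """Return the URL a client would GET to retrieve a concept.

    An http(s) identifier is dereferenced directly; anything else
    (e.g. a URN) is passed as an argument to the lookup service.
    """
    if concept_id.startswith(("http://", "https://")):
        return concept_id
    query = urlencode({"concept_uri": concept_id})  # percent-encodes ':'
    return service_base.rstrip("/") + "/getConceptByURI?" + query


base = "http://auscope-services-test.arrc.csiro.au/vocab-service/testGilly"
print(concept_request_url("urn:cgi:classifier:GA:commodity:Au", base))
print(concept_request_url("http://classifier.cgi-iugs.org/GA/commodity/Au", base))
```

Note that `urlencode` percent-encodes the colons of the URN, so the first URL differs in form from the hand-written example above, although it carries the same argument.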

new version of the document (should I attach it to the twiki page ?)

I added one option (the "Salomon" option ?)

5.2.7 Resolver is explicitly referenced using a URL

This option is a mixture of 5.2.4 and a full URL to the resource. As far as the client is concerned, it is a full URL approach. The twist is that URLs are pointing to a resolver (or a limited set of resolvers). It pretty much solves the service artifacts and XPointer problems and the WFS gml:id versus URN issues, because the URN is used as an argument. If a community can agree on a resolver API syntax, this could also address some of the URL variability issues. Finally, it shields clients from API variations because only the resolver API is used.

Example:
<property codeSpace="http://ngwd-bdnes.cits.nrcan.gc.ca/service/gin/resolver?URN=urn:x-ngwd:vocab">Term</property>
Or
<property xlink:href="http://ngwd-bdnes.cits.nrcan.gc.ca/service/gin/resolver?URN=urn:x-ngwd:feature:WaterWell:6535"/>

EricBoisvert
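Editorial sketch of option 5.2.7: the client sees an ordinary URL, but the URN travels as the `URN` argument and can be recovered by the resolver. The resolver address and the `URN` parameter name come from the examples above; the function names are invented for illustration, and nothing here is claimed to match the actual resolver implementation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Resolver address as it appears in the examples above.
RESOLVER = "http://ngwd-bdnes.cits.nrcan.gc.ca/service/gin/resolver"


def resolver_url(urn, resolver=RESOLVER):
    """Wrap a URN in a resolver URL, keeping ':' readable as in the examples."""
    return resolver + "?" + urlencode({"URN": urn}, safe=":")


def urn_from_resolver_url(url):
    """Recover the URN argument from a resolver URL, or None if absent."""
    values = parse_qs(urlsplit(url).query).get("URN", [])
    return values[0] if values else None


url = resolver_url("urn:x-ngwd:feature:WaterWell:6535")
print(url)
print(urn_from_resolver_url(url))
```

Because the URN survives the round trip unchanged, the identity (the URN) and the access mechanism (the resolver URL) stay separable, which is the appeal of this option.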

I just want to check if I got this correctly.

When codeSpace is codeSpace="http://www.cgi-iugs.org/uri" and the property is gml:name or gml:identifier, this means that the value of the property is the unique identifier of the current object (see GeologicUnit in the example below).

But when it's any other property (gsml:value in the example below), is it a pointer to an object that is owned (governed) by OGC?

Example from: http://www.geosciml.org/geosciml/2.0/examples/TB3.1_UC3D_BGS_MappedInterval_Borehole.xml

<gsml:GeologicUnit> 
        <gml:name codeSpace="http://www.cgi-iugs.org/uri">urn:cgi:classifier:BGS:StratigraphicLexicon:MADE_GROUND</gml:name>

        <gsml:observationMethod> 
        <gsml:CGI_TermValue> 
                <gsml:value codeSpace="http://www.cgi-iugs.org/uri">urn:cgi:classifier:CGI:FeatureObservationMethod:2008:drill_core_observation</gsml:value>

        </gsml:CGI_TermValue> 
        </gsml:observationMethod> 
        
… 
</gsml:GeologicUnit> 

EricBoisvert

Wednesday 25/11/2009

Hi Eric,

I’m not sure that I understand your question. When you say “owned (governed) by OGC”, do you mean “owned (governed) by CGI” ?

As I understood it, codeSpace="http://www.cgi-iugs.org/uri" points to a (now broken) URL to a resolver service? I can’t find any record in the Quebec discussions (it says “Take this off-line and resolve through the Service Architect Task Group”) but I seem to remember that we preferred that the codespace should point to the vocabulary/lexicon/dictionary that the value comes from (as suggested in the GML spec - “where the value of the codeSpace attribute (if present) shall indicate a dictionary, thesaurus, classification scheme, authority, or pattern for the term”), rather than acting as a pointer to a URN resolver. The many examples of codespace provided in the GML spec are all URLs to dictionaries of terms.

In your examples, I think the codespace for GeologicUnit name should be a URI from the BGS (ie, the organisation delivering the unit) like http://www.bgs.gov.uk/vocabulary/stratigraphic_lexicon.html. The codespace for the ObservationMethod could be a URI pointing to either a registered CGI vocabulary or a BGS vocabulary, depending on what the BGS wanted to deliver.

OliverRaymond

> you mean “owned (governed) by CGI” ?

Yes, sorry, acronym poisoning again.

> As I understood it, codeSpace="http://www.cgi-iugs.org/uri" points to a (now broken) URL to a resolver service?

It's unclear from Simon's note. It's either "it is a unique identifier, which is by the way the URL of the resolver" or "codeSpace must contain the URL of the resolver". The first case is pretty much like namespaces: they often use URLs as identifiers, but these are not required to point to anything.

>but I seem to remember that we preferred that the codespace should point to the vocabulary/lexicon/dictionary that the value comes from (as suggested in the GML spec - “where the value of the codeSpace attribute (if present) shall indicate a dictionary, thesaurus, classification scheme, authority, or pattern for the term”), rather than acting as a pointer to a URN resolver. The many examples of codespace provided in the GML spec are all URLs to dictionaries of terms.

Well, this is the essence of the whole debate, isn't it?

> I think the codespace for GeologicUnit name should be a URI from the BGS

Are you sure? I was pretty convinced that codeSpace="http://www.cgi-iugs.org/uri" meant "this is a global unique ID" until we move to gml:identifier. It used to be "ietf:rfc:2141" or something like this.

EricBoisvert

From Alistair's example:

<gml:name codeSpace="http://www.cgi-iugs.org/uri">http://gsv-ws.dpi.vic.gov.au/dev/earthresourceml/1.1/commodity/370000/au</gml:name>

OK, now you lost me. Is this the way to encode the identity of the feature? (God.. so many "/"). Is this URL supposed to point to something? If yes, isn't it circular?

-- EricBoisvert - 2009-11-25

Essentially http://gsv-ws.dpi.vic.gov.au/dev/earthresourceml/1.1 is the 'resolver' URL and commodity/370000/au is the identifier. This is a first cut. I'm working towards http-URIs formatted as http://gsv-ws.dpi.vic.gov.au/dev/feature/gsv/... -- AlistairRitchie - 2009-11-26
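Editorial sketch of the split described above: the http-URI is read as a 'resolver' base followed by an identifier path. The function name `split_feature_uri` is invented for illustration, and the default base is simply the first-cut address from this post, which the author says is still changing.

```python
# First-cut 'resolver' base quoted in the post above; expected to change.
RESOLVER_BASE = "http://gsv-ws.dpi.vic.gov.au/dev/earthresourceml/1.1"


def split_feature_uri(uri, resolver_base=RESOLVER_BASE):
    """Split an http-URI into (resolver base, identifier path)."""
    base = resolver_base.rstrip("/")
    if not uri.startswith(base + "/"):
        raise ValueError("URI is not under resolver base: %r" % uri)
    identifier = uri[len(base) + 1:]
    return base, identifier


print(split_feature_uri(
    "http://gsv-ws.dpi.vic.gov.au/dev/earthresourceml/1.1/commodity/370000/au"
))
```

The split is purely a reading convention: nothing in the URI itself marks where the resolver part ends, which is part of what the question about circularity is getting at.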


EricBoisvert attached the following summary to his last email:

Why do you need a resolver for a feature that is already resolved ?

-- EricBoisvert - 2009-11-27
 
Topic attachments
Urn_Resolution.doc (346.0 K, 25 Nov 2009 - 09:04, GuillaumeDuclaux): Implementation of a URN resolution mechanism, provided by Eric Boisvert.
Topic revision: r5 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).