"Seegrid will be due for a migration to confluence on the 1st of August. Any update on or after the 1st of August will NOT be migrated"

SKOS Encoding for GSML Vocabularies

The GSML Controlled Concept model maps almost exactly onto SKOS. The only piece of the GSML model that does not have a direct SKOS equivalent is the prototype property. However, rdfs:isDefinedBy appears to fill the bill.
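As a rough illustration of that mapping, here is a minimal Python sketch. The `ControlledConcept` class and its field names are hypothetical simplifications for this example (not the actual GSML schema), the example.org URIs are placeholders, and the choice of skos:prefLabel and skos:definition as targets is an assumption; only the prototype → rdfs:isDefinedBy pairing comes from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass
class ControlledConcept:
    """Hypothetical, simplified stand-in for a GSML controlled concept."""
    uri: str
    preferred_name: str
    description: Optional[str] = None                     # free-text definition
    prototypes: List[str] = field(default_factory=list)   # links to defining resources


def to_skos_triples(concept: ControlledConcept, scheme_uri: str) -> List[Tuple[str, str, str]]:
    """Return (subject, predicate, object) tuples; literals and URIs are not distinguished."""
    triples = [
        (concept.uri, RDF + "type", SKOS + "Concept"),
        (concept.uri, SKOS + "inScheme", scheme_uri),
        (concept.uri, SKOS + "prefLabel", concept.preferred_name),
    ]
    if concept.description is not None:
        triples.append((concept.uri, SKOS + "definition", concept.description))
    # No direct SKOS equivalent for prototype, so rdfs:isDefinedBy is used instead.
    for prototype in concept.prototypes:
        triples.append((concept.uri, RDFS + "isDefinedBy", prototype))
    return triples


if __name__ == "__main__":
    granite = ControlledConcept(
        uri="http://example.org/lithology/granite",
        preferred_name="granite",
        description="placeholder definition text",
        prototypes=["http://example.org/definitions/granite"],
    )
    for triple in to_skos_triples(granite, "http://example.org/lithology"):
        print(triple)
```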

I've written a little XSLT script to convert gsml:GeologicVocabulary instances into SKOS. You can compare this GSML Geologic Vocabulary with this SKOS Concept Scheme to see the differences between the gsml:GeologicVocabulary and SKOS encoding of the lithology category vocabulary. (-- SteveRichard - 2009-11-28 The SKOS vocabulary here was actually generated directly from the MS Excel version and has been manually edited to update parent-child links and annotation)

You can also load the latter into Protege to see the concept hierarchy.

-- SimonCox - 28 Oct 2008 -- SimonCox - 06 May 2009

How SKOS implements GeoSciML GeologicVocabulary:

As Simon has pointed out, the GSML Controlled Concept model maps almost exactly onto SKOS. The [[https://www.seegrid.csiro.au/subversion/CGI_CDTGVocabulary/trunk/Documents/Use_SKOS_encodingVocabularies.doc][linked MS Word document]] is a draft proposal with detailed recommendations on how to use SKOS for vocabulary encoding in a way that is compatible with existing tools, existing standards (BS8723, NISO Z39-19-2005; I haven't sprung the $$ for ISO2788 or ISO5964), and related activities (GEMET, AGROVOC...).

As a starting point, I have built a UML representation of the SKOS conceptual model based on reading of the SKOS Primer and SKOS Reference documents. Views of this model are at UML for SKOS core model, and included in an Enterprise Architect project (SKOS.EAP) file that is in the Subversion repository (also attached to this page if you don't have access to the repository).

The EA project file also includes UML for the GeoSciML GeologicVocabulary and the BS8723 Thesaurus model imported from BS8723 model.xmi. The XMI did not import completely, so the result differed from the JPG version of the UML shown on the web page; the UML in the EA model has been fixed up to match the JPG image of the model. Note that getting a copy of the BS8723 text is going to cost you 168.00 (unless someone knows a better way to get it). Again, I don't have that kind of budget, so I haven't been able to read the text.

Note that BS8723 and NISO Z39-19-2005 are being used as input into a new ISO project, ISO25964, to update ISO2788 and ISO5964. The current draft of the first part of the new ISO standard (ISO 25964-1 concerning Thesauri and Interoperability with other Vocabularies) is available for comment at http://drafts.bsigroup.com/.

The only significant divergence I see at this point is the BS8723 oddity of associating definitions with terms (a LexicalValue associated with a concept). Thus, to get from the concept object to a definition, one navigates through an associated term. As modeled in the UML, this implies that a concept can have multiple definitions that are bound to the different terms used to label the concept. GeoSciML and SKOS bind definitions directly to the concept. SKOS allows multiple definitions associated with a concept. GeoSciML allows 0..1 gml:description, which functions as a free-text definition, and 0..* prototypes, which are links to resources that provide a definition for the concept.
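The difference in where definitions hang can be sketched in Python. All class and attribute names below are hypothetical simplifications of the UML described above (I am working from that description, not from the standards' text), and treating the first BS8723 term as the preferred label is purely an assumption for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Bs8723Term:
    lexical_value: str
    definitions: List[str] = field(default_factory=list)  # definitions hang off the term


@dataclass
class Bs8723Concept:
    terms: List[Bs8723Term]

    def definitions(self) -> List[str]:
        # Reaching a definition means navigating through each associated term.
        return [d for term in self.terms for d in term.definitions]


@dataclass
class SkosConcept:
    pref_label: str
    definitions: List[str] = field(default_factory=list)  # 0..*, bound to the concept


@dataclass
class GsmlConcept:
    name: str
    description: Optional[str] = None                     # 0..1 free-text definition
    prototypes: List[str] = field(default_factory=list)   # 0..* links to defining resources


def bs8723_to_skos(concept: Bs8723Concept) -> SkosConcept:
    """Flatten term-bound definitions onto the concept; the term binding is lost."""
    return SkosConcept(concept.terms[0].lexical_value, concept.definitions())


def skos_to_gsml(concept: SkosConcept) -> Tuple[GsmlConcept, List[str]]:
    """Only one definition fits in the 0..1 description; return the rest as leftovers."""
    first = concept.definitions[0] if concept.definitions else None
    return GsmlConcept(concept.pref_label, first), concept.definitions[1:]


if __name__ == "__main__":
    source = Bs8723Concept([
        Bs8723Term("granite", ["definition bound to the first term"]),
        Bs8723Term("granitic rock", ["definition bound to the second term"]),
    ])
    skos = bs8723_to_skos(source)
    gsml, leftovers = skos_to_gsml(skos)
    print(skos.definitions)
    print(gsml.description, leftovers)
```

The point of the sketch is only the cardinality: two term-bound definitions become two concept-level definitions in the SKOS-like class, and only one of them fits the single description slot in the GeoSciML-like class.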

A draft document describing the use of SKOS to encode vocabularies is in the subversion repository at Use of SKOS for encoding GeoSciML vocabularies (MS Word 2003). Please post any discussion at SKOSVocabularyDiscussion.

-- SteveRichard - 2009-11-28

Discussion

Steve - this is awesome - perhaps should be shared with the SKOS mailing list. But I don't think we necessarily need it even for linking into the GeoSciML concept model. As far as I am concerned, SKOS provides an implementation that satisfies all (?) of the requirements of the GeoSciML controlled concept model. We still have the latter as part of our model, but we mark it somehow so that we know we don't generate a GML-style application schema representation. -- SimonCox - 2009-11-12

Very nice work Steve! Shall we add to the UML representation the Dublin Core properties used for describing the ConceptScheme (e.g. dc:date, dc:creator)? -- GuillaumeDuclaux - 2009-11-12

I threw it up as a starting point. My plan is to use this as a springboard to demonstrate how it implements the GeoSciML GeologicVocabulary model. Protege brings in all the Dublin Core properties, as well as RDF and OWL document annotation. I have been working through recommendations on how those would be used for our vocabularies; they duplicate some of the SKOS properties, and in some cases don't make sense in this context. More to come, but I'll be on vacation the next couple of days.

-- SteveRichard - 2009-11-12

I really don't like the way Protege handles SKOS. It randomly converts some skos:Concepts to owl:Things. When this random transformation occurs, Protege seems to lose some SKOS information, such as the skos:inScheme for the related concept. This cannot be undone if you maintain the vocab with Protege. Until we find a SKOS-native tool to maintain the vocab, I'd recommend we maintain the Excel spreadsheets only, and convert the files to SKOS as we update them. Regarding the Dublin Core properties, I'd really like to read the summary of your investigations! Something we discussed earlier this week in Perth is the possibility of adding the dc:contributor field to each skos:ConceptScheme, so individuals and organisations could be acknowledged for their work. -- GuillaumeDuclaux - 2009-11-12
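A spreadsheet-first workflow of the kind suggested here could look roughly like the following Python sketch. It assumes the spreadsheet is exported to CSV with hypothetical columns `id`, `label` and `parent`; the base URI, the scheme name and the contributor name are placeholders, and none of this reflects the layout of the actual vocabulary spreadsheets.

```python
import csv
import io

BASE = "http://example.org/lithology/"  # placeholder namespace for the example

# Hypothetical CSV export of a vocabulary spreadsheet.
SHEET = """id,label,parent
igneous_rock,igneous rock,
granite,granite,igneous_rock
"""


def sheet_to_turtle(csv_text, scheme="scheme", contributors=()):
    """Emit one Turtle statement per line; label text is not escaped in this sketch."""
    out = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "@prefix dc: <http://purl.org/dc/elements/1.1/> .",
        f"@prefix : <{BASE}> .",
        "",
        f":{scheme} a skos:ConceptScheme .",
    ]
    # Acknowledge individuals and organisations on the scheme itself.
    for name in contributors:
        out.append(f':{scheme} dc:contributor "{name}" .')
    for row in csv.DictReader(io.StringIO(csv_text)):
        cid = row["id"]
        out.append(f":{cid} a skos:Concept .")
        # Written for every concept, so the scheme link is regenerated on each conversion.
        out.append(f":{cid} skos:inScheme :{scheme} .")
        out.append(f':{cid} skos:prefLabel "{row["label"]}"@en .')
        if row["parent"]:
            out.append(f":{cid} skos:broader :{row['parent']} .")
    return "\n".join(out)


if __name__ == "__main__":
    print(sheet_to_turtle(SHEET, contributors=["Example Contributor"]))
```

Because the SKOS file is regenerated from the spreadsheet each time, nothing is ever edited in the RDF directly, which sidesteps the information loss described above.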

Else use TopBraid for maintenance - I understand it doesn't lose stuff like Protege does. However, I've now come around to the idea that changing Concept → Thing(type=Concept) is not harmful (the semantics are not changed), but losing relationships clearly is (the semantics are changed).

BTW - could you generate a simple test case that demonstrates where Protege screws up? - I'm sure Simon Jupp would be very interested to see it and would probably try to fix it.

-- SimonCox - 2009-11-13

Like Simon, after an interchange with some of the Protege and SKOS group people, I also came around to agree that, as far as RDF is concerned, the different encoding of skos:Concept (Concept → Thing(type=Concept)) doesn't change the semantics. It does mean that we can't use XSLT to manipulate the RDF.

-- SteveRichard - 2009-11-13
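To make the Concept → Thing(type=Concept) point concrete, here is a small Python sketch using only the standard library. The two RDF/XML snippets are invented for illustration (they are not actual Protege output), and the example.org URI is a placeholder. The first function reads rdf:type from both encodings; the second mimics a template that matches on the element name skos:Concept, which is roughly what a simple XSLT would do.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
OWL = "http://www.w3.org/2002/07/owl#"
URI = "http://example.org/lithology/granite"  # placeholder concept URI

# Encoding 1: the concept is written as a typed node element.
TYPED_NODE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="{URI}">
    <skos:prefLabel xml:lang="en">granite</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""

# Encoding 2: the same resource as an owl:Thing with an explicit rdf:type.
AS_OWL_THING = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}" xmlns:owl="{OWL}">
  <owl:Thing rdf:about="{URI}">
    <rdf:type rdf:resource="{SKOS}Concept"/>
    <skos:prefLabel xml:lang="en">granite</skos:prefLabel>
  </owl:Thing>
</rdf:RDF>"""


def rdf_types(xml_text):
    """Return {subject URI: set of rdf:type URIs} for the top-level node elements."""
    result = {}
    for node in ET.fromstring(xml_text):
        types = result.setdefault(node.get(f"{{{RDF}}}about"), set())
        # A typed node element states an rdf:type through its element name.
        if node.tag != f"{{{RDF}}}Description":
            namespace, local = node.tag[1:].split("}", 1)
            types.add(namespace + local)
        # Explicit rdf:type children add further types.
        for child in node.findall(f"{{{RDF}}}type"):
            types.add(child.get(f"{{{RDF}}}resource"))
    return result


def match_by_element_name(xml_text):
    """Mimic a template that only matches elements named skos:Concept."""
    root = ET.fromstring(xml_text)
    return [n.get(f"{{{RDF}}}about") for n in root.findall(f"{{{SKOS}}}Concept")]


if __name__ == "__main__":
    for doc in (TYPED_NODE, AS_OWL_THING):
        print(SKOS + "Concept" in rdf_types(doc)[URI], match_by_element_name(doc))
```

In both encodings the resource is typed as skos:Concept once the RDF is interpreted, but the element-name match finds it only in the first, which is one way to see why processing at the XML level is fragile here.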

The scope notes on the DublinCoreAnnotation class in the EAP file are the best I've got for now. I'll post the EAP file as an interim solution. The first question with dc is how to use contributor vs. creator.... In the vocabs as they exist now, the source information is currently in historyNote; the source information has to be introduced at the level of each concept. What I'd like to work out with everyone is some 'best practices' on which elements to use for what in the inherited rdfs and owl elements, and any appropriate Dublin Core elements, so we have a standard, documented approach to encoding the vocabulary, the source information, and links to related defining resources (OWL ontologies, photos, text documents....)
  • SKOS.EAP: UML conceptual model for SKOS , Enterprise Architect project file
-- SteveRichard - 2009-11-12

I like the idea of a standard, documented approach to encoding vocabs! Thanks for the EA file. -- GuillaumeDuclaux - 2009-11-12

Topic attachments
| Attachment | Size | Date | Who | Comment |
| SKOS.EAP | 2498.0 K | 28 Nov 2009 - 07:08 | SteveRichard | UML conceptual model for SKOS, BS8723, GeoSciML GeologicVocabulary; Enterprise Architect project file |
| SKOS_conceptual__model.png | 118.6 K | 11 Nov 2009 - 22:58 | SteveRichard | UML conceptual model for SKOS |
Topic revision: r16 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).