
Preliminary GeoSciML1 Schema and Instance Documents

Methodology

The mapping from UML models to GML is described in SchemaFormalization and GmlImplementation. A detailed procedure for generating a GML-compliant XML schema is summarized in HollowWorld and OandMCookbook.

See also the paper by Boisvert et al. from the USGS DMT 2004 workshop GML Encoding of NADM C1.

Feature Types

See repository: https://www.seegrid.csiro.au/subversion/xmml/GeoSciML/trunk/schema/ or XmmlSchemaRepository:GeoSciML/trunk/schema/ or XmmlSVN:GeoSciML/trunk/schema/

Schema usage patterns

Names and identifiers

Re: Names and identifiers, including "unit code": these are all essentially minor variations on the same kind of thing. The standard place for these is gml:name.

Note that any GML Object or Feature may have an unlimited number of gml:name properties, reflecting the fact that the same object often has different identifiers assigned by different authorities. To assert "this is the name or identifier assigned by authority XYZ corporation", use the codeSpace attribute on gml:name (i.e. the scope identifier). Without a codeSpace, the value is implicitly under the authority of the organization or service supplying the document.

See LabelsAndHandles and also GeologicVocabulary and OgcURNScheme.
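
As a rough illustration of the codeSpace convention described above, the fragment below gives one unit three names. The gsml:GeologicUnit element, the gml:id, the name values and the authority URIs are all invented for this sketch and are not taken from the schema or from any real vocabulary.

```xml
<!-- Sketch only: element choice, identifiers and authority URIs are made up. -->
<gsml:GeologicUnit gml:id="unit-1"
    xmlns:gsml="http://www.cgi-iugs.org/xml/GeoSciML/1"
    xmlns:gml="http://www.opengis.net/gml">
  <!-- No codeSpace: implicitly under the authority of the document supplier -->
  <gml:name>Example Sandstone</gml:name>
  <!-- Identifier assigned by a hypothetical authority, "XYZ corporation" -->
  <gml:name codeSpace="http://www.xyz.example/units">XYZ-0042</gml:name>
  <!-- A "unit code" from a second hypothetical authority -->
  <gml:name codeSpace="http://survey.example.org/unit-codes">Kes</gml:name>
</gsml:GeologicUnit>
```

A consumer that only trusts one authority can then pick the gml:name whose codeSpace matches it and ignore the rest.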

-- SimonCox - 25/29 Nov 2005

Optional vs. unknown property values

In some earlier discussion, BruceJohnson wrote:

  1. Made all properties of GeologicObject optional. Because I have added VocabRelation as a subclass of GeologicObject, having a required 'purpose' property no longer seemed to make sense.
  2. Made the EarthMaterial property, 'color' optional. One might not always know the color, or it might not be definitive.
  3. Made most CompoundMaterial properties optional, for the same reasons. I'm working on creating a vocabulary instance document, which is a bit different from doing specific sample descriptions. I think we need to be a bit looser in our required properties than we were thinking at our last meeting.

-- BruceJohnson - 13 Sep 2005

I have reset the cardinalities of the properties of EarthMaterial and CompoundMaterial to match the UML. I'm sorry I haven't checked validity of all sample instances (ran out of time before I had to go get my daughter from daycare). But we should resist the impulse to make properties optional when they are conceptually required. If the values for these properties are unknown, then that should be indicated explicitly, instead of just omitting the property altogether.

Depending on the content-type, this should be done as follows:

Properties modelled as associations with objects-with-identity

(i.e. the target class is derived from GML Feature or Object)

Use a "nil" URN as the value of the xlink:href attribute (see OgcURNScheme) e.g.

<classification xlink:href="urn:x-ogc:def:nil:OGC:unknown"/>

Properties with simpleContent values (ScopedName or Measure)

If the value is a ScopedName, use one of the OGC nils as the value: e.g.

<role codeSpace="http://www.opengeospatial.org">urn:x-ogc:def:nil:OGC:unknown</role>

For properties whose value is a simple Measure, we shall use CGI_MeasureType (from CGI_basicTypes.xsd) with the appropriate value for the relativeMeasure attribute: e.g.

<length relativeMeasure="nil:withheld" uom="m">0.0</length>

Properties with CgiValue values

Set the "qualifier" attribute to an appropriate nil value: e.g.

<genesis>
<CGI_TermValue>
<qualifier>nil:unknown</qualifier>
<value codeSpace="http://www.cgi-iugs.org"></value>
</CGI_TermValue>
</genesis>

<porosity>
<CGI_NumericValue>
<qualifier>nil:missing</qualifier>
<principalValue uom="%">0</principalValue>
<plusDelta uom="%">0</plusDelta>
<minusDelta uom="%">0</minusDelta>
</CGI_NumericValue>
</porosity>

-- SimonCox - 29 Nov 2005/21 Dec 2005

The above examples fall into two categories: those for which it is possible to use an explicit null value (such as the OGC-defined ones, which allow distinguishing e.g. an unknown value from a withheld value), and the numeric ones, which use the "value is something greater than 0" technique. The latter seems unsatisfactory for two reasons. First, surely some measures (e.g. temperature) could be negative as well as positive, and we would have to pick some different relative specification (e.g. greater than -273 degrees Celsius), which would depend on the particular measure. Second, we don't have the ability to specify a specific null reason; the user just has to deduce that, because the range given allows any possible/realistic value, it is not being specified for some reason.

-- MarcusSen - 29 Nov 2005

Yep - any suggestions? Note that an important goal is not to confuse "optional" with "nillable". For all the properties with variations on textual values, the solutions seem OK. The problem seems to arise with measures. Perhaps we should add the nil-reason values (inapplicable|unknown|missing|withheld) to
  1. the relativeMeasure enumeration
  2. the CgiValue qualifier enumeration

Would that work for people?

-- SimonCox - 20 Dec 2005

Schema Design and Factoring Issues

Namespace and packaging

Important reference - John Herring's discussion of the interactions between namespaces, versions, schemaLocation, etc.: http://portal.opengeospatial.org/files/?artifact_id=12592&version=1

  • GeoSciML version 1.x will use one primary namespace: http://www.cgi-iugs.org/xml/GeoSciML/1
    • the versioning strategy is consistent with practice described in OGC 05-062r3
    • Concerning possible future upgrades, the following rule applies:
      • 2. Each minor version of any such schema that retains the namespace of the predecessor shall not introduce any new XML types or elements that could not be safely ignored by existing application based on the previous minor version, insuring a strong form of backward compatibility.
    • components from other namespaces (e.g. http://www.opengis.net/om) may also constitute a "canonical" part of GeoSciML but will be incorporated using the WXS import mechanism and will thus retain their own namespace names
    • there is only one GeoSciML schema - this is it! This is the schema that will be the basis of TestBed2 (though maybe not all of it will be exercised in the testbed)
  • the schema is factored amongst several schema documents, corresponding to packages in the model
  • the schema document location (path) includes the complete version number - initially 1.0.0, moving to 1.0.x for bug-fix releases, and 1.1.x for extensions that do not change the scope of the schema, and thus deserve the same namespace name
  • the schema documents are hosted in the GeoSciML publish/build repository which is here: TBC. Note: this is not the same as the developers version repository, here: XmmlSVN:GeoSciML/trunk/schema/
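
The rules above might look like this at the top of an instance document. This is a sketch under stated assumptions: the schemas.example.org host and the geosciml.xsd file name are placeholders (the publish repository location is still to be confirmed), and the namespace prefixes are a free choice. Only the two namespace names for GeoSciML and O&M come from the list above.

```xml
<!-- Sketch only: the schema host and file name are placeholders. -->
<gsml:Gsml
    xmlns:gsml="http://www.cgi-iugs.org/xml/GeoSciML/1"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:om="http://www.opengis.net/om"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.cgi-iugs.org/xml/GeoSciML/1
                        http://schemas.example.org/GeoSciML/1.0.0/geosciml.xsd">
  <!-- Collection members would go here -->
</gsml:Gsml>
```

Under this scheme a bug-fix release would change only the path segment (1.0.0 to 1.0.1, say) in the schemaLocation hint, while the namespace name, and therefore every element name in the document, stays the same.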

Complete example here.

-- SimonCox - 06 Oct 2005 -- SimonCox - 16 Jan 2006 -- SimonCox - 07 Sep 2006

Collections

The generic collection element gsml:Gsml is available for packaging objects from the GeoSciML schema. It is described on GeoSciMLCollection.

-- SimonCox - 06 Oct 2005

Instance Document Examples

I've added three more instance document fragments today. They all show how a GeologicUnit description would look in the XML file:

  • XmmlSVN:GeoSciML/trunk/instances/geologicUnit-0.xml: This file contains three LithostratigraphicUnit descriptions. The first has no composition information, the second includes composition information by reference to a CompoundMaterial element in an external file, the third includes in-line composition information.

  • XmmlSVN:GeoSciML/trunk/instances/geologicUnit-1.xml: This file contains the same three LithostratigraphicUnit descriptions as the previous file, but I've added the information about the relationships between the three units using the partOf attribute.

  • XmmlSVN:GeoSciML/trunk/instances/geologicUnit-2.xml: This file contains the first of the three LithostratigraphicUnit descriptions used in the previous files. I've shown how one could include the shape information in-line with the unit description by including one polygon. I doubt that one would want to do this when transmitting an entire map or database; one would probably put the shape information in a separate file and use the classification attribute to refer to an external unit description. But, this form might be useful for sending a single polygon from a Feature Server.
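
The "separate file" arrangement mentioned in the last item could be sketched roughly as follows. The gml:id values, the fragment identifier after the #, and the use of the gsml prefix on the classification property are assumptions made for illustration; the instance documents in the repository are the place to check the real structure.

```xml
<!-- Sketch only: identifiers and the href target are hypothetical. -->
<gsml:MappedFeature gml:id="mf-1"
    xmlns:gsml="http://www.cgi-iugs.org/xml/GeoSciML/1"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- The unit description lives in another document; only a pointer is sent -->
  <gsml:classification xlink:href="geologicUnit-0.xml#unit-1"/>
  <!-- The geometry (shape) property is omitted from this sketch -->
</gsml:MappedFeature>
```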

If you look at these documents, you'll see that the signal-to-noise ratio is very low; there's a lot of infrastructure for very little actual data. The files are approximately twice the size of the NADM files with the same data. I suppose that's at least in part because we are being more rigorous in defining data types, but it still bothers me.

To make these instance documents validate, I had to make a few changes to the schema. You'll see those changes outlined under the Schema Issues header above. A new version of the schema is here.

-- BruceJohnson - 29 Sep 2005

Two more instance documents were loaded on Friday. The first, XmmlSVN:GeoSciML/trunk/instances/geologicStructure-1.xml, is a fragment containing two structures, a contact and a fault. The fault shows how in-line spatial data could be included. This might be how a simple fault with a single line segment would appear as a WFS feature. The contact shows how external spatial data could be referenced, in this case 4 line segments which are in a separate file. The second document, XmmlSVN:GeoSciML/trunk/instances/contact-1.xml, contains the spatial data for the 4 contact line segments referred to in the first document. Again, these are just fragments of instance documents that show how the instance document might be formatted. I am using them to check that the GeoSciML 1.0 testbed schema actually works as we intended. No changes to the schema were required for this set of instance documents.

-- BruceJohnson - 03 Oct 2005

All instance documents and the current version of the 1.0 schema have been moved to Subversion. I believe I have changed all of the appropriate links on this page. Please use Subversion for future modifications to any of these documents.

In converting to Simon's refactored version of the schema, I also changed the mechanism for specifying spatial data to use gml:posList. I had originally used the gml:coordinates property, which is deprecated in the current GML. Simon used gml:pos in his version of the instance documents. I switched to gml:posList because it's a bit more compact (and I also wanted to see how it worked).
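
For comparison, here is the same three-point line written with each of the three GML encodings mentioned above. The wrapper element and the coordinate values are invented for this sketch, and srsName is left out for brevity.

```xml
<!-- Sketch only: the wrapper element and coordinate values are made up. -->
<encodings xmlns:gml="http://www.opengis.net/gml">
  <!-- gml:coordinates (deprecated): comma inside a tuple, space between tuples -->
  <gml:LineString>
    <gml:coordinates>145.10,-37.80 145.20,-37.85 145.30,-37.90</gml:coordinates>
  </gml:LineString>
  <!-- gml:pos: one element per position -->
  <gml:LineString>
    <gml:pos>145.10 -37.80</gml:pos>
    <gml:pos>145.20 -37.85</gml:pos>
    <gml:pos>145.30 -37.90</gml:pos>
  </gml:LineString>
  <!-- gml:posList: all positions in a single whitespace-separated list -->
  <gml:LineString>
    <gml:posList>145.10 -37.80 145.20 -37.85 145.30 -37.90</gml:posList>
  </gml:LineString>
</encodings>
```

The posList form carries the same numbers as the pos form with one element instead of one per vertex, which is where the compactness comes from on long lines and polygon rings.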

-- BruceJohnson - 05 Oct 2005

I added two more instance documents. These are copies of the previous two GeologicStructure documents showing a slightly different way of putting the spatial data in an external file. In this case, instead of putting the MappedFeature and its attributes in an external file, I kept the MappedFeature and its attributes in the main file and only put the actual spatial data (in this case gml:LineString) in the external file. The new instance documents are:
  • XmmlSVN:GeoSciML/trunk/instances/geologicStructure-2.xml
  • XmmlSVN:GeoSciML/trunk/instances/contact-2.xml

-- BruceJohnson - 05 Oct 2005

Yet another pair of instance documents added to the set today. These are a geologic unit example and an earth material example using 'real data' provided by GSV. I've separated the geologic units and their corresponding earth material components into different documents, using xlink:href to link the two. They could just as well have been included in a single document. The new instance documents are:
  • XmmlSVN:GeoSciML/trunk/instances/GSV-geo-unit-1.xml
  • XmmlSVN:GeoSciML/trunk/instances/GSV-geo-mat-1.xml


Examples now in SVN XmmlSVN:GeoSciML/trunk/instances

Topic revision: r46 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).