"Seegrid will be due for a migration to confluence on the 1st of August. Any update on or after the 1st of August will NOT be migrated"

Designing a schema for Observations


Modelling

General principles

An Application Schema is specified in UML following the profile specified in ISO 19103 and ISO 19136 (OGC 05-108r1), and described here in SchemaFormalization#ISO_TC_211_Profile_of_UML. This allows classes (such as "Feature Types") to be specialized by the addition of named properties (attributes and navigable association-ends - see SchemaFormalization#GML_Profile_of_UML) of any type available in the modelling environment. In principle, this means any class visible in the modelling IDE.

For representation of Feature Types (from the ISO TC211 metamodel) that will be serialized using GML, modelling requires specialisation of classes derived from a generic Abstract Feature Type. Specialization involves adding (typically) geometry and other property types drawn from the ISO model (i.e. from the pre-defined GML component library).

Of course, starting with the premise that "everything is an object" is not very helpful, nor will it promote the use of common models that will underpin practical interoperability. This cookbook is an aid to working with some of the key patterns in spatio-temporal data models, with practical examples of how to use some pre-defined component libraries.

The HollowWorld template includes a variety of components pre-loaded, including a substantial subset of the ISO 19100 "Harmonized Model", making all of these components nominally available. The Harmonized Model was included in the template to maintain maximum consistency with the standard modelling methodology specified in ISO 19103 and ISO 19109.

Components available for automatic schema generation

In practice, however, in order to permit automatic schema generation using the ShapeChange tool, the types that are available are limited to:
  1. classes defined in the Application Schema(s) or Package(s) currently under development
  2. interface classes, with pre-defined XML bindings, as indicated in the ShapeChange Configuration document.

Thus, every "component library" (currently) needs to identify the specific components that may be referenced by inclusion or specialisation in order to include the components within the resulting model. The mechanism of importing these "interface classes" into the UML->GML conversion tool (ShapeChange) is still somewhat onerous, and currently under review. A big advantage of the HollowWorld approach is that these issues have been dealt with in advance.

Standard interface classes are the global types and elements defined in GML, which provide XML implementations of classes from the ISO 19100 series, as described in Annex D of ISO 19136 (OGC 05-108r1).

For O&M purposes, additional interface classes are contributed from the following package/namespaces.
  • sweBase/swe (OGC Sensor Web Enablement draft specifications),
  • O&M/om (current scope of GML Observation and Measurements worked into ISO style package),
  • geometryExtensions/geo (Extensions to GML geometry model)
  • sampling/sa (Sampling regime artefacts, possibly to be merged as a module of O&M)

The ShapeChange Configuration document provides the definitive set of interface classes. Application schema designers should restrict their use of external components to entries from this list as follows:
  • classes named as the value of the "type" attribute in elements labelled BaseMapEntry may be used as base classes (parents) in derivation of new classes by specialization
  • classes named as the value of the "type" attribute in elements labelled ElementMapEntry may be used as the target of navigable associations in new classes derived by specialization
  • classes named as the value of the "type" attribute in elements labelled TypeMapEntry may be used as the type of attributes in new classes derived by specialization
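
The three rules above can be checked mechanically. The following is a minimal Python sketch, assuming a configuration fragment in which each map entry is an XML element carrying a "type" attribute. Only the element names BaseMapEntry, ElementMapEntry and TypeMapEntry and the "type" attribute come from the text above; the sample fragment, its root element and the function name permitted_uses are illustrative assumptions, not the actual ShapeChange configuration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the root element and the listed classes are
# illustrative only and are not copied from a real configuration document.
SAMPLE_CONFIG = """
<MapEntries>
  <BaseMapEntry type="AbstractFeature"/>
  <ElementMapEntry type="GM_Point"/>
  <TypeMapEntry type="Measure"/>
  <TypeMapEntry type="CharacterString"/>
</MapEntries>
"""

# Permitted use of a class, keyed by the kind of map entry that names it.
USE_BY_ENTRY = {
    "BaseMapEntry": "base class",
    "ElementMapEntry": "association target",
    "TypeMapEntry": "attribute type",
}


def permitted_uses(config_xml):
    """Return {class name: set of permitted uses} for a config fragment."""
    uses = {}
    for element in ET.fromstring(config_xml.strip()).iter():
        # Drop a namespace qualifier of the form "{uri}" if one is present.
        local_name = element.tag.rsplit("}", 1)[-1]
        use = USE_BY_ENTRY.get(local_name)
        if use is not None and "type" in element.attrib:
            uses.setdefault(element.attrib["type"], set()).add(use)
    return uses


if __name__ == "__main__":
    for name, allowed in sorted(permitted_uses(SAMPLE_CONFIG).items()):
        print(name, "->", ", ".join(sorted(allowed)))
```

A class that is absent from the returned mapping is not an interface class, so it should either be defined in the Application Schema under development or not be used.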

Simple types for simple models

Of particular interest is the following subset of TypeMapEntry entries. These are all implemented as simpleContent types and are thus suitable for simple (denormalised) content models.
  • Integer
  • Measure - number with a scale
  • RelativeMeasure - number with a scale and optional qualifier indicating "greaterThan" etc
  • Boolean
  • CharacterString - text
  • GenericName - term with an optional indication of an authority or source vocabulary
  • ScopedName - term with a mandatory indication of an authority or source vocabulary
  • DirectPosition - spatial position as a set of N coordinates with an optional indication of the N-dimensional coordinate reference system

Governance

There is a direct correlation between packages stereotyped "Application Schema" and namespaces in the output schema. Namespaces reflect the "governance" of the Application Schema packages - who manages the package. Multiple packages may have the same namespace, but namespaces can only be introduced by importing a package that implements components in the namespace.

The HollowWorld template draws on standard ISO and GML components where possible, but has also codified the use of additional components that will ultimately need to be promoted into these standards regimes.

GML Package

The HollowWorld template includes a package representing GML as an Application Schema. However, the classes from this package should generally only be used in cases where there is no corresponding class from the ISO package.

For example, there is no ISO equivalent of gml:AbstractFeature, for use in situations where it is desired to assert an association with an arbitrary feature type. (Note that it is not necessary for classes with the stereotype FeatureType to explicitly specialize gml:AbstractFeature - this derivation is implied by the stereotype.)

It is always preferable (and ISO conformant) to use the ISO conceptual UML classes rather than the GML implementation classes in models, since this gives the greatest possibility of multi-platform implementation. To help the Application Schema designer follow the ways of modelling rectitude, all classes in the GML package that do have ISO equivalents have the name of the ISO class as an Alias in the class definition.

Specialization patterns

The interface classes provide a large set of potential types for specialized properties. It is therefore not realistic, or useful, to attempt a comprehensive description of the ways these might be used in the design of Application Schemata. Hence, the following instructions are based primarily on a learn-by-example approach.

Specializing the O&M Model

Summary of the O&M Model

The ObservationsAndMeasurements information model separates the various concerns related to the process of making observations and recording their results. In particular it distinguishes between
  • the Observed Property or feature attribute which is being observed - defined semantically independent of its representation
  • the Feature of Interest or subject of the observation - often a Specimen or Station
  • the Procedure or instrument used to make the Observation
  • the Observation event, including
    • the result of the observation, expressed as a value on a scale

The observation event itself is defined in a relatively generic manner, such that the same model (and serialization) can be used for most instances from all domains. Standard specializations of the AbstractObservation class restrict the datatype of the result for a limited set of types covering most cases (including Measurement and CategoryObservation), and complex results are supported by the generic Observation class whose result type is Any.

The other components will often require specialization for an O&M based Application Schema on a domain-by-domain basis. A primary concern is to fully specify the observed properties for the domain of interest - see #Specifying_semantic_components. Most commonly, however, domain-specific classes are developed by specialization of the following base classes:
  • from the sampling package, described in SitesAndSpecimens
    • Stations - sampling regimes are often described using domain- or campaign-specific parameters
    • Specimens - particular domains use specific, often standardized, collection- and preparation-procedures
  • from the O&M package, and described in ProceduresAndInstruments
    • Procedure Types and Procedure Events - instruments and procedures may be highly specialized, and may also require event-specific parameters to be recorded

Following the precedent of the ISO 19100 series, classes in the WQDP schema carry the WQ_ prefix so that the names are unique independent of packaging details.

Specialized sampling components

Figure: WQDP sampling artefacts (sampling.PNG)

For the WQDP, several specialized sampling features are required:

WQ_Station

WQ_Station specializes sa:Station.
  • A mandatory geographicPosition property is added. This records the latitude and longitude directly, augmenting the GM_Point value of the position property inherited from Station, which will normally record the position in a projected reference system.

Recording geographic position
This example illustrates some subtleties associated with the recording of position:
  • the position property on sa:Station has the type GM_Point (ISO 19107). This means that its value is a geometry object, or a reference to a geometry object. This is implemented as a gml:Point, so a data instance might appear as follows:
<sa:position>
   <gml:Point gml:id="p1">
      <gml:pos srsName="urn:x-ogc:def:crs:EPSG:6.3:62836405">-30.7025065 134.1997256</gml:pos>
      <!-- The value of the srsName attribute identifies the CRS that EPSG gives the identifier "62836405" -->
   </gml:Point>
</sa:position>
or alternatively
<sa:position xlink:href="http://some.geometry.service.org/points/p1"/>

  • on the other hand the geographicPosition property uses DirectPosition (ISO 19107) which is a Data Type (i.e. without identity) and is implemented as gml:DirectPositionType. This is an XML Schema Simple Content type - i.e. the position coordinates are encoded directly as a space separated tuple, e.g.
<wqdp:geographicPosition srsName="urn:x-ogc:def:crs:EPSG:6.3:62836405">-30.7025065 134.1997256</wqdp:geographicPosition>
  • finally, the (optional) elevation property inherited from Station allows a value and datum for elevation to be encoded separately, if the position property uses a 2-D CRS. In this case, the value of the srsName identifies the vertical datum. For example:
<sa:elevation srsName="urn:x-ogc:def:datum:EPSG:6.3:99876">563.</sa:elevation>
<!-- a single coordinate records the offset from the datum indicated -->

The appropriate strategy depends on whether the coordinates of the position are required to be reusable, and whether the definition of a suitable reference system is available.
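
Because DirectPosition is a simple-content type, a consumer only needs to split the element text to recover the coordinates. The following minimal Python sketch reads the geographicPosition and elevation encodings shown above. The namespace URIs bound to the wqdp and sa prefixes are placeholders (the text does not give them), and the function name read_direct_position is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: assumptions for illustration only.
WQDP_NS = "http://example.org/wqdp"
SA_NS = "http://example.org/sa"

GEOGRAPHIC_POSITION = (
    f'<wqdp:geographicPosition xmlns:wqdp="{WQDP_NS}" '
    'srsName="urn:x-ogc:def:crs:EPSG:6.3:62836405">'
    "-30.7025065 134.1997256</wqdp:geographicPosition>"
)

ELEVATION = (
    f'<sa:elevation xmlns:sa="{SA_NS}" '
    'srsName="urn:x-ogc:def:datum:EPSG:6.3:99876">563.</sa:elevation>'
)


def read_direct_position(xml_text):
    """Return (srsName or None, tuple of float coordinates)."""
    element = ET.fromstring(xml_text)
    # The coordinates are a whitespace-separated tuple in the element text.
    coordinates = tuple(float(token) for token in (element.text or "").split())
    return element.get("srsName"), coordinates


if __name__ == "__main__":
    for sample in (GEOGRAPHIC_POSITION, ELEVATION):
        srs_name, coordinates = read_direct_position(sample)
        print(srs_name, coordinates)
```

The same function handles both cases because the two properties share the simple-content pattern; only the number of coordinates and the kind of reference system named by srsName differ.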

WQ_Bore

WQ_Bore specializes WQ_Station.
  • The pipe property identifies a specific pipe at the station.

WQ_Specimen

WQ_Specimen specializes sa:Specimen.
  • the purpose records the reason the specimen was collected
  • the sampledBy and samplingMethod properties record details concerning the way the specimen was obtained
    • note that the latter provides a specific (mandatory) slot for such information in addition to the (optional, repeatable) processingStep property inherited from sa:Specimen, which has an om:ProcedureEvent as its value
  • the (optional) relativeLocation allows the sampling location to be recorded in a form that reflects the typical method used in obtaining this information - viz. as an offset from a reference point

Note that the location property inherited from sa:Specimen is mandatory. Thus, in a data instance, this should either
  1. encode the absolute position computed from the relativeLocation information
  2. use the value of the xlink:href attribute to carry a "nil" value, such as urn:x-ogc:def:nil:OGC:withheld

The value of the relativeLocation is packaged in a special DataType RelativePoint developed for this purpose, having the following properties:
  • referenceFeature - the origin, typically a Station
    • note that this class attribute has the Tagged Value inlineOrByReference="byReference" so that the value may only be given as a reference to an instance described elsewhere
  • offset - a vector describing the absolute position relative to the origin
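
The relationship between relativeLocation and the mandatory location property can be sketched as follows. This is a minimal Python illustration, assuming that the offset is expressed in the same projected reference system and units as the position of the reference feature, so that the two can simply be added. The class layout, the function name absolute_position and the example URI are hypothetical and are not part of the WQDP schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RelativePoint:
    """Data type (no identity): an origin given by reference, plus an offset."""

    reference_feature: str     # URI of the origin, e.g. a Station (by reference only)
    offset: Tuple[float, ...]  # offset vector, assumed to use the origin's axes and units


def absolute_position(point, origin_coordinates):
    """Add the offset to the origin coordinates, component by component."""
    if len(point.offset) != len(origin_coordinates):
        raise ValueError("offset and origin must have the same dimension")
    return tuple(o + d for o, d in zip(origin_coordinates, point.offset))


if __name__ == "__main__":
    # Hypothetical station reference and projected coordinates.
    relative = RelativePoint("http://example.org/stations/st1", (12.5, -3.0))
    print(absolute_position(relative, (500000.0, 6600000.0)))
```

The origin coordinates are passed in separately because referenceFeature only carries a reference; resolving it to a Station and reading its position is left to the caller.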

Packaging complex information
This example illustrates the method used to bundle a set of information into a complex object, when it is only meaningful considered as a package. A new class is created, whose properties capture the information items.

When the class represents identifiable features or objects, it should be stereotyped FeatureType or ObjectType. In GML-conformant XML this is implemented using a content-model that extends gml:AbstractFeatureType or gml:AbstractGMLType with a sequence of sub-elements, one for each additional property. On the other hand, when the class does not represent a classification of identifiable features or objects, it should be stereotyped DataType. In GML-conformant XML this is implemented using a content-model comprising a simple sequence of sub-elements, one for each property.

Specialized procedures

Figure: WQDP procedures (procedure.PNG)

WQ_ObservationProcedure

For observed properties whose values are represented using numbers, the procedure used will often have known thresholds of detection. These are determinand-specific.

WQ_ObservationProcedure specializes om:ObservationProcedure adding:
  • determinandDetails, whose value is a WQ_DeterminandSensitivity which binds values for detection limits to a determinand reference.

Specifying semantic components

Observed Property

Most domains are fundamentally concerned with a specific set of properties for which values are obtained by observation. These properties are known variously as the observables, measurands, determinands or analytes. In many cases there is an existing list, taxonomy or "ontology" of these, organized using principles that are related to the domain. This may be expressed using various technologies, such as a table or spreadsheet, a dictionary, or a knowledge representation notation such as OWL.

In order to be used within the context of ObservationsAndMeasurements, the minimum requirement is that every property of interest is identified by a URI. This allows the "by-reference" mode of the standard GML property pattern to be used, regardless of whether the value is available in a GML-encoded form or not.

Standard vocabularies

A number of "standard" lists of observables are available.

The Phenomenon schema XmmlSVN:swe/trunk/sweCommon/current/phenomenon.xsd is provided as part of the "SWEcommon" components used in the O&M schema. This provides an encoding for the description of an observed property in an XML document, as entries in a specialized GML Dictionary. It is tailored specifically for properties and property-series that are defined by applying constraining parameters to fundamental properties - see XmmlSVN:trunk/sweCommon/1.0.30/examples/phenomena.xml for an example dictionary.

Vocabularies, Codelists and Dictionaries

The management of vocabularies is a complex task.

ISO 19136 Geography Markup Language includes the class stereotypes <<Enumeration>> and <<CodeList>>, and allows the latter to carry the Tagged Value asDictionary to indicate whether values are encoded using a method that supports schema-validation of data instances or using an external dictionary.

In a wider context, ISO 19135 Procedures for registration of items of geographic information describes a comprehensive model for the maintenance of vocabularies.

Finally, there is some (old) discussion on this page: CodeListsAndDictionaries

Domain specific vocabularies

The documentation of a domain-specific ontology is usually sufficient for users within the domain. Note, however, for cross-domain work the ontology must be easily available, and it will usually be necessary to provide mappings to and from other sets.

WQDP determinands

The WQDP determinands are available in a spreadsheet. Each determinand has a numeric identifier (the DETERMINAND_CODE), whose structure also describes its position within a hierarchy. Assigning URIs to these items may be accomplished by re-using the code, for example urn:x-ogc:def:prop:WQDP:1001001002000000 for Rainfall - Cumulative.
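
Minting URIs from the codes is a simple string operation. The following minimal Python sketch follows the pattern of the example above; the function names are hypothetical, and the check that a code is purely numeric is an assumption based on the description of DETERMINAND_CODE as a numeric identifier.

```python
URN_PREFIX = "urn:x-ogc:def:prop:WQDP:"


def determinand_urn(determinand_code):
    """Build the URN for a determinand from its DETERMINAND_CODE."""
    code = str(determinand_code).strip()
    if not code.isdigit():
        raise ValueError(f"not a numeric determinand code: {determinand_code!r}")
    return URN_PREFIX + code


def determinand_code(urn):
    """Recover the DETERMINAND_CODE from a URN built by determinand_urn."""
    if not urn.startswith(URN_PREFIX):
        raise ValueError(f"not a WQDP determinand URN: {urn!r}")
    return urn[len(URN_PREFIX):]


if __name__ == "__main__":
    urn = determinand_urn("1001001002000000")  # Rainfall - Cumulative
    print(urn)
    print(determinand_code(urn))
```

Keeping the code unchanged inside the URN means the hierarchy encoded in the code remains recoverable from the identifier.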

Which observation type?

The WQDP determinands fall into three categories according to the way that their values are expressed:
  • measures - numbers with a unit-of-measure giving the scaling (including ratios expressed as %)
    • this includes all the chemical concentrations (mostly contaminants)
  • counts - integer values representing a frequency of occurrence (mostly organism counts)
  • presence/absence - boolean values

The generic om:Observation may be used to record observations concerning any determinand, with the result-type indicated in the XML serialized instance using the standard xsi:type attribute. However, single observations of each of these properties may be represented more compactly as an om:Measurement, om:CountObservation or om:TruthObservation, respectively.
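
The choice of observation class for a single result can be sketched as a dispatch on the kind of value. The following is a minimal Python illustration, assuming that a measure is held as a (number, unit-of-measure) pair; that representation and the function name observation_class are assumptions made for this example only.

```python
from numbers import Real


def observation_class(result):
    """Pick the most compact O&M observation class for a single result."""
    # bool must be tested before int, because bool is a subclass of int in Python.
    if isinstance(result, bool):
        return "om:TruthObservation"      # presence/absence
    if isinstance(result, int):
        return "om:CountObservation"      # frequency of occurrence
    if (
        isinstance(result, tuple)
        and len(result) == 2
        and isinstance(result[0], Real)
        and not isinstance(result[0], bool)
        and isinstance(result[1], str)
    ):
        return "om:Measurement"           # number with a unit-of-measure
    return "om:Observation"               # generic fallback, result type indicated by xsi:type


if __name__ == "__main__":
    for sample in ((0.05, "mg/L"), 12, False, [1, 2, 3]):
        print(repr(sample), "->", observation_class(sample))
```

Anything that does not match one of the three categories falls back to the generic om:Observation, which mirrors the role of the generic class in the model.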

An observation series for any determinand should be represented either as a set of observations composed in an om:ObservationCollection, or more compactly using a Record type in the result of a generic om:Observation.

Packaging a dataset

Figure: WQDP dataset package (dataset.PNG)

A complete dataset requires descriptions of the Stations, Specimens and Procedures in support of the Observations. WQ_Dataset is a convenience class that serves to bundle a collection of these into a single document.

Summary of WQDP model

Figure: WQDP specializations of O and M model (WQDP.PNG)

 
Topic attachments
Attachment     Size     Date                  Who       Comment
WQDP.PNG       71.6 K   05 Jan 2006 - 16:37   SimonCox  WQDP specializations of O and M model
dataset.PNG    16.8 K   05 Jan 2006 - 16:17   SimonCox  WQDP dataset package
procedure.PNG  16.2 K   04 Jan 2006 - 20:37   SimonCox  WQDP procedures
sampling.PNG   26.3 K   05 Jan 2006 - 10:57   SimonCox  WQDP sampling artefacts
Topic revision: r18 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).