
Proceedings, Annual Conference of International Association of Mathematical Geology, Portsmouth, UK 7-12 September 2003

Slides from presentation

Implementing distributed geoscience information systems using Open GIS Web Services

Simon Cox and Robert Woodcock

CSIRO Exploration and Mining ARRC PO Box 1130 Bentley, WA 6102 Australia

Simon.Cox@csiro.au Robert.Woodcock@csiro.au


Abstract

We describe CSIRO experience in implementing distributed geoscience information systems using Open GIS Consortium Web Services (OWS). OWS specifications include:
  • Web Map Service (WMS) for raster maps;
  • Web Feature Service (WFS) for "vector" data;
  • Web Coverage Service (WCS) for continuous data;
  • Geography Markup Language (GML) for encoding and transport of geographic objects.
Together these provide an extensible, vendor-neutral architecture upon which a distributed geoscience information system may be deployed.

To effect this, a number of developments and customisations are required as follows:
  1. GML application language(s) for geoscience must be designed, reflecting the community information model;
  2. easily configurable OWS software must be widely available, so that an organisation can expose its data sources through the standard web-hosted interface, using the geoscience XML language;
  3. client applications, capable of generating OWS requests and processing OWS responses must be developed. These must support the user-specified information processing tasks relevant to geoscience, including visualisation;
  4. suitable middleware - registries, catalogs, portals, for discovery and aggregation of services - must be established and maintained by suitable organisations and agencies;
  5. collaboration with relevant statutory agencies, to deploy OWS conformant information services.

Consideration of both technical and business aspects is required. It must be recognized that OWS is a technology that adds value to existing systems, and is not a complete replacement for them.


The web as a distributed information system

The emergence of the world wide web has had a profound effect on access to digital information. Increasingly sophisticated service patterns have developed to support this.

Early websites acted primarily as "file servers" delivering static pages.

On professional websites, this has been largely overtaken by "data services", where web-pages are generated dynamically. In this mode, the http request invoking the service is more explicitly understood to be a parameterised query. The websites are usually facades to databases or other data sources, and have been widely deployed by statutory data providers, such as geological surveys. Data is mainly transported as html (page-markup) with associated graphics such as gif and jpeg.

The next stage in the evolution of web information services is to return information in more structured formats, such as XML. This requires a stylesheet to convert it into a traditional webpage. The advantage, however, is that XML offers the possibility of routine re-use of data in analytical tools, rather than only supporting visual inspection.

The proposition is that data should be routinely made visible over http, in a form suitable for importing directly into processing software. As far as information users are concerned, there should be no operational distinction between local and remote data sources.

These two characteristics
  1. http request == parameterised query;
  2. data streams formatted to facilitate re-use
provide a basis for using the web as a generalised distributed computing platform.
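
As a minimal sketch of the first characteristic, a request URL can be assembled from a set of query parameters and decoded again on the receiving side. The host name and parameter names below are hypothetical, chosen only for illustration; they do not come from any particular service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_query_url(base_url, params):
    """Return base_url with params appended as an encoded query string."""
    return base_url + "?" + urlencode(params)


# Hypothetical data service and parameter names (assumptions for illustration).
url = build_query_url(
    "http://survey.example.org/data",
    {"theme": "boreholes", "region": "WA", "format": "xml"},
)
print(url)

# The receiving side can recover the same parameters from the URL.
recovered = parse_qs(urlsplit(url).query)
print(recovered)
```

The point of the sketch is only that the URL carries the whole query, so any client able to issue an http request can act as a query front-end.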

Standard Web Service Interfaces

In order to allow interoperability between server and client software from multiple vendors, the request and response must be standardised. By focussing standardisation on the interface, each software developer, data custodian, and processing-service provider, can work to their particular strengths, while connecting with other systems for complementary services. The result of one service request can provide input parameters to a subsequent request to a different service. A complete decision-support system can then be assembled from loosely-coupled component services distributed across a variety of hosts, with interprocess communication based on messages over http.

In the domain of "geographic information" the Open GIS Consortium ( http://www.opengis.org/ ) is developing specifications to standardise the message interfaces. These include
  • Web Map Service (WMS) (de La Beaujardière, 2002) for maps encoded as images;
  • Web Feature Service (WFS) (Vretanos, 2002a) for geographic objects or "vector" data;
  • Web Coverage Service (WCS) (Evans, 2002) for continuous data;
  • Sensor Collection Service (SCS) for live access to observations from sensors;
  • Geography Markup Language (GML) (Cox et al., 2003): XML components for encoding geographic objects for transport.

Along with some additional components, these provide the basis for an extensible, vendor-neutral architecture upon which a distributed geoscience information system may be deployed.

The OWS pattern

The behaviour of Open GIS Consortium Web Services (OWS) is exemplified by the operations comprising the Web Feature Service (WFS) interface (Fig. 1).

WFSdialog.png

Figure 1. UML protocol diagram for WFS

These are invoked as a series of messages, forming a dialogue between the client and service. The client first retrieves the capabilities of the service (GetCapabilities), which summarise the information available from this offering, primarily in terms of the feature types offered, and the geospatial area covered. The client then requests details of the format of the response for the feature types of interest (DescribeFeatureType). Finally, the client forms a suitable GetFeature request, and the server responds with an XML document in which the description of the selected objects is serialised. (Note that WFS may return other formats if available and requested.) Transactional operations (insert, update, delete) are also possible.
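
The three read-only steps of this dialogue can be sketched as a sequence of http GET requests. The sketch below only builds the request URLs and does not contact a server. The endpoint and the feature type name are hypothetical, and the keyword-value parameter names (SERVICE, VERSION, REQUEST, TYPENAME) follow our reading of WFS 1.0.0, so they should be checked against the specification before use.

```python
from urllib.parse import urlencode

# Hypothetical service endpoint (an assumption for illustration).
WFS_ENDPOINT = "http://survey.example.org/wfs"


def wfs_url(request, **extra):
    """Build a keyword-value WFS request URL for the named operation."""
    params = {"SERVICE": "WFS", "VERSION": "1.0.0", "REQUEST": request}
    params.update(extra)
    return WFS_ENDPOINT + "?" + urlencode(params)


# The dialogue: discover the offering, learn the response format, fetch data.
dialogue = [
    wfs_url("GetCapabilities"),
    wfs_url("DescribeFeatureType", TYPENAME="myns:Borehole"),
    wfs_url("GetFeature", TYPENAME="myns:Borehole"),
]
for step in dialogue:
    print(step)
```

Each response informs the next request: the capabilities document lists the feature types that may be named in the second step, and the schema returned by the second step tells the client which properties it can ask for in the third.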

Note that the basic implementation of this protocol, where the messages are carried over simple http, is an example of the REST architectural style (Fielding & Taylor, 2002).

The processing chain: clients, middleware, and servers

A complete system utilising web service interfaces will be composed of information servers, clients, and the middleware and registries necessary to broker connections between clients and servers.

Many users will encounter OWS services through interactive desktop clients, including some capacity for visualisation, and a user interface through which conformant OWS requests are generated and routed to the appropriate server. However, client software will take many forms with various levels of client-side processing capability.

Some software will act as both a client and server: brokering requests from a client and generating new requests on one or more primitive data sources, performing some processing tasks, and then returning the result back to the end-user. This operation mode is required for discovery middleware, but will also feature in coupled simulation systems.

In the short term, at least, we would expect basic OWS servers to be deployed by statutory data providers, such as geological surveys. In this paper we focus on the server-side of the equation, which is an essential pre-requisite for the successful establishment of a web-hosted distributed information system.

Server software

The purpose of the listed Open GIS Web Services is to provide a web (http) “façade” or wrapper around a data service, with a standardised request and response syntax. Behind the façade, the source of the data may take many forms:
  • a GIS (e.g. ArcGIS, Shapefiles)
  • a relational database (e.g. MySQL, Oracle, SQL Server)
  • an XML database (e.g. Oracle 9i, Virtuoso)
  • another XML document store (e.g. the file system)
  • an Object database (e.g. FracSIS)
  • a live source, such as a sensor or instrument
  • etc

The details of the data source are assumed to not be of immediate interest to the client, who merely wants to request maps, features, coverages, and other objects encoded in standard formats according to public data models. The main tasks of OWS software, therefore, are to:
  1. translate the OWS request into suitable operations to access the data source, and then
  2. retrieve the required information and either
    • serialise it in a valid GML document, i.e. as a collection of Features of the specified type,
    • convert it into another advertised format (e.g. features as Shapefiles, coverage data in HDF-EOS, 3-D models in VRML), or
    • portray it as a map in the requested image format (e.g. gif, jpeg, png).

Since we are primarily interested in re-usable data, in this paper we focus on the details of the WFS/GML option.
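
The serialisation task can be sketched as follows, assuming the data source has already returned a single record as a simple mapping. The record layout, the function name, and the "myns" namespace URI are hypothetical; only the GML namespace URI is the published one. A real server would also wrap the features in a collection and validate against the application schema.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
MYNS = "http://example.org/myns"  # hypothetical application namespace

ET.register_namespace("gml", GML)
ET.register_namespace("myns", MYNS)


def row_to_feature(row):
    """Serialise one data-source record as a GML-style Borehole feature."""
    feature = ET.Element(f"{{{MYNS}}}Borehole", {f"{{{GML}}}id": row["id"]})
    ET.SubElement(feature, f"{{{GML}}}name").text = row["name"]
    diameter = ET.SubElement(feature, f"{{{MYNS}}}collarDiameter", {"uom": "m"})
    diameter.text = str(row["collar_diameter_m"])
    return feature


# A record as it might come back from a relational query (assumed layout).
row = {"id": "R456", "name": "north_r_679", "collar_diameter_m": 0.15}
xml_text = ET.tostring(row_to_feature(row), encoding="unicode")
print(xml_text)
```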

WFS request and the granularity of the response

There is a very close relationship between WFS and GML. A GetFeature request contains accessors encoded as XPath expressions (Clark & DeRose, 1999). These are formulated in terms of the Geography Markup Language (GML) representation of the feature. A WFS might therefore be characterised as a "virtual GML view" of a data source.

It is important to note, however, that WFS does not serve arbitrary GML, and does not support ad hoc queries. The response to a GetFeature request is required to contain features, as defined in the GML specification, and as discussed in detail below. A specific GetFeature request specifies
  1. which feature types are required
  2. which properties of each feature type should be included in the response
  3. which actual feature instances should be selected (using the auxiliary Filter Encoding language - Vretanos, 2002b).

Thus, the granularity of information access via a WFS is fixed by the definitions of the feature-types offered.
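
The three parts of a GetFeature request can be sketched as an XML message built in code. The element and attribute names below follow our reading of WFS 1.0.0 and Filter Encoding 1.0.0 and should be checked against those specifications; the "myns" namespace URI, the function name, and the chosen property names and filter value are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

WFS = "http://www.opengis.net/wfs"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("wfs", WFS)
ET.register_namespace("ogc", OGC)


def get_feature_request(type_name, properties, filter_property, filter_value):
    """Build a GetFeature message: feature type, properties, instance filter."""
    root = ET.Element(f"{{{WFS}}}GetFeature", {
        "service": "WFS",
        "version": "1.0.0",
        # Prefixes used inside attribute values and text must be declared too.
        "xmlns:gml": "http://www.opengis.net/gml",
        "xmlns:myns": "http://example.org/myns",  # hypothetical namespace
    })
    # 1. which feature type is required
    query = ET.SubElement(root, f"{{{WFS}}}Query", {"typeName": type_name})
    # 2. which properties should be included in the response
    for prop in properties:
        ET.SubElement(query, f"{{{OGC}}}PropertyName").text = prop
    # 3. which feature instances should be selected
    flt = ET.SubElement(query, f"{{{OGC}}}Filter")
    comparison = ET.SubElement(flt, f"{{{OGC}}}PropertyIsEqualTo")
    ET.SubElement(comparison, f"{{{OGC}}}PropertyName").text = filter_property
    ET.SubElement(comparison, f"{{{OGC}}}Literal").text = filter_value
    return ET.tostring(root, encoding="unicode")


request_xml = get_feature_request(
    "myns:Borehole",
    ["myns:collarLocation", "myns:collarDiameter"],
    "gml:name",
    "north_r_679",
)
print(request_xml)
```

Note that every accessor in the message is a name from the GML representation of the feature type, which is why the request cannot reach below the granularity of the advertised feature types and their properties.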

Geographic Features

A feature is a conceptual thing in the real world, or a digital representation of the thing, such as a Road, an Orebody, a Borehole, a Geological unit or a Measurement (event). In most cases a feature has an associated location, which may be a Point, or an extensive geometry. A feature has some kind of “identity”, and is interesting enough that you might want to encode its description and send it to someone or store it.

The feature model distinguishes Features and their properties, each of which has a semantic name, and value. The feature-type (Orebody, Borehole, etc) is defined by the set of properties for that type. For example, an Orebody might have a (3-D) outline, a commodity type, a reserve estimate, a host-unit, etc, while a Gravity Measurement will have a location and date, an instrument, a result, a terrain-correction etc. The type of the value of a property may be complex, such as a geometry, or may even be another feature (e.g. the destination of a Road may be a Town ).

Note that all properties of a feature are treated in the same way. Thus, a feature may have more than one geometry property, usually with different names. For example, a Borehole may have a collarLocation, whose value is a Point, while its shape may be represented as a Curve (Figure 2).

Borehole.png

Figure 2. UML class diagram for Borehole
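
The idea that geometry is just another named property can be sketched with plain data classes. This is not a transcription of Figure 2: the class layout, attribute names, and coordinate values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coordinate = Tuple[float, float, float]


@dataclass
class Point:
    pos: Coordinate


@dataclass
class Curve:
    positions: List[Coordinate]


@dataclass
class Borehole:
    """A feature type defined by its set of named properties."""
    name: str
    collar_location: Point    # first geometry property
    shape: Curve              # second geometry property
    collar_diameter_m: float  # non-geometric property


# Illustrative values only.
hole = Borehole(
    name="north_r_679",
    collar_location=Point((0.0, 0.0, 0.0)),
    shape=Curve([(0.0, 0.0, 0.0), (0.5, 0.0, -50.0), (1.5, 0.0, -100.0)]),
    collar_diameter_m=0.15,
)

# Geometry-valued properties are found by value type, not by a fixed slot.
geometry_properties = [
    prop for prop, value in vars(hole).items()
    if isinstance(value, (Point, Curve))
]
print(geometry_properties)
```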

This approach contrasts in subtle but important ways with the conventional vector-GIS and CAD approach, in which the entities of interest are bound to a single geometric representation, such as "points", "lines" or "polygons". The semantic type of the object may be captured by a specific "type" attribute, or more commonly associated with a "layer" and its associated attribute table schema.

If the feature-type contains many mandatory properties, and each of these has a complex value, then a single feature description returned by a WFS will be a large block of information. On the other hand, if the feature type is simple or if most of the properties are optional, then quite small packets of information can be returned.

GML implementation of features

There is a regular pattern for generating a GML conformant XML encoding of a Feature, starting from a UML representation of the feature type. For example, a Borehole instance using the model shown in Figure 2 is serialised thus:

<myns:Borehole gml:id="R456">
   <gml:description>Exploration hole</gml:description>
   <gml:name>north_r_679</gml:name>
   <myns:collarLocation>
      <gml:Point srsName="urn:ga:localGrid68" gml:id="c679"><gml:pos> ... </gml:pos></gml:Point>
   </myns:collarLocation>
   <myns:collarDiameter uom="m">0.15</myns:collarDiameter>
   <myns:shape xlink:href="http://my.big.org/borehole_surveys/s679"/>
   <myns:logs>
      <myns:IntervalLog>
         <gml:name>Lithology log</gml:name>
         ...
      </myns:IntervalLog>
      <myns:PointLog>
         ...
      </myns:PointLog>
      <myns:ContinuousLog>
         ...
      </myns:ContinuousLog>
   </myns:logs>
</myns:Borehole>

The name of the feature element indicates the Feature Type, corresponding to the Class name in the UML representation (here: myns:Borehole). The content of a feature is a set of elements, each of which describes a property of the feature. Each XML element which is a direct child of the feature element is a property. The name of a property element indicates the property type, which corresponds to the attribute name or association roleName in the UML representation (gml:description, myns:collarLocation, myns:collarDiameter, etc). The value of a property may appear in-line, in which case it may be a simple literal (as shown in gml:name and myns:collarDiameter), or may be structured using XML elements (myns:collarLocation and myns:logs). Alternatively, the value may be given by-reference, as the value of a resource identified in a link carried as an XML attribute of the property element (as shown on myns:shape).
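
A client can apply these rules mechanically. The sketch below parses a shortened version of the Borehole instance and classifies each property as a literal, a structured in-line value, or a reference. The "myns" namespace URI and the coordinate text in gml:pos are placeholders added so that the fragment parses; they are not part of the original example.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Shortened instance; the myns URI and the pos content are placeholders.
doc = """
<myns:Borehole xmlns:myns="http://example.org/myns"
               xmlns:gml="http://www.opengis.net/gml"
               xmlns:xlink="http://www.w3.org/1999/xlink" gml:id="R456">
   <gml:name>north_r_679</gml:name>
   <myns:collarLocation>
      <gml:Point gml:id="c679"><gml:pos>0 0 0</gml:pos></gml:Point>
   </myns:collarLocation>
   <myns:collarDiameter uom="m">0.15</myns:collarDiameter>
   <myns:shape xlink:href="http://my.big.org/borehole_surveys/s679"/>
</myns:Borehole>
"""


def classify(prop):
    """Say how the value of a property element is supplied."""
    if XLINK_HREF in prop.attrib:
        return "by-reference"
    if len(prop) > 0:  # the property has child elements
        return "inline, structured"
    return "inline, literal"


feature = ET.fromstring(doc.strip())
# Every direct child of the feature element is a property.
kinds = {prop.tag.split("}")[1]: classify(prop) for prop in feature}
for name, kind in kinds.items():
    print(name, "->", kind)
```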

Application schemas and the GML namespace

Specific feature types that appear in concrete instances are defined in a GML Application Schema (GAS - Cox et al. 2003 clause 8), using the W3C XML Schema language (Fallside et al. 2001). The domain-specific components are defined in their own XML namespace (Bray, Hollander & Layman, 1999). The GAS imports generic components, including geometry, in the GML namespace ( http://www.opengis.net/gml ). For example, the borehole feature described above is defined in a namespace given the prefix "myns", with the pieces imported from GML having the prefix "gml".

The GML schema defines the components in the GML namespace. These cover geometry and topology, temporal objects, coordinate reference systems, coverages, definitions and dictionaries, measures, units of measure, observations, abstract features and collections (to be specialised for each application domain), and dynamic features. GML's feature model is an implementation of an information model defined in abstract form in ISO 19109. Other parts of GML provide an implementation of concepts defined in other relevant ISO standards, in particular ISO 19107, ISO 19108, ISO 19111 and ISO 19123.

Information communities and feature catalogues

An information community consists of people who share information for decision-making in a particular area of interest. A specific information community may be at least partly characterised by an information model, and the set of feature types that are of interest to its members. Agreement on the definitions of the community's catalogue of feature types (ISO 19101) will facilitate discourse within the community.

A GAS may thus be understood as having two main functions:
  • a conceptual role, to formally define the members of a catalog of feature types for a particular information community
  • an operational role, to validate XML instance documents describing information of interest to members of the information community.

The discussion here has focussed on the “structural” aspects of modelling the feature types of interest. Another key aspect of standardising the description of features is the value spaces for the various properties - the dictionaries, vocabularies and code-lists used within a community, from which valid values may be selected. These classifications or reference systems are critical for both description and discovery. Establishing a suitable regime for expressing them (XML documents? XML schema enumerations? on-line registries?), and maintaining the lists and classifications, is a significant task. This aspect of the modelling effort is beyond the scope of this paper.

Application to geosciences

The geosciences may comprise one or more information communities. The different sub-disciplines within the geosciences (geophysics, geochemistry, structural geology, sedimentology, stratigraphy, etc) each have their own terminology, which might be implemented as a feature-type catalogue. Specific applications (mapping, environment, mineral exploration, mine planning) form an alternative set of communities.

A schema may import components from more than one external namespace, so XML technology makes it possible to layer specialised languages progressively on top of more primitive ones. Thus, in the same way that GML defines components that are common across all geospatial disciplines (geometry, coordinate systems, etc), a basic geoscience markup language might define feature types that are fundamental for geology (rock units, timescales, specialised 3-D geometries, etc), and we might have a science observations language (specimens, procedures, time-series, etc). Then specialised languages for mineral exploration, tectonics, hydrogeology, etc might be built on top of these, with feature types for the artefacts of their particular discourse.

GML for geoscience

CSIRO has been leading a project to design an XML encoding for information associated with mineral exploration. The eXploration and Mining Markup Language (XMML) ( http://xmml.arrc.csiro.au/ ) is being developed as a GML Application Language. The work has been sponsored by a combination of statutory agencies (geological surveys, mines departments), mining companies and technology providers.

A schema can be split over several documents, which allows topical modularisation, and also incremental development. Components developed in the XMML namespace ( http://www.opengis.net/xmml ) to date are primarily those which were of most immediate need to the project sponsors. These cover the following topics:
  • Attributed surfaces, curves, points
  • Boreholes
  • Gravity Measurements
  • Mineral Occurrences
  • Geological timescales
  • Rock properties
  • Statutory Reporting templates (for Australian mines departments)
  • Assay reports
  • Finite element mesh

In addition, certain useful components have been implemented in conjunction with the XMML project, but have been assigned to different namespaces ( http://www.opengis.net/om )
  • Generalised observations, observation arrays and collections (Cox, 2003)

The strategy used in the development of the components has varied, but the following sequence is considered optimal:
  1. identify "feature types" on the basis of discussion with collaborators and sponsors
  2. determine the requirements for defining the particular feature type, from the point of view of the sponsor
  3. investigate whether there is an existing standard model or database schema for the feature type, which organises additional requirements
  4. sketch out an example instance document (or UML model) which accommodates the requirements, initially in streamlined form
  5. solicit feedback from stakeholders, and iterate
  6. develop an XML schema to describe the model
  7. refactor other XMML schemas to fit, and to promote common components
  8. produce UML diagrams as model documentation

The modelling methodology used in developing the XMML feature catalogue, and the XML Schema representing it, deviates from the ISO 19109 requirement for a strict Model Driven Architecture. It involves both pragmatic, bottom-up design (seen in the use of prototype XML instances at a very early stage, involving implementors whenever possible) and top-down analysis (re-use and evolution of common components, UML documentation). However, the use of object-oriented methods does encourage fine-grained modularisation. Thus, as well as importing GML, some "core" schema documents in XMML contain components common to many XMML feature types.

These include in particular:
  • geometry – optimised representations of certain 3-D geometries: parameterised plane, triangulated surfaces, hexahedron
  • common features – procedures and instruments, project, specimen, station, tenement

The exact composition of the core schemas, and the overall factoring of components between different schema documents, evolved during the development of the earlier feature types. Because they are still evolving, the XMML schema documents are maintained in a CVS repository. CVS "tags" are used to capture versions used in implementations.

XMML was initially targeted at applications involving information transfer between organisations. However, following the logic outlined in the introduction to this paper, XML data streams may also figure in processing systems built from loosely coupled components. The Cooperative Research Centre for Predictive Mineral Discovery (pmd*CRC - http://www.pmdcrc.com.au/ ) is building a computation system for simulation of ore-forming systems in which web-services and XMML encoding are being used for some aspects of the interprocess communication.

Matching geoscience requirements to OWS

Within a distributed geoscience information system, many of the objects of interest may be packaged as features, and thus may be made visible through a WFS interface.

Certain items of interest, however, are not naturally objects each tied to a specific location, so it may not be natural to conceive of them as "features". "Material" (rock type, mineral, chemical phase) and reference material such as "time scale" are two important examples. Nevertheless, the basic Feature-property pattern can usually still be applied. This means that the WFS GetFeature/Filter request may still be used to retrieve specific components of a description.

For example, a materials database may use a WFS interface to allow the client to request specific thermodynamic parameters of a mineral phase, or specific mechanical parameters of a rock type.

For other cases we may prefer to either specialise or augment the "basic" WFS interface, or define new "OWS" interfaces, patterned on WFS.

Deploying an OWS-based geoscience information system

A number of developments and customisations are required to use OWS for Geoscience applications. In particular:
  1. GML application language(s) for geoscience must be designed, reflecting the community information model;
  2. easily configurable OWS software components must be widely available, so that an organisation can expose its data sources through the standard web-hosted interface, using the geoscience XML language;
  3. client applications, capable of generating OWS requests and processing OWS responses must be developed. These must support the user-specified information processing tasks relevant to geoscience, including visualisation;
  4. suitable middleware - registries, catalogs, portals, for discovery and aggregation of services - must be established and maintained by suitable organisations and agencies.
For many of these components solutions are under development, by software vendors or under other auspices.

The role of statutory agencies is likely to be critical in establishing a community of practice within which OWS-based services can be effective in the geosciences and mineral exploration. Consideration of both technical and business aspects is required. It must also be recognized that OWS is a technology that adds value to existing systems, and is not a complete replacement for them.

Appendix: Turnkey WFS products

Note that a number of implementations of general purpose WFS software are available, which use various methods to customise support for a community GAS. Some of these only support 2-D data, and may restrict the allowed feature models (e.g. no complex property values).

In addition, FracSIS 3-D mining/exploration integration software ( http://www.fractaltechnologies.com ) will export a subset of XMML feature types.

(Product names are provided for information only, as an indication of the availability of multiple solutions from different providers. Additional products may be available. No endorsement or guarantee concerning suitability of the products named is intended or implied.)



References

Bray, T., Hollander, D. & Layman, A. Namespaces in XML. W3C Recommendation REC-xml-names-19990114, 1999

Clark, J. & DeRose, S. XML Path Language (XPath), Version 1.0. W3C Recommendation REC-xpath-19991116, 1999

Cox, S., Daisey, P., Lake, R., Portele, C. & Whiteside, A. Geography Markup Language (GML) Implementation Specification version 3.0.0. OpenGIS® project document: OGC 02-023r4, 2003

Cox, S. Observations and Measurements version 0.9.2. OpenGIS® Project Document: OGC 03-022r3, 2003

de La Beaujardière, J. Web Map Service Implementation Specification version 1.1.1. OpenGIS® project document: OGC 01-068r3, 2002

Evans, J. Web Coverage Service (WCS) version 0.7. OpenGIS® Project Document: OGC 02-024, 2002

Fallside, D. XML Schema Part 0: Primer. REC-xmlschema-0-20010502; Thompson, H., Beech, D., Maloney, M., & Mendelsohn, N. XML Schema Part 1: Structures. REC-xmlschema-1-20010502; Biron, P. & Malhotra, A. XML Schema Part 2: Datatypes. REC-xmlschema-2-20010502 W3C Recommendation, 2001

Fielding, R. & Taylor, R. Principled Design of the Modern Web Architecture. ACM Transactions on Internet Technology, Vol. 2, No. 2, pp. 115–150, 2002

ISO 19107:2003 Geographic information -- Spatial schema (ed. J. Herring)

ISO 19108:2002 Geographic information -- Temporal schema (ed. C. Roswell)

ISO DIS 19109 Geographic information - Rules for application schema (ed. S. Høseggen)

ISO DIS 19110 Geographic information - Methodology for feature cataloguing (ed. R. Rugg)

ISO 19111:2003 Geographic information -- Spatial referencing by coordinates (ed. J. Ihde)

ISO CD 19123 Geographic information - Schema for coverage geometry and functions (ed. C. Roswell)

Vretanos, P. Web Feature Service Implementation Specification version 1.0.0. OpenGIS® project document: OGC 02-058, 2002a

Vretanos, P. Filter Encoding Implementation Specification version 1.0.0, OpenGIS® project document: OGC 02-059. 2002b

-- SimonCox - 02 Jul 2003

 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).