Standards-based design of information models and encodings
An Application Schema formalizes the information model for an application domain (ISO 19109:2005). A general methodology for the development of an application schema is provided in ISO 19109. A key premise is that communication and primary interoperability concerns centre on a community which shares a model or view of their world, and that the design of an application schema should be scoped to an information community.
The ISO 19100 suite of standards, developed by ISO Technical Committee 211, provides tools for the development and implementation of geospatial tools and services. An overview of these is provided at IsoTc211Standards. The work of ISO/TC 211 is complemented by the Open Geospatial Consortium, as described in GeospatialStandardsContext.
The core of an application schema is a catalogue of feature-types (ISO 19110:2005), following the General Feature Model (ISO 19101:2002, ISO 19109). The application schema is formalized using UML according to the profile described in ISO TS 19103:2005, and in GML-conformant XML following the rules described in ISO 19136:2007. HollowWorld is a UML template which supports the ISO 19103 profile, and includes utility components provided by other ISO 19100 standards. FullMoon is a tool that processes a UML model to produce a GML implementation and human-readable documentation of the schema.
Follow the links for more information on the details of the methodology and tooling.
Standards-based methodology for developing a geoscience markup language
Abstract for EGU 2008
Markup languages have been developed for data transfer in a variety of earth science disciplines. Most of these have been developed using an informal methodology, typically guided by a data model implicitly defined in some existing document or database, but with the XML schema often designed directly using ad-hoc patterns, or sometimes created automatically by some proprietary toolkit. This often leads to a language that is efficient for a single application or within a workgroup or community, but with limited scope for interoperability across domain boundaries. The latter is a serious constraint on the use of data from diverse sources in cross-disciplinary investigations.
A uniform methodology, based on standards published by the Open Geospatial Consortium and ISO, has been developed and applied in the design of GeoSciML. The method is based on the Object Management Group's Model Driven Architecture (MDA), with model design in UML using the General Feature Model from ISO 19109, the use of components from other standards in the ISO 19100 series, and production of the XML schema following the encoding rules specified in ISO 19136. The resultant encoding shows a literal and explicit relationship to the UML model. This is unlikely to be as compact as hand-coded special cases, but is consistently structured across similar models. Full structure and meaning are preserved, and compactness is easily dealt with using standard compression techniques. Furthermore, the use of standard components for elements that are common across domains promotes interoperability.
To assist in the use of this methodology, we have developed two tools:
- HollowWorld - a UML template with ISO 19100 components, stereotypes, and tags pre-loaded, plus some other cross-domain components;
- FullMoon - a UML processing framework, based on application of sets of rules against the XMI representation of a model.
We use a UML design tool that allows direct binding to one or more SVN repositories. These host the various UML packages that are under separate governance arrangements. This overcomes an important limitation of most UML-based methodologies, which effectively treat the entire model as a single artefact.
Three rule-sets are available for FullMoon:
- validating the UML model with respect to the profile described in ISO 19136
- generating GML-conformant XML schema according to the rules in ISO 19136
- generating human-readable documentation of the model, in the form of an HTML frameset.
Use of these tools has allowed the GeoSciML team to develop and maintain the model as a single normative artefact (XMI). Implementation views in XML Schema and HTML documentation are generated automatically at significant release points. This addresses two key issues with ad-hoc approaches: ensuring normative and descriptive content are consistent across maintenance activities, and the ability to support convenient cross-reference between the conceptual model and the XML encoding.
Prepared by: SimonCox
FullMoon: a framework for processing UML information models
Abstract for e-Research Australasia 2008
Geographic information is ubiquitous, so the use of unified standards is desirable in order to enable interoperability between applications and systems. The primary standardization bodies in this area are ISO Technical Committee 211 (ISO/TC 211) and the Open Geospatial Consortium (OGC). ISO/TC 211 focuses mainly on abstract standards, including information models for cross-domain concerns, while OGC works on implementations, including Geography Markup Language (GML). Significantly, GML does not try to implement a generalized language usable as-is by all application domains. Instead, it provides a set of cross-domain XML components that serve as a framework for implementing specialized languages, based on the models provided in a number of ISO/TC 211 standards, so that aspects common across domains are implemented in a common way. GML is an XML grammar written in W3C XML Schema (WXS), so its use in a so-called "Application Schema" is through the standard <import> and <extension> mechanisms provided by WXS.
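As a rough illustration of the import and extension mechanisms mentioned above, the following Python sketch embeds a minimal application schema and inspects it with the standard library. The target namespace `http://example.org/boreholes`, the `Borehole` feature type and its `depth` property are hypothetical names invented for this example; only the XML Schema and GML 3.2 namespaces are taken from the published standards.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
GML = "http://www.opengis.net/gml/3.2"
APP = "http://example.org/boreholes"  # hypothetical target namespace

# Minimal application schema: imports GML, then derives a feature type
# from gml:AbstractFeatureType by extension. "Borehole" is illustrative.
SCHEMA = f"""<xs:schema xmlns:xs="{XS}" xmlns:gml="{GML}" xmlns:bh="{APP}"
           targetNamespace="{APP}" elementFormDefault="qualified">
  <xs:import namespace="{GML}"
             schemaLocation="http://schemas.opengis.net/gml/3.2.1/gml.xsd"/>
  <xs:element name="Borehole" type="bh:BoreholeType"
              substitutionGroup="gml:AbstractFeature"/>
  <xs:complexType name="BoreholeType">
    <xs:complexContent>
      <xs:extension base="gml:AbstractFeatureType">
        <xs:sequence>
          <xs:element name="depth" type="gml:MeasureType"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
"""

root = ET.fromstring(SCHEMA)

# Namespaces brought in through xs:import
imports = [e.get("namespace") for e in root.iter(f"{{{XS}}}import")]
# Base types that the application schema extends
bases = [e.get("base") for e in root.iter(f"{{{XS}}}extension")]

print("imports:", imports)
print("extends:", bases)
```

The domain-specific content is limited to the `depth` property; identity, description and the other common feature properties are inherited from the GML base type.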
However, developing an Application Schema that is to be implemented using GML is complex. The ISO/TC 211 process requires that the information model is developed and formalized in UML (Class and Package diagrams) following a strict UML profile (patterns, stereotypes, tagged values) and with dependencies on ISO packages for any common elements (e.g. geometry, time, spatial functions). The UML model is then converted to an XML schema according to a set of rules defined in ISO 19136. FullMoon was developed to automate this XML Schema generation.
FullMoon is a framework for processing and transforming XML documents. It was originally designed for processing large UML models using the XML mapping rules defined in the ISO 19118, 19136 and 19139 standards. FullMoon processes the XMI (XML Metadata Interchange) format representation of a model, generating XML schemas (and other views), with the mapping rules implemented as XQuery scripts. Models may be large, and their XMI representation is highly verbose, so using traditional DOM and SAX parsers can be problematic. Efficient performance in FullMoon is achieved by using an XML-DB engine to cache the model, which provides XQuery access to the XML infoset.
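To give a feel for what a single mapping rule does, here is a minimal Python sketch. It is not FullMoon code: FullMoon implements its rules in XQuery against real XMI held in an XML database, whereas this example uses a deliberately simplified, hypothetical model format (`model`, `package`, `class` elements) and an invented function name, `feature_type_rule`. The rule shown, emitting a global element in the `gml:AbstractFeature` substitution group for each class stereotyped as a FeatureType, follows the general pattern of the ISO 19136 encoding rules.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an XMI export (real XMI is far more verbose).
MODEL = """<model>
  <package name="Boreholes">
    <class name="Borehole" stereotype="FeatureType"/>
    <class name="DrillingMethod" stereotype="CodeList"/>
    <class name="BoreholeInterval" stereotype="FeatureType"/>
  </package>
</model>"""


def feature_type_rule(model_root, prefix):
    """Return one XSD element declaration per FeatureType class."""
    declarations = []
    for cls in model_root.iter("class"):
        if cls.get("stereotype") != "FeatureType":
            continue  # other stereotypes would be handled by other rules
        name = cls.get("name")
        declarations.append(
            f'<xs:element name="{name}" type="{prefix}:{name}Type" '
            'substitutionGroup="gml:AbstractFeature"/>'
        )
    return declarations


declarations = feature_type_rule(ET.fromstring(MODEL), "bh")
for line in declarations:
    print(line)
```

Because each rule is a small function over the model, further rules (for data types, code lists, properties) can be added without touching this one.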
GML specifies XML encodings of conceptual classes defined in the ISO 19100 series, but application schemas may also re-use externally governed packages from other sources, which may have existing canonical WXS representations. FullMoon supports this through a register for the application schema (which associates each externally governed package with an XML namespace and schema location) and a mapping table for each package (which associates each UML class with its representation as WXS element declarations and type definitions). The registers and tables are accessed by URI, so may be web-hosted at authoritative locations.
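The register and mapping tables described above can be pictured as two lookups. The Python sketch below is an assumption about their shape, not the actual FullMoon file format: the names `PackageEntry`, `REGISTER`, `MAPPINGS` and `resolve`, and the package label "ISO 19107 Geometry", are hypothetical. The two class mappings shown (GM_Point and GM_Curve to their GML counterparts) follow the correspondences published with ISO 19136.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageEntry:
    namespace: str
    schema_location: str


# Register: externally governed package -> XML namespace and schema location.
REGISTER = {
    "ISO 19107 Geometry": PackageEntry(
        namespace="http://www.opengis.net/gml/3.2",
        schema_location="http://schemas.opengis.net/gml/3.2.1/gml.xsd",
    ),
}

# Mapping table per package: UML class -> (element declaration, type definition).
MAPPINGS = {
    "ISO 19107 Geometry": {
        "GM_Point": ("gml:Point", "gml:PointType"),
        "GM_Curve": ("gml:Curve", "gml:CurveType"),
    },
}


def resolve(package, uml_class):
    """Look up the existing WXS representation of an external UML class."""
    entry = REGISTER[package]
    element, type_name = MAPPINGS[package][uml_class]
    return {
        "namespace": entry.namespace,
        "schemaLocation": entry.schema_location,
        "element": element,
        "type": type_name,
    }


resolved = resolve("ISO 19107 Geometry", "GM_Point")
print(resolved)
```

In this sketch the tables are in-memory dictionaries; since the source describes them as accessed by URI, a fuller version would fetch them from their authoritative locations instead.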
The FullMoon framework is a "rules-driven" application, which makes it straightforward to introduce, maintain and enhance the rules within rule sets. XQuery scripts implement the logical rules used to transform the input model into other XML formats. Rule sets have been implemented for (1) XML Schema and (2) HTML documentation. A rule set has also been developed to test a model's validity and conformity to the standards. Detailed reports of non-conformities identify and locate errors in the UML application schema. The use of rule sets makes the FullMoon framework flexible to upgrade and easy to maintain, and allows new conformance tests and processing rules to be introduced.
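The rules-driven idea can be sketched in a few lines of Python. This is an illustration only: FullMoon's rules are XQuery scripts, and the two checks below (`has_stereotype`, `attributes_lower_case_initial`) are hypothetical examples of conformance tests rather than rules quoted from the standards. The model is represented as plain dictionaries for brevity.

```python
def has_stereotype(cls):
    """Illustrative rule: every class should carry a stereotype."""
    if not cls.get("stereotype"):
        return [f"{cls['name']}: class has no stereotype"]
    return []


def attributes_lower_case_initial(cls):
    """Illustrative rule: attribute names should start with a lower-case letter."""
    return [
        f"{cls['name']}.{attr}: attribute name should start with a lower-case letter"
        for attr in cls.get("attributes", [])
        if not attr[:1].islower()
    ]


# A rule set is just an ordered list of rules; adding a new conformance
# test means appending one more function.
RULESET = [has_stereotype, attributes_lower_case_initial]


def validate(model, ruleset):
    """Apply every rule to every class and collect non-conformity reports."""
    report = []
    for cls in model:
        for rule in ruleset:
            report.extend(rule(cls))
    return report


MODEL = [
    {"name": "Borehole", "stereotype": "FeatureType", "attributes": ["depth", "Operator"]},
    {"name": "Interval", "stereotype": "", "attributes": ["top"]},
]

report = validate(MODEL, RULESET)
for message in report:
    print(message)
```

Each message names the class (and attribute, where relevant) at fault, which mirrors the source's point that reports should both identify and locate errors in the model.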
Prepared by: PavelGolodoniuc, SimonCox, NickArdlie
Application schema modeling for interoperable geospatial information using the ISO 19100 series of standards
Extended abstract for e-Research Australasia 2010: Download PDF
Geographic information is inherent to many application domains in various disciplines and constitutes an integral part of Earth sciences, including geology, geophysics, meteorology, hydrology, oceanography, and soil science. Communication of sophisticated geographical data requires the use of complex technologies that enable interoperable geospatial information exchange channels. The primary authorities in geographic information standardization are ISO Technical Committee 211 (ISO/TC 211) and the Open Geospatial Consortium (OGC), which govern the abstract standards including information models for cross-domain concerns, architectures for distribution of geospatial services and implementation of ISO standards through service interfaces, data models and encodings.
The “language” that all participating parties should understand in the communication process is defined by a GML Application Schema, which will conform to best practice guidelines and international standards if it is developed using the Hollow World modelling environment. The Solid Ground toolkit may be especially useful during the modelling process for carrying out routine operations and refactorings. An application schema defined in the conceptual terms of a particular domain may then be readily transformed into its physical representation, a set of W3C XML Schemas, using the FullMoon XML Processing framework.
This modelling approach, which places the structural definition of the information at the centre of the design process, is known as Model Driven Architecture. It makes the information model the only artefact that has to be maintained by the governing body. Together, these technologies provide a complete set of tools required to design and implement an Application Schema using the ISO 19100 series of international standards.
Prepared by: PavelGolodoniuc, SimonCox
Geospatial Information Modelling for Interoperable Data Exchange
Paper for e-Science 2010: Download PDF
Geographic information is inherent to many application domains in various disciplines and constitutes an integral part of Earth sciences, including geology, geophysics, meteorology, hydrology, and soil science. Communication of sophisticated geographical data requires the use of complex technologies that enable interoperable geospatial information exchange channels. The “language” that all participating parties should understand is defined by an Application Schema, which will conform to best practice guidelines and international standards if it is developed using the Hollow World modelling environment. An application schema defined in the conceptual terms of a particular domain may then be readily transformed into its physical representation, a set of W3C XML Schemas, using the FullMoon XML Processing framework. This modelling technology provides a complete set of tools required to design and implement an Application Schema using the ISO 19100 series of standards.
Prepared by: PavelGolodoniuc, SimonCox