"Seegrid will be due for a migration to confluence on the 1st of August. Any update on or after the 1st of August will NOT be migrated"

The XMML Project: Introduction


Project Goals

  • XML encoding for data related to geoscience, focusing on mineral exploration and mining
  • broad uptake by regulators, software vendors, data-providers and users

Management

Chronology

The XMML project was initiated by CSIRO and Fractal Graphics in 2000. The requirement was to develop a data transfer encoding to facilitate the exchange of data between applications on the desktop, between networked computers, between organisations, and possibly over time (archiving). It was decided to use an XML-based encoding on the grounds that this was likely to become the dominant basis for information exchange in web-based environments, which were becoming ubiquitous.

The project was announced at an AMF symposium in May 2000, and attracted support from several mining companies, geological surveys and mining industry consultants. The WA State Government provided substantial funding through the Minerals and Energy Research Institute of WA (MERIWA) and work began in late 2000.

Project results were restricted to project sponsors and collaborators until the end of June 2003. From that time the XMML schemas and documentation were made publicly available.

Sponsorship

As part of their contribution MERIWA managed the sponsorship funds following their standard procedures.

Communications

Reports

Quarterly ProgressReports were issued in accordance with MERIWA practice.

External information about XMML has frequently been delivered through talks and presentations.

Open access

In order to maximise the likelihood of widespread adoption, it was planned from the outset to make XMML freely available. However, during development a "sponsors only" policy was applied, relaxed slightly to include collaborators whose technical contributions were expected to be useful.

All XMML Schemas are made available under the terms of the XMML Software Notice.

Website

A project website has been used as a central communications point. XMML is now managed as an activity within the Solid Earth and Environment GRID and the site is located at http://www.seegrid.csiro.au/xmml/.

Starting early in 2003, we used TWiki software for the development site, which allows any registered user to edit the pages on the website. TWiki also has a notification system, whereby users may elect to receive periodic email listing the names of topics that have been updated. (For the XMML TWiki, notification frequency is daily.) This notification system effectively replaced the xmml-dev mailing list.

From 1st July 2003, following the project plans as agreed by project sponsors, access to the website was made available on request to any interested party. From early March 2004 the site was moved to a new server providing unrestricted read-access. Access for editing is granted on request to the site administrators, and is not usually withheld. However, the goal of more flexible and frequent input from stakeholders was only partly realised.

Schema documentation

The website hosts the XMML documentation, which became available to sponsors and other interested parties as it was developed. The Project Report is composed of a snapshot of pages from the website, collated and converted to PDF.

Schema repository

From late in 2002, the XMML schema documents were maintained in a version control system. CVS was used during development. The schemas were moved to Subversion concurrent with the release of version 1.0 of XMML. Web interfaces are available to both of these systems. Selected collaborators were also given direct access to the CVS repository during development.

Influences

A set of links to RelatedInitiatives was compiled early in the XMML project. Of particular importance are the following projects:

CSIRO Data Translator

The concept for XMML grew partly out of an earlier CSIRO project to construct Data Translator software for moving exploration and mining information between desktop applications. This project was managed through AMIRA as project P467.

The similarity is that the Data Translator used a single (object oriented) data model as the common node, with the connection to each legacy application requiring only import and export from the application to the data model. In this way translation between N applications is converted from a potentially N-squared problem to something of order N only. The Data Translator data model was in turn strongly influenced by the Spatial Data Transfer Standard (SDTS). It is essentially a geometry-centric model, in which each object is tied to a primary geometric representation. The data model provided generic support for other attributes.
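The N-squared versus order-N argument can be illustrated with a small count. The sketch below is illustrative only: the function names are hypothetical, and it assumes that every pair of applications needs a translator in each direction in the direct case, and one importer plus one exporter per application in the common-model case.

```python
def direct_translators(n: int) -> int:
    """Translators needed when every application converts directly
    to every other application (one per ordered pair)."""
    return n * (n - 1)


def common_model_translators(n: int) -> int:
    """Translators needed when every application only imports from
    and exports to a single common data model."""
    return 2 * n


if __name__ == "__main__":
    # Compare the two approaches for a growing number of applications.
    for n in (3, 5, 10, 20):
        print(n, direct_translators(n), common_model_translators(n))
```

Under these assumptions the common-model approach needs fewer translators as soon as more than three applications are involved, and the gap widens quickly as N grows.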

Data in the Data Translator's common model only exists transiently, and is never serialised in an externally accessible form, or otherwise persisted. In the XMML project, we moved away from implementing the actual translation software, and instead concentrated on defining an intermediate model and in particular its serialisation in XML. The proposition is that software interoperability within exploration and mining can be assisted in part through a standardised encoding for information objects in that domain used at the point of transfer. Applications software can interoperate by importing and exporting data using this encoding.

AUSDEC Geoscience Data Model

An earlier project sponsored through AMIRA P431 developed a "set of integrated data models ... for the geoscience industry in Australia". These were presented through Entity-Relationship diagrams, together with a Data Dictionary. Of particular note are models for BOREHOLE, DEPOSIT, and SAMPLE which overlap with the scope of XMML. The P431 data model is referred to in several places in statutory reporting standards. However, while P431 provided examples of how to implement the model in database tables, and some tabular notations, no reference transfer syntax was developed.

North American Data Model

The Geological Survey of Canada, US Geological Survey and the organisations representing state and provincial geological surveys in North America have been engaged for several years in a project to define the North American Data Model for geological map information. The interest is in capturing information that can be used to generate a conventional map representation of geology, based on the contemporary interpretation of the "primary" observations.

The principle is that observations have persistent value and thus should be represented in a manner suitable for archive. In contrast, interpreted objects are always transient: additional observational constraints may become available, or the conceptual model basis may be modified or replaced. Thus, the observations and the conceptual model are the primary interest, while the representation of interpreted objects is secondary.

In contrast, XMML is neutral with respect to the value of the information. The object types for which there are XMML representations are those that were prioritised by the project sponsors. XMML is a transfer model and is thus presumed to be a "snapshot" of the data, representing the intentions of the sender at the time of sending. Data represented in XMML should be used by the receiver in accordance with any relevant agreement between the sender and receiver, which may be explicit or implicit.

NADM Conceptual Model Version 1.0 contains UML class diagrams for a number of specific components. The material model in XMML is strongly influenced by NADM.

The NADM project is now moving on to defining an XML based transfer syntax for the model. It is expected that this will be developed following the GML patterns, and thus may be integrated with XMML.

ISO/TC 211 and Open GIS Consortium

From an early stage, the XMML project was committed to leveraging developments in the general geospatial information arena as far as possible. The main non-proprietary initiatives in this area come from Open GIS Consortium partly in association with ISO Technical Committee 211 (Geographic Information).

ISO/TC 211 has defined information models for geographic objects based on Features and Coverages, as well as several aspects of supporting information, such as spatial information (geometry), spatial referencing by coordinates, temporal, metadata etc. An overview of the relevant standards from ISO/TC 211 is provided.

A key assumption of the ISO/TC 211 program of work is that geographic information types are usefully partitioned by application domain. The information community within each domain is responsible for the Application Schema for that domain, one aspect of which is the Feature Type Catalogue. Within this framework, we are working within the geoscience, mineral exploration and mining domain. The XMML project aimed to develop a catalogue of feature types for this domain.

ISO/TC 211 focussed its initial work at the "abstract" level, defining quite detailed models and principles, but using neutral syntax (often UML) and only providing domain specific examples non-normatively. It is not possible to implement software based on these standards immediately.

On the other hand, Open GIS Consortium is primarily a vendor consortium, with the goal of developing implementation standards for interoperable geospatial information. Since the success of the Web Mapping Testbed in 1999, this has primarily been in the form of
  • service interfaces: standard syntax for requesting geospatial data on the web, Web Map Service (WMS), Web Feature Service (WFS), Web Coverage Service (WCS), etc.
  • data encoding: XML-based encodings for general geography (GML), location-based services (XLS), coordinate reference systems (subset of GML), etc.
These standards prescribe a specific implementation method (HTTP, URIs, XML) targeted at a specific distributed computation platform (the World Wide Web).
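As an illustration of what a "standard syntax for requesting geospatial data on the web" looks like in practice, the following sketch builds a WFS GetFeature request as a URL with key-value parameters. The parameter names follow the WFS 1.1.0 key-value encoding; the endpoint and the feature type name xmml:Borehole are placeholders assumed for this example and do not refer to a real service.

```python
from urllib.parse import urlencode


def build_getfeature_url(endpoint: str, type_name: str, max_features: int = 10) -> str:
    """Build a WFS 1.1.0 GetFeature request URL using key-value parameters."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,        # feature type to retrieve
        "maxFeatures": max_features,  # upper limit on returned features
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and feature type, for illustration only.
url = build_getfeature_url("http://example.org/wfs", "xmml:Borehole")
print(url)
```

A server implementing the interface would answer such a request with an XML document, typically GML, describing the matching features.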

Some of the OGC specifications are explicit realisations of abstract models developed by ISO (GML provides components that implement ISO 19109's General Feature Model, and ISO 19107 Spatial Schema, etc), and some of the OGC specifications are now undergoing standardisation in ISO (WMS and GML in particular). But the originating organisation is usually quite clear from the level of abstraction of the standard.

Of particular significance to the XMML project are

XMML methodology

Models and priorities

Components developed for XMML were primarily based on information models provided by project sponsors, as follows:
  • the borehole model was primarily driven by requirements provided by Fractal Technologies based on their understanding of mining software systems, with additional input from several places
  • the gravity model was based on the ASEG/GA gravity observation format
  • the mineral occurrence model captures essentially the information represented in GA's MINLOC system
  • the statutory reporting templates were developed with reference to GGIPAC's guidelines
  • the material model was based on requirements from WMC and Snowden's, with important input from NADM
  • the Geological Time Systems module was developed in consultation with the CHRONOS consortium
  • the geometry-centric feature types were developed with Fractal Technologies and CSIRO, to accommodate a 3D model construction process
  • some basic components for geological maps were developed with reference to geo-datamodels under development in GGIPAC and BGS
  • the ADX module was primarily based on Newmont's digital assay format, with additional input from WMC and Acquire
  • GPML2 (GPlates Markup Language) was developed in collaboration with the GPlates consortium (University of Sydney)
  • GeoSciML (GeoScience Markup Language) is under development primarily by BGS, in collaboration with the XMML project
  • models for geophysics observations are based on ASEG's GDF2 format

Throughout the development process, opportunities were sought to factor out components that could be re-used in multiple modules. This led to the development of
  1. "abstract" or "base" components, from which several of the concrete feature types were derived (e.g. PositionedFeature, BoundedFeature, Coverage, Observation, Procedure, Station, Specimen)
  2. utility components that could be used as values of properties in several places, including:
    1. temporal complexes (required for time-series)
    2. tuples and tables, with a tuple-description syntax to capture the equivalent of "column headings"
    3. specialised 3-D geometry components, including
      • a compact form for triangulated surfaces
      • standard simple solids for block and finite element models
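The idea of a tuple description that plays the role of "column headings" can be sketched as follows. This is not the XMML encoding itself: the field names, units and the whitespace-separated record layout are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One entry of a tuple description: the equivalent of a column heading."""
    name: str
    uom: str  # unit of measure


# Hypothetical description of each tuple in the table.
description = [Field("depth", "m"), Field("density", "g/cm3")]

# Hypothetical compact table: one tuple per line, values separated by spaces.
table_text = """
10.0 2.67
20.0 2.71
"""


def decode(text: str, fields: list) -> list:
    """Pair each value in each tuple with the field that describes it."""
    rows = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        if len(values) != len(fields):
            raise ValueError(f"expected {len(fields)} values, got {len(values)}")
        rows.append({f.name: v for f, v in zip(fields, values)})
    return rows


rows = decode(table_text, description)
print(rows)
```

Keeping the description separate from the values allows the data block to stay compact while remaining self-describing.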

A number of the latter represent general components missing from GML but critical to XMML. The project team, based at CSIRO, has been involved in the development and standardisation of GML, so some issues that arose during XMML development that are potentially of more general interest are feeding back into the broader geospatial standards.

Schema language

XMML is a GML Application Language, developed following the Rules for Application Schema described in the GML 3.1 specification. The reference version of the XMML schema is expressed using the W3C XML Schema Language (WXS).
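A minimal sketch of how software might read a document written in a GML application language is given below. The GML namespace URI and the gml:id and gml:name components are those of GML 3.1; the "ex" namespace, the Borehole element and its totalDepth property are hypothetical stand-ins and are not taken from the XMML schemas.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
EX_NS = "http://example.org/hypothetical-app-schema"  # placeholder, not the XMML namespace

# Hypothetical feature instance following the GML feature/property pattern.
document = f"""
<ex:Borehole xmlns:ex="{EX_NS}" xmlns:gml="{GML_NS}" gml:id="bh1">
  <gml:name>Example borehole</gml:name>
  <ex:totalDepth uom="m">250.0</ex:totalDepth>
</ex:Borehole>
""".strip()

root = ET.fromstring(document)

# The feature identifier and name come from components inherited from GML.
feature_id = root.get(f"{{{GML_NS}}}id")
feature_name = root.findtext(f"{{{GML_NS}}}name")

# Each child element of the feature is one of its properties.
properties = {child.tag.split("}")[1]: child.text for child in root}

print(feature_id, feature_name, properties)
```

The point of the pattern is that generic GML-aware software can locate identifiers, names and properties without knowing the domain-specific feature types in advance.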

Notations used in the documentation

For most modules we show the data model using a graphical notation generated by the XML Spy IDE. A summary of this XmlSchemaNotation is provided.

Rules for converting GML schemas into UML class diagrams are described in the GML 3.1 specification and are outlined in FeatureModel, GmlFeature and GmlProperty. Those have been applied informally for some modules of XMML, but any UML diagrams should be treated as non-normative documentation of the schemas, using UML as a convenient notation. The WXS version of the schemas is always normative.

Achievements

XMML Schema

For the release of Version 1 of XMML, the feature types of primary interest are:

  • Borehole with logs
  • Geochemistry result (Assay data and Statutory reporting data)
  • Geological Timescale
  • Geometry features based on Points, Curves, Surfaces and Solids, extensions with basic properties, and extensions suitable for map-features
  • Geological boundaries of various types
  • Mineral occurrence
  • Observation, Gravity Measurement

Many other XML elements are declared in the XMML schema with "global" scope. While these are mostly "support" elements, used in the context of one of the primary feature types, they may be used as the root element in a document when required. In particular the following are provided:

  • Coverage, Pointset coverage, Interval coverage, Curve coverage, TIN coverage, Rectified Grid coverage
  • Project, Specimen, Station, Tenement, Material

In addition, a number of components and feature types have been developed to support numerical modelling (finite elements, plate rotations).

Note that the feature types here are dominantly observations and artefacts, rather than geological objects, which are necessarily the results of interpretation.

As discussed above, typical "geophysics" observations have lagged pending the refinement of coverage encodings. Support for time-complexes and regular time-grids is currently absent from GML. These are being developed in XMML, and it is expected that support for a number of geophysics observations will become available very soon, with GDF2 being used to define requirements.

GML

XMML depends on GML 3.1. While the benefits of inheriting components from other frameworks, or even basing XMML on a more generic language, were clear from the beginning of the project, at that time GML was not sufficiently mature. GML version 1 was available as an Open GIS Consortium "Discussion Paper" only. Nevertheless, the primacy of OGC as the home for implementation standards for geospatial information was clear, alongside ISO/TC 211 for abstract standards. Hence we engaged with OGC in order to accelerate the development of GML as a basis for XMML. This led to SimonCox becoming a co-editor of GML 2.0 issued in January 2001, co-chair of the GML revision working group in January 2002, co-editor of GML 3.0 issued in January 2003 and GML 3.1 issued in March 2004. Note that GML 3.1 served as committee draft of ISO 19136. SimonCox also was editor of the OGC recommendation paper on Observations and Measurements, and acts on several other OGC working groups.

Data Grids

XMML is recognised as one of the pre-eminent examples of a community-based application schema, in the sense described in ISO 19109, expressed as a GML application language. Such application languages have a critical role in supporting loosely coupled information-grids or spatial data infrastructures. The XMML project and community have been seminal in the emergence of the Australian Solid Earth and Environment Grid (http://www.seegrid.csiro.au) and the Information Services components of the AEON network (http://www.aeon.org.au).

Future activities

At the scheduled end of the project in mid-2003 it was clear that interest in XMML development had not been exhausted. This page summarises the state of XMML for the final report of the MERIWA managed project. Further work on XMML is proceeding through a number of avenues:

  1. the Predictive Mineral Discovery CRC is using XMML as one of the primary tools for interprocess communication in its distributed software framework. This is requiring the development of additional feature types and related models and encodings. In particular a number of schemas that are included in the version 1 repository are concerned with describing configurations for numerical modelling exercises (e.g. Finite Element modelling)
  2. an international collaboration for the development of a geoscience data model has been established under the auspices of IUGS as a working group of the Commission for Geoscience Information. XMML is one of the primary inputs to that activity, and XMML project leader SimonCox is the convenor of the model and syntax task group. SimonCox is also a member of the Council of the Commission
  3. the XMML schemas for geological time scales have been adopted by the NSF Chronos project, which is closely tied in with the IUGS International Commission on Stratigraphy and is also affiliated with GEON
  4. the XMML module for geochemistry reporting, known as Assay Data eXchange (ADX), is being marketed through Metech/Acquire and adopted broadly by the LIMS industry
  5. an XMML-based exchange format for tectonic plate reconstruction data "GPML 2" is expected to be adopted for several of the software packages in that area, including GPlates.
  6. XMML is contributing to several initiatives working to develop information service grids or "spatial data infrastructures" for the geosciences and exploration, including SEEGrid and AEON.


The XMML project acknowledges the generous financial support provided by Fractal Technologies, CSIRO, MERIWA, WMC, Snowdens, Placer Dome, GA, BGS, GGIPAC and Metech.
Topic revision: r34 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).