
acQuire Conference 2003, Perth, WA 16-17 October 2003

The Assay Data eXchange Project

Simon Cox - Simon.Cox@csiro.au

Robert Woodcock - Robert.Woodcock@csiro.au

CSIRO Exploration and Mining ARRC PO Box 1130 Bentley, WA 6102 Australia

Bill Withers - Bill.Withers@metech.com.au

MeTech PO Box 933 Canning Bridge, WA 6153

David Hester - admin@pindansoftware.com.au

Pindan Software PO Box 155 Hillarys, WA, 6025

The exchange of assay data is a key element of most mineral exploration projects. Exchange takes place between many parties involved in the process, including between exploration companies and analytical service providers, within exploration companies, between providers, and with regulators.

Currently data is exchanged in a variety of electronic and hardcopy formats, which are often based on the provider's LIMS. SIF, a fixed-column text format, is the closest to an industry standard. However, it has a number of known deficiencies and, despite several revisions during the 1990s, is considered incomplete by many users. Some exploration companies have therefore developed more specific reporting formats that carry additional information they consider essential. The Newmont reporting format (1992) is a good example.

The eXtensible Markup Language (XML) from W3C has emerged as the most widely accepted method for encoding structured data in a text document. Tools for manipulating XML are widely available. It was thus considered desirable to develop an XML-based document format for the exchange of assay data. Because it is a self-describing plain-text format, XML may also be useful in statutory reporting and other archival uses (i.e. data transfer across time rather than between organisations).

A number of XML-based encodings for scientific data have been developed. These are mainly concerned with serialising tabular data. However, while assay data are typically summarised in tabular form - samples vs analytes - a variety of exceptions are usually present, mostly related to quality control procedures. These exceptions mean that a simple table holding homogeneously typed values cannot capture all the required information.

The XMML project, coordinated by CSIRO, has been developing XML encodings for a variety of geoscience and exploration data. XMML is based on the Geography Markup Language (GML) from the Open GIS Consortium (OGC). GML is an XML implementation of components for geographic information specified in the ISO 191** series of standards from ISO/TC 211. It is compatible with the web service interfaces developed through OGC, such as Web Feature Service, which supports retrieval of data from remote datastores via a standard web interface. OGC is supported by all the major software vendors, including GIS and CAD vendors, and by many government organisations. Support for OGC-specified interfaces and encodings is being incorporated in mainstream software from Oracle, ESRI, MapInfo, Intergraph and many others.

Of particular interest for Assay Data Exchange is a related OGC recommendation on Observations and Measurements (O&M), which includes detailed support for metadata associated with individual measurements. This provides a solid basis for implementation of an Assay Data Exchange format.

O&M implements a model which is derived from measurement theory. It treats each individual measurement as an object - the "measurement event". A measurement is made
  • regarding a particular target object - the specimen
  • concerning a particular observable or phenomenon - the concentration of a specified analyte
  • using a particular procedure - the combination of specimen preparation, delivery method and instrument
  • at a particular time and by a particular operator
and results in
  • an estimate of the value of the phenomenon.
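As an illustration only, the measurement event described above could be sketched as a small Python record. The field names, the separate `unit` field and the sample values are assumptions made for this sketch; they are not taken from O&M or ADX.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Measurement:
    """One measurement event in the O&M sense (field names are illustrative)."""
    specimen_id: str    # target object: the specimen
    analyte: str        # observable: concentration of this analyte
    procedure_id: str   # specimen preparation + delivery method + instrument
    time: datetime      # when the measurement was made
    operator: str       # who made it
    value: float        # estimate of the value of the phenomenon
    unit: str           # assumed extra field: unit of the estimate


# Hypothetical example values.
m = Measurement(
    specimen_id="S001",
    analyte="Au",
    procedure_id="FA50-AAS",
    time=datetime(2003, 10, 3, 9, 30),
    operator="lab-tech-1",
    value=0.42,
    unit="ppm",
)
print(f"{m.analyte} in {m.specimen_id}: {m.value} {m.unit}")
```

Treating each measurement as its own object is what lets metadata such as the procedure or operator differ from one value to the next.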

Considering the classic samples vs. analytes table:
  1. each cell in the table contains the result of a single measurement; its "coordinates" are (analyte, sample)
  2. each row represents a suite of properties of a single object, i.e. the specimen
  3. each column represents the variation of a single property across the suite of specimens
The three views therefore hold the same information and can be transformed into one another in a straightforward way. Each view suits a different purpose, in part related to the processing chain or workflow, respectively:
  1. database insertion or update
  2. object description
  3. pattern or anomaly detection
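The transformations between the three views can be sketched in a few lines of Python. This is a minimal illustration with made-up values; it assumes exactly one value per (analyte, specimen) pair, so it ignores the quality-control exceptions (replicates, repeats) that real assay data contain.

```python
from collections import defaultdict

# Cell view: one entry per measurement, keyed by (analyte, specimen).
cells = {
    ("Au", "S001"): 0.42,
    ("Cu", "S001"): 130.0,
    ("Au", "S002"): 0.07,
    ("Cu", "S002"): 95.0,
}


def by_specimen(cells):
    """Row view: all properties measured on each specimen."""
    rows = defaultdict(dict)
    for (analyte, specimen), value in cells.items():
        rows[specimen][analyte] = value
    return dict(rows)


def by_analyte(cells):
    """Column view: variation of one property across the specimens."""
    cols = defaultdict(dict)
    for (analyte, specimen), value in cells.items():
        cols[analyte][specimen] = value
    return dict(cols)


rows = by_specimen(cells)
cols = by_analyte(cells)
print(rows["S001"])  # object description of one specimen
print(cols["Au"])    # one analyte across the suite, e.g. for anomaly detection
```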

For Assay Data eXchange (ADX) we have implemented the O&M model following the XMML/GML encoding pattern. Public version 1 of ADX provides one top-level document, the Report. A Report has five main sections:
  1. requestor information
  2. provider information
  3. the set of analytical and specimen preparation procedures
  4. the set of specimens on which analyses were performed, including relationships between these where applicable (splits, etc.)
  5. the set of measurements, keyed to the specimens and analytical procedures, and including relationships with other measurements where applicable (replicates, repeats, etc.).
Note in particular that the description of the specimen is kept separate from the measurement. This allows multiple measurements to be recorded against each specimen, and is quite distinct from common practice where they are often bundled as "sample".
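To make the five-section layout concrete, the following sketch builds a toy document with Python's standard `xml.etree.ElementTree`. Every element name, attribute name and value here is a hypothetical placeholder chosen for illustration; none is taken from the ADX schema, which should be consulted for the real structure.

```python
import xml.etree.ElementTree as ET

# All element and attribute names below are hypothetical placeholders,
# not the names defined by the ADX schema.
report = ET.Element("Report")
ET.SubElement(report, "requestor").text = "Example Exploration Pty Ltd"
ET.SubElement(report, "provider").text = "Example Assay Lab"

procedures = ET.SubElement(report, "procedures")
ET.SubElement(procedures, "Procedure", id="P1").text = "Fire assay, AAS finish"
ET.SubElement(procedures, "Procedure", id="P2").text = "Four-acid digest, ICP"

specimens = ET.SubElement(report, "specimens")
ET.SubElement(specimens, "Specimen", id="S001")
# A split records its relationship to the parent specimen.
ET.SubElement(specimens, "Specimen", id="S001-A", splitOf="S001")

# Measurements refer to specimens and procedures by key, so several
# measurements can be recorded against the same specimen.
measurements = ET.SubElement(report, "measurements")
for specimen_id, procedure_id, analyte, value in [
    ("S001", "P1", "Au", "0.42"),
    ("S001", "P2", "Cu", "130"),
    ("S001-A", "P1", "Au", "0.40"),
]:
    ET.SubElement(
        measurements, "Measurement",
        specimen=specimen_id, procedure=procedure_id, analyte=analyte,
    ).text = value

ET.indent(report)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(report, encoding="unicode"))
```

Because the specimen description lives in its own section, adding a further measurement for S001 touches only the measurements section.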

To demonstrate the standard processing technology available for XML, and in particular to implement the simple transformations outlined above, we have developed XSLT scripts that transform ADX Reports into
  • an HTML approximation of the Newmont tabular format, with some dynamic functionality through hyperlinks,
  • SIF.
An importer for the acQuire system has also been developed.
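The project's transformations are written in XSLT. As a rough stand-in (Python's standard library has no XSLT processor), the sketch below shows the same kind of pivot from a measurement list to a specimens vs analytes HTML table. The input document and its element names are hypothetical and much simpler than a real ADX Report, and the output does not attempt to reproduce the Newmont layout.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified report; element names are not from the ADX schema.
REPORT_XML = """
<Report>
  <measurements>
    <Measurement specimen="S001" analyte="Au">0.42</Measurement>
    <Measurement specimen="S001" analyte="Cu">130</Measurement>
    <Measurement specimen="S002" analyte="Au">0.07</Measurement>
  </measurements>
</Report>
"""


def report_to_html(xml_text):
    """Pivot a flat list of measurements into a specimen-by-analyte table."""
    root = ET.fromstring(xml_text)
    table = {}     # specimen -> {analyte: value}
    analytes = []  # column order, in order of first appearance
    for m in root.iter("Measurement"):
        specimen, analyte = m.get("specimen"), m.get("analyte")
        table.setdefault(specimen, {})[analyte] = m.text
        if analyte not in analytes:
            analytes.append(analyte)

    header = "".join(f"<th>{escape(a)}</th>" for a in analytes)
    lines = ["<table>", f"<tr><th>Specimen</th>{header}</tr>"]
    for specimen, values in table.items():
        # Missing measurements become empty cells.
        cells = "".join(f"<td>{escape(values.get(a, ''))}</td>" for a in analytes)
        lines.append(f"<tr><td>{escape(specimen)}</td>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


html_table = report_to_html(REPORT_XML)
print(html_table)
```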

An ADX Report is intended to support the primary use-case of a laboratory reporting the results of a batch to the client. The document structure is specified formally using the W3C XML Schema definition language. Variations on the document model may be developed later for other use-cases, such as statutory reporting, intra- and inter-company data exchange, assay requests, etc.

The ADX Report is a robust, flexible framework for assay reporting. Because it is based on a model from generic measurement theory, it will be easy to enhance in future to accommodate related applications, for example in medical areas. Because it is XML, it is compatible with web-delivery technologies and easily accommodated in a variety of workflows. XSLT transformations allow a variety of additional delivery formats to be generated from the same source document. The modular XML technology means that ADX may be easily upgraded, and XML Schema allows a rigorous, formal, machine-processable definition of the document format.

For more information and examples, please see the XMML website

  • ADX Report document structure (XML Spy schema view): adxReport.png

-- SimonCox - 03 Oct 2003
Topic attachments
  • ADXatAcquire2003.zip (8003.7 K, 14 Oct 2003 13:18, RobertWoodcock): Powerpoint and Movie
  • adx-Acquire2003-Notes.pdf (2618.7 K, 14 Oct 2003 13:18, RobertWoodcock): Powerpoint Notes - PDF
  • adx-Acquire2003-Slides.pdf (5324.9 K, 14 Oct 2003 13:18, RobertWoodcock): Powerpoint Slides - PDF
  • adxReport.png (11.1 K, 30 Sep 2003 15:45, SimonCox): ADX Report document structure (XML Spy schema view)
Topic revision: r8 - 15 Oct 2010, UnknownUser

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).