Project Scenario Implementations


The requirements are based on the scenarios described here.

Querying Requirements

A major caveat applies to everything below: Rob A needs to verify that these use cases for end-user clients are correct, as there may be smarter or different ways of doing things.

There are four types of query suggested by the scenarios:

  1. Get me all the specimens for a given area, i.e. a query by a bounding box. Step 1 of scenarios 1 and 2 (and implied as step 1 of scenario 3).
  2. Get me all the geochemistry measurements for a list of specimens. Steps 2, 3 and 4 of scenario 1, steps 3, 4 and 5 of scenario 2, and steps 1 and 2 of scenario 3.
  3. Get me all the specimens and their measurements that have a particular analyte value > X (note that in practice this may actually be a combination of a bounding box and analyte query, because you'd be subsetting from an existing set). Step 2 of scenario 2.
  4. Get me all the geochem measurement data for a particular specimen. Step 2/3 (the numbering is wrong on the slides) of scenario 3.

Because of current GeoServer limitations, we will have to create four feature types, one for each of these requirements. (Actually this is a limitation of the WFS spec's implied one-to-one mapping from query to Feature Type - we will simply declare elsewhere that the "feature types" are query objects that return the desired target feature type. This is a useful outcome of the project that will need formalisation through a change request to the WFS spec. -RA)

For the moment I've also arbitrarily limited each query to return no more than 5000 rows from the database (not equivalent to 5000 features), because an errant query could slow everything down (probably not our databases, because they're pretty robust), and it's easy to think of queries that would return all available data, e.g. a bounding box query that bounds Australia, or an analyte query that asks for all samples where SiO2 > 0. We probably need to think of a more sensible number (something like the number of rows returned divided by the typical number of analytes would give you the rough number of specimens). (We need to reconcile GeoServer's MaxFeatures with the backend query in a meaningful way. It's tricky, since the number of responses from a join is hard to predict in relation to the number of features.)
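The row-versus-feature arithmetic above can be sketched as follows. This is only a rough illustration in Python: the function names and the figure of 30 analytes per specimen are assumptions made for the example, not values taken from the GeoServer or database configuration.

```python
def estimate_specimens(row_limit: int, analytes_per_specimen: int) -> int:
    """Rough number of complete specimens that fit under a database row cap."""
    if row_limit < 0 or analytes_per_specimen <= 0:
        raise ValueError("row_limit must be >= 0 and analytes_per_specimen > 0")
    # Each specimen contributes roughly one joined row per analyte measured.
    return row_limit // analytes_per_specimen


def row_limit_for(max_features: int, analytes_per_specimen: int) -> int:
    """Row cap needed so that about max_features specimens come back whole."""
    if max_features < 0 or analytes_per_specimen <= 0:
        raise ValueError("max_features must be >= 0 and analytes_per_specimen > 0")
    return max_features * analytes_per_specimen


if __name__ == "__main__":
    # Assumed typical value: 30 analytes measured per specimen.
    print(estimate_specimens(5000, 30))  # 166
    print(row_limit_for(200, 30))        # 6000
```

Because the real number of analytes varies from specimen to specimen, either figure is only an estimate; a cap derived this way can still cut the last specimen off part-way through its measurements.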

1 Bounding Box Query

The bounding box query only needs to return the Specimen feature information, not the Specimen and Measurement information, because all the client would be doing with it at this point is plotting points on a map (and you might possibly want to know a name or ID of the specimen). You also pass a lot less information if you pass just the Specimen feature pattern, so a client can work more efficiently.

So I think for the bounding box query you'd be able to return:

 <GeochemSpecimen gml:id="GA_1_90980153">
  <place>North Hinckley Range</place>
  <relatedFeature xlink:href="urn:x-seegrid:items:exceptions:inapplicable" xlink:role="urn:x-seegrid:items:featuretypes:xmml:Project" />
  <gml:Point gml:id="GA_1_90980153_P">
   <gml:pos srsName="AMG66">-26.045584 129.009332</gml:pos>
  </gml:Point>
  <positionMethod>100k topo map</positionMethod>
  <relatedObservation xlink:href="#GA_1_90980153_Ag" />
  <relatedObservation xlink:href="#GA_1_90980153_Al" />
  <relatedObservation xlink:href="#GA_1_90980153_As" />
  <relatedObservation xlink:href="#GA_1_90980153_Bi" />
  <relatedObservation xlink:href="#GA_1_90980153_Ca" />
  <!-- further relatedObservation elements omitted -->
 </GeochemSpecimen>

Rob, you may want to clarify that it will be more efficient to use the Specimen pattern as opposed to the whole thing.

Note that the "relatedFeature" property is optional - I got the impression that for some agencies, keying specimens to a "project" was considered desirable, so I showed the home for this. But it looks like cruft if the value is always "inapplicable"! -- SimonCox - 12 Nov 2004

This is only meant to be indicative for the moment Simon. I'll post the real attempt later on. -- StuartGirvan - 12 Nov 2004

This cuts down on what is required in the info and schema XML files for this particular feature type.

There are still some issues with getting this to work because of the way the Bbox is appended to the bypass SQL.

Here are the info and schema files for the Bbox FeatureType and here is the XML filter query you'd need to send (which I nicked from Brendon, thanks!). Because the issues need to be sorted out for bounding box queries the feature type won't work at the moment (but the schema.xml file is fine and so is the top part of the info.xml file). I'll update these as soon as we have a solution.
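For readers without the attachments to hand, a bounding box request of this general shape can be sketched in Python. This is an illustration under stated assumptions, not the attached BboxQuery.xml: the type name, geometry property name and coordinates used below are hypothetical, and the element layout follows the generic WFS 1.0.0 / Filter 1.0.0 pattern (ogc:BBOX wrapping a gml:Box).

```python
import xml.etree.ElementTree as ET

WFS = "http://www.opengis.net/wfs"
OGC = "http://www.opengis.net/ogc"
GML = "http://www.opengis.net/gml"


def build_bbox_query(type_name, geom_property, min_x, min_y, max_x, max_y, srs_name):
    """Return a WFS 1.0.0 GetFeature request filtering on a bounding box."""
    for prefix, uri in (("wfs", WFS), ("ogc", OGC), ("gml", GML)):
        ET.register_namespace(prefix, uri)
    root = ET.Element(f"{{{WFS}}}GetFeature", {"service": "WFS", "version": "1.0.0"})
    # Any namespace prefix used inside type_name must be declared by the caller;
    # this sketch does not add such a declaration.
    query = ET.SubElement(root, f"{{{WFS}}}Query", {"typeName": type_name})
    flt = ET.SubElement(query, f"{{{OGC}}}Filter")
    bbox = ET.SubElement(flt, f"{{{OGC}}}BBOX")
    ET.SubElement(bbox, f"{{{OGC}}}PropertyName").text = geom_property
    box = ET.SubElement(bbox, f"{{{GML}}}Box", {"srsName": srs_name})
    # GML 2 coordinates: "minx,miny maxx,maxy"
    ET.SubElement(box, f"{{{GML}}}coordinates").text = (
        f"{min_x},{min_y} {max_x},{max_y}"
    )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical type name, property name and extent, for illustration only.
    print(build_bbox_query("geochemByBbox", "location",
                           128.0, -27.0, 130.0, -25.0, "AMG66"))
```

How the server then appends the box to the bypass SQL is the unresolved issue noted above; the sketch covers only the client side of the request.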

The remaining queries all return both specimen and measurement information and thus require both the Specimen and Measurement features. They therefore all share the same SELECT part of the SQL statement, with minor differences in their WHERE clauses. They also all use the full Specimen and Measurement feature type schema pattern and thus the same schema.xml file.

2 List of Specimens Query

This will require the Bypass SQL statement to be able to handle a list of n Specimen IDs in the query.

Because we don't know what the solution looks like yet I can't post an info.xml file for this solution.

But I would imagine the XML filter query would look something like this:

<wfs:GetFeature service="WFS" version="1.0.0">
  <wfs:Query typeName="xmml:geochemBySpecimenIDList">
    <!-- enclosing ogc:Filter elements omitted -->
    <ogc:Literal>106099, 5402, 111111, 333333, 4444444</ogc:Literal>
  </wfs:Query>
</wfs:GetFeature>

and would sub into a placeholder in the query like this:

and r.rockno in (?) ------------> and r.rockno in (106099, 5402, 111111, 333333, 4444444)

although we have sort of tried this and it didn't seem to work, so it may end up looking more like

and (r.rockno=106099 or r.rockno=5402 or r.rockno=111111 or r.rockno=333333 or r.rockno=4444444)

but you'd have to change the way placeholders work.

In this case you would not use a placeholder for the ID - the client would need to manage wrapping the ORs in an AND clause. I'll do some testing on this tonight to try to make sure it all works.
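One way to picture the two WHERE-clause shapes is to generate one placeholder per ID rather than substituting the whole list into a single "?". The Python sketch below is an assumption about how that could be done, not how the GeoServer bypass SQL mechanism works; the helper names are hypothetical, and the in-memory SQLite table only stands in for the real database.

```python
import sqlite3


def parse_id_list(literal: str) -> list[int]:
    """Split a comma-separated literal such as '106099, 5402' into integers."""
    return [int(part) for part in literal.split(",") if part.strip()]


def in_clause(column: str, ids: list[int]) -> tuple[str, list[int]]:
    """Build 'and col in (?, ?, ...)' with one placeholder per ID."""
    if not ids:
        raise ValueError("at least one specimen ID is required")
    marks = ", ".join("?" for _ in ids)
    return f"and {column} in ({marks})", list(ids)


def or_clause(column: str, ids: list[int]) -> tuple[str, list[int]]:
    """Build 'and (col=? or col=? ...)' as the fallback form."""
    if not ids:
        raise ValueError("at least one specimen ID is required")
    terms = " or ".join(f"{column}=?" for _ in ids)
    return f"and ({terms})", list(ids)


if __name__ == "__main__":
    ids = parse_id_list("106099, 5402, 111111, 333333, 4444444")
    conn = sqlite3.connect(":memory:")  # stand-in database for the demo
    conn.execute("create table rocks (rockno integer)")
    conn.executemany("insert into rocks values (?)", [(5402,), (42,), (106099,)])
    for build in (in_clause, or_clause):
        fragment, params = build("r.rockno", ids)
        sql = f"select r.rockno from rocks r where 1=1 {fragment} order by r.rockno"
        print(sql)
        print([row[0] for row in conn.execute(sql, params)])
```

Both forms bind the IDs as parameters, so the list length can vary without the values being pasted into the SQL text.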

There may be a way around the need for this type of query if the client can somehow keep hold of the query that the list was built on. That is, if the client could record that the required list comes from a bounding box query, or from a combination of bounding box and analyte query, it would simply reuse that query to regenerate the list of specimen and measurement features.

- Yes - that would be possible too - but we would need to make sure that the same "feature type" is appropriate in the scenario - or specify that we know the original query can be run against a different feature type. At any rate we are successfully testing how WFS and GML stacks up against real-world issues! -- RobAtkinson - 16 Nov 2004

This is not particularly elegant but then again neither is sending an extremely long list of ID numbers.

3 Analyte (Analyte/Bounding Box) Query

If the requirement here is simply to query on analyte then the required info.xml and query XML are pretty simple.

I've arbitrarily restricted the number of rows this returns to 5000. 5000 rows is equivalent to about 166 samples, returns an XML/XMML file of about 4.5 gig and took our internal server 3 minutes to build. The slow part appears to be the transformation into the complex XMML pattern.

Anyway I suspect you'll actually need this in conjunction with a bounding box so that you can subset onscreen correctly. And I can't put files up for that until we get the Bounding Box stuff working correctly.

4 Specimen ID Query

This is the simplest of the query patterns and is already working. Attached are the info.xml file and the XML query.

So the upshot is that if we solve the Bounding Box problem I think that's all the feature type/queries we'll need. As noted above all of this requires clarification from Rob A.

-- StuartGirvan - 12 Nov 2004

Parameter Lists

Or ontologies/semantics.

The only authoritative list we should need for the queries will be a list of analytes and they are listed below.

The master list used for xmml:analyte is here: https://www.seegrid.csiro.au/subversion/xmml/trunk/enumerations/LUTgeochemistry.xsd - note that this is a union of several sub-lists, and has been assembled primarily from inputs from GA and ALS Chemex. The list below is (I think) a subset of this, which (ultimately) might be used to configure a menu interface, but why not use the XMML enumeration? If the user asks for an analyte that is not found, then they get no result! -- SimonCox - 12 Nov 2004

Definitely a good idea to use the list in xmml, apologies I didn't realise there was one. -- StuartGirvan - 12 Nov 2004

The following could serve Rob as an arbitrary list of units against the various analytes, so that a user will know what they are querying against. E.g. I'm looking for values of Au > 20 ppb as opposed to 20 ppm (which could make me very rich). I don't think this is listed anywhere under XMML, so here's a suggestion drawn from GA's database. Conversions will have to be done both at the client for responses and at the back end of the WFS setup for requests.
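The kind of conversion meant here can be sketched in Python. This is a minimal illustration that assumes mass-based units (1 wt% = 10,000 ppm, 1 ppm = 1,000 ppb); the table and function names are hypothetical. It converts only between concentration units for the same analyte and does not convert an oxide value such as Al2O3 into its element.

```python
# Factors are expressed in ppb so that the table holds whole numbers.
PPB_PER_UNIT = {"ppb": 1, "ppm": 1_000, "wt%": 10_000_000}


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a concentration between ppb, ppm and wt% (mass basis)."""
    try:
        return value * PPB_PER_UNIT[from_unit] / PPB_PER_UNIT[to_unit]
    except KeyError as exc:
        raise ValueError(f"unsupported unit: {exc.args[0]}") from None


if __name__ == "__main__":
    # A threshold typed as 20 ppm is very different from 20 ppb.
    print(convert(20, "ppm", "ppb"))  # 20000.0
    print(convert(1, "wt%", "ppm"))   # 10000.0
```

The same function would serve in both directions: converting a request threshold into the unit stored in the database, and converting returned values into the unit the user asked for.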

to be returned as originalUOM ? -- AndyDent - 02 Dec 2004

I sat with Lesley to check on the elements you are likely to want to query against (as a geochemist/geologist) and the units you would probably want to use, hence things like noble gases and some of the more obscure elements are not here. This is from a very practical point of view - if you have a big disagreement with something that's been left out, let us know. Helpfully this will also limit the number of unit conversions we're going to have to do.

I don't know if you can subset the enumeration list https://www.seegrid.csiro.au/subversion/xmml/trunk/enumerations/LUTgeochemistry.xsd using these values, but I think the full enumeration list is too inclusive (if that makes sense).

-- StuartGirvan - 24 Nov 2004

Final Parameter List for use in demonstrator

This list should ideally be a derived subset of what is described above; however, due to time constraints we are simply listing it here.

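| Element Symbol | Element Name | Units |
| Ag | Silver | ppm |
| Al2O3 | Al2O3 | wt% |
| As | Arsenic | ppm |
| Au | Gold | ppb |
| Ba | Barium | ppm |
| Be | Beryllium | ppm |
| Bi | Bismuth | ppm |
| CO2 | CO2 | wt% |
| CaO | CaO | wt% |
| Ce | Cerium | ppm |
| Cl | Chlorine | ppm |
| Co | Cobalt | ppm |
| Cr | Chromium | ppm |
| Cu | Copper | ppm |
| F | Fluorine | ppm |
| Fe | Iron | ppm |
| Fe2O3 | Fe2O3 | wt% |
| Fe2O3TOT | Fe2O3 Total | wt% |
| FeO | FeO | wt% |
| Ga | Gallium | ppm |
| H2O+ | H2O plus | wt% |
| H2O- | H2O minus | wt% |
| Hf | Hafnium | ppm |
| K2O | K2O | wt% |
| La | Lanthanum | ppm |
| Li | Lithium | ppm |
| MgO | MgO | wt% |
| MnO | MnO | wt% |
| Mo | Molybdenum | ppm |
| Na2O | Na2O | wt% |
| Nb | Niobium | ppm |
| Nd | Neodymium | ppm |
| Ni | Nickel | ppm |
| Os | Osmium | ppb |
| P2O5 | P2O5 | wt% |
| Pb | Lead | ppm |
| Pd | Palladium | ppb |
| Pt | Platinum | ppb |
| Rb | Rubidium | ppm |
| Re | Rhenium | ppb |
| S | Sulphur | ppm |
| Sb | Antimony | ppm |
| Sc | Scandium | ppm |
| Se | Selenium | ppm |
| SiO2 | SiO2 | wt% |
| Sn | Tin | ppm |
| Sr | Strontium | ppm |
| Ta | Tantalum | ppm |
| Te | Tellurium | ppm |
| Th | Thorium | ppm |
| TiO2 | TiO2 | wt% |
| Tl | Thallium | ppm |
| U | Uranium | ppm |
| V | Vanadium | ppm |
| W | Tungsten | ppm |
| Y | Yttrium | ppm |
| Zn | Zinc | ppm |
| Zr | Zirconium | ppm |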

-- StuartGirvan - 11 Nov 2004

Proposed featureTypes

List of proposed feature types to be implemented.

A bounding box is an optional filter for all the featureTypes listed.

Definite requirements

| featureType | Filters | Data returned |
| GeochemSpecimen | None | GeochemSpecimen only |
| GeochemSpecimenByAnalyteAndValue | Analyte type and analyte value (greater than); a third filter parameter gives the UOM of the passed value parameter | GeochemSpecimen only |
| GeochemMeasurementsByAnalyteAndValue | Analyte type and analyte value (greater than); a third filter parameter gives the UOM of the passed value parameter | GeochemSpecimen and GeochemMeasurement |
| GeochemMeasurementsByID | Single specimen ID | GeochemSpecimen and GeochemMeasurement |

Possibly required

| featureType | Filters | Data returned |
| GeochemSpecimenByAnalyte | Analyte type | GeochemSpecimen and GeochemMeasurement |
| GeochemSpecimenByIDList | List of specimen IDs | GeochemSpecimen and GeochemMeasurement data |

-- TerryHannant - 24 Nov 2004
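To show how a client might pick among the proposed featureTypes, here is a small Python sketch. The function and its parameters are hypothetical; only the featureType names and the filter/data combinations come from the two lists above. The optional bounding box is left out because it applies to every featureType.

```python
from typing import Optional, Sequence


def choose_feature_type(
    analyte: Optional[str] = None,
    min_value: Optional[float] = None,
    specimen_ids: Sequence[int] = (),
    with_measurements: bool = False,
) -> str:
    """Map a set of query filters onto one of the proposed featureTypes."""
    if specimen_ids:
        if analyte is not None or min_value is not None:
            raise ValueError("ID queries do not take analyte filters")
        # Both ID-based types return specimens together with measurements.
        if len(specimen_ids) == 1:
            return "GeochemMeasurementsByID"
        return "GeochemSpecimenByIDList"  # listed as "possibly required"
    if analyte is None:
        if min_value is not None:
            raise ValueError("a value threshold needs an analyte")
        if with_measurements:
            raise ValueError("no unfiltered featureType returns measurements")
        return "GeochemSpecimen"
    if min_value is None:
        # Listed as "possibly required"; returns measurements as well.
        return "GeochemSpecimenByAnalyte"
    if with_measurements:
        return "GeochemMeasurementsByAnalyteAndValue"
    return "GeochemSpecimenByAnalyteAndValue"


if __name__ == "__main__":
    print(choose_feature_type())
    print(choose_feature_type(analyte="Au", min_value=20.0))
    print(choose_feature_type(specimen_ids=[106099]))
```

The unit of measure for the value threshold would travel as a third filter parameter on the request itself, so it does not affect which featureType is chosen.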

Querying Feedback

Based on a phone conversation with RobAtkinson today:
  1. the four queries listed at the top of this page have been amalgamated into three, as querying by a single ID and querying by a list of IDs are handled by the same query
  2. there appears to be a missing query - for optimal performance of the maps, Rob needs just the features, not measurements:
    Get me all the specimens that have a particular analyte value > X Step 2 of scenario 2.

-- AndyDent - 02 Dec 2004
Topic attachments
| Attachment | Size | Date | Who |
| AnalyteValueQuery.xml | 0.8 K | 16 Nov 2004 - 12:22 | StuartGirvan |
| BboxQuery.xml | 0.7 K | 12 Nov 2004 - 13:10 | StuartGirvan |
| SpecimenIDQuery.xml | 0.6 K | 12 Nov 2004 - 13:10 | StuartGirvan |
| geochemByAnalyteValueinfo.xml | 2.3 K | 16 Nov 2004 - 10:56 | StuartGirvan |
| geochemByBboxSchema.xml | 2.3 K | 12 Nov 2004 - 13:16 | StuartGirvan |
| geochemByBboxinfo.xml | 1.6 K | 12 Nov 2004 - 13:10 | StuartGirvan |
| geochemBySpecimenIDSchema.xml | 6.6 K | 12 Nov 2004 - 13:26 | StuartGirvan |
| geochemBySpecimenIDinfo.xml | 2.2 K | 12 Nov 2004 - 13:10 | StuartGirvan |
Topic revision: r19 - 15 Oct 2010, UnknownUser

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).