
Overview

This page now contains an overview that hopefully reflects the state of play for the project.

Discussions have moved to GeoserverMappingOpenIssues.

As issues are resolved, they will be moved to either GeoserverMappingInProgress or SchemaMappingFutures.

Old discussions are saved in GeoserverMappingDiscussionArchive.

Query Mapping (decoupling the query and the output schema)

The use of constrained queries is a real-world necessity, and it leads to an abstraction between the feature type returned from a WFS and the allowable "filter" parameters.

WFS does not currently recognise this abstraction - so we will play with a few possibilities:

1) declare an "input schema" - this would effectively provide a list of all the queryable properties. The "prepared statement placeholder" effectively maps to a simple input feature where each placeholder parameter can be described by a feature property with a single, mandatory value (minOccurs=1, maxOccurs=1). Additional parameters that you may meaningfully constrain the result set with may be included as optional properties (minOccurs=0, maxOccurs=1).

This could be implemented in a related schema - so the workflow would be: find service A that offers Feature Type X

  • find schemas that service A offers to query against Feature Type X

or

  • find schemas registered in a catalog as being expected to be supported against X

or

  • find "binding" objects that allow the query to be derived - eg a Filter document template.

or

  • allow a user with magic knowledge to specify the parameters in a UI and save these in a "context"

We can support any or all of these in WebMap Composer. We are still thinking through which makes the most logical sense, since they all bury semantic knowledge in a different part of the process.

-- RobAtkinson - 19 Oct 2004
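
The "input schema" idea in option 1 can be illustrated with a small sketch. This is not part of WFS or Geoserver; the `QueryParam` structure and the parameter names (`analyte`, `minDate`) are hypothetical and only mirror the minOccurs/maxOccurs convention described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryParam:
    name: str
    min_occurs: int      # 1 = mandatory placeholder, 0 = optional constraint
    max_occurs: int = 1


# Hypothetical input schema for a constrained query against Feature Type X.
INPUT_SCHEMA = [
    QueryParam("analyte", min_occurs=1),  # prepared-statement placeholder
    QueryParam("minDate", min_occurs=0),  # optional extra constraint
]


def validate_query(schema, supplied):
    """Return a list of problems; an empty list means the query is acceptable.

    `supplied` maps parameter names to lists of values.
    """
    problems = []
    known = {p.name for p in schema}
    for name in supplied:
        if name not in known:
            problems.append(f"unknown parameter: {name}")
    for p in schema:
        count = len(supplied.get(p.name, []))
        if count < p.min_occurs:
            problems.append(f"missing mandatory parameter: {p.name}")
        elif count > p.max_occurs:
            problems.append(f"too many values for: {p.name}")
    return problems
```

A client could run a check like this before filling in a Filter template, whichever of the four discovery options supplies the schema.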

I think the first two options would be preferable in the long run as they offer the most open solution. In one sense it's an extension of the requests that the WFS spec currently outlines: in addition to handling GetFeature, DescribeFeatureType, GetCapabilities, LockFeature and GetFeatureWithLock, we should be able to handle a DescribeFeatureTypeQuery (or DescribeFeatureTypeFilter if you want to remain true to the "one big XML object" model).

The last option is the one that I would imagine would provide the quickest solution, and it is probably the most practical for demonstration purposes, as people building demonstrators will have the "magic knowledge" necessary.

-- StuartGirvan - 22 Oct 2004

Is the intention that the query-schema/template might include properties that do not exist in the result?

-- SimonCox - 20 Oct 2004

Yes - absolutely - they may relate to properties of related features or calculated values.

e.g. "show me sampling sites where pH was measured" needs to query the measurements, but returns only sites.

Also, the binding of properties to vocabularies cannot be derived from the result schema either.

-- RobAtkinson - 20 Oct 2004

Non spatial DB support (mapping columns to geometry components)

Geoserver support for non-native geometries is now available. It works as before, except that you must declare the mapping in the datastore definition, and currently the last two columns of the table/bypassSQL result set must be the x,y columns.

This new capability works with or without the new schema mapping capabilities.

For configuration instructions see GeometrylessJDBCDataStore

-- RobAtkinson - 07 Oct 2004
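
As a rough illustration of the "last two columns are x,y" convention, the sketch below splits a result row into its attribute columns and a point. It is not Geoserver code; the function name `split_row` is hypothetical, and the row values are borrowed from the sample output further down this page.

```python
def split_row(row):
    """Split a result row into (attributes, (x, y)).

    Assumes, as described above, that the last two columns are x and y.
    """
    if len(row) < 2:
        raise ValueError("row needs at least an x and a y column")
    *attributes, x, y = row
    return attributes, (float(x), float(y))


attributes, point = split_row([85761, "surveyed from ground control", "122.423331", "28.813022"])
```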

Mapping to target output schemas

Current state of play (major enhancements to WFS reference implementation)
  • A query can be specified to extract result set that embeds a single master/detail pattern
  • the multi-valued "detail" property must be the last property in the XML feature encoding
  • result columns can be mapped to elements or attributes with any namespace
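
The master/detail pattern in the first two points can be pictured as grouping a flat, joined result set into one feature per master row. The sketch below is only a model of that idea, not the Geoserver implementation; the function name and the column names are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def nest_master_detail(rows, master_key, detail_key):
    """Group consecutive rows sharing `master_key` into one feature each.

    Assumes the rows are ordered by the master key, as a joined result set
    normally is. The multi-valued detail property is stored last.
    """
    features = []
    for _, group in groupby(rows, key=itemgetter(master_key)):
        group = list(group)
        # Master properties are taken from the first row of the group.
        feature = {k: v for k, v in group[0].items() if k != detail_key}
        # The detail column is collected into a list and appended last.
        feature[detail_key] = [row[detail_key] for row in group]
        features.append(feature)
    return features


rows = [
    {"SiteNo": 85761, "Datum": "UNKNOWN", "Result": 241827},
    {"SiteNo": 85761, "Datum": "UNKNOWN", "Result": 241828},
    {"SiteNo": 25026, "Datum": "UNKNOWN", "Result": 101232},
]
features = nest_master_detail(rows, "SiteNo", "Result")
```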

It is now possible to sub-nest (or "push down") a non-repeating leaf element. This is done by using "->" instead of the "/" path delimiter in the xpath attribute in schema.xml. Some examples follow.

xpath="/Geochem3/Location->PlaceName"

will generate output like

        <sco:Geochem3>
            <sco:SiteNo>25026</sco:SiteNo>
            <sco:Location>
                <sco:PlaceName>Ringneck</sco:PlaceName>
            </sco:Location>
            <sco:LocationMethod>1:100 000 topographic map (AMG66)</sco:LocationMethod>

There is no limit on the number of ->, ie a non repeating element can be nested to an arbitrary depth. Attributes are OK, eg

<xs:element name="LocationName" type="xs:string" dbJavaType="java.lang.String" nillable="true"
    minOccurs="0" maxOccurs="1" xpath="/Geochem3/zig:Location->zag:Textual@Name"/>
<xs:element name="LocationMethod" type="xs:string" dbJavaType="java.lang.String" nillable="true"
    minOccurs="0" maxOccurs="1" xpath="/Geochem3/zig:Location->zag:Textual@Method"/>

produces

<zig:Location>
    <zag:Textual Name="Ringneck" Method="1:100 000 topographic map (AMG66)"/>
</zig:Location>

The reason for the syntax change ("->") is that it is a convenient means of indicating a non-repeating element, which internally is treated as a leaf node of Geochem3, preserving the sequence of elements and attributes as enumerated in schema.xml.

Unfortunately, it is not possible to aggregate elements using this mechanism, and it would take a lot of effort to remediate. Nested elements are assumed to be able to repeat, they are always written last (which may change the sequence order), and currently there can only be one repeating element (which may or may not nest sub-elements) on any given nesting level - see the note below.

eg
xpath="/Geochem3/zig:Location->zag:Textual"
xpath="/Geochem3/zig:Location->zag:Spatial"

produces

<zig:Location>
    <zag:Textual>Ringneck</zag:Textual>
</zig:Location>
<zig:Location>
    <zag:Spatial>1:100 000 topographic map (AMG66)</zag:Spatial>
</zig:Location>
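
The behaviour shown in the examples above can be modelled in a few lines. This is a sketch of the observable output only, under the assumption that mappings sharing the same full element path (everything before "@") land on one pushed-down element, while different leaf elements each get their own wrapper. It is not how Geoserver implements it, and `build_feature` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def build_feature(mappings):
    """Build one feature from (xpath, value) pairs given in schema.xml order."""
    root = None
    leaves = {}  # full element path (without @attribute) -> leaf element
    for xpath, value in mappings:
        path, _, attr = xpath.partition("@")
        head, *pushed = path.split("->")
        steps = head.strip("/").split("/")
        if root is None:
            root = ET.Element(steps[0])
        if path not in leaves:
            # A new path always starts a new chain under the root, so two
            # different pushed-down leaves are never aggregated.
            node = root
            for step in steps[1:] + pushed:
                node = ET.SubElement(node, step)
            leaves[path] = node
        node = leaves[path]
        if attr:
            node.set(attr, value)
        else:
            node.text = value
    return root


attrs = build_feature([
    ("/Geochem3/zig:Location->zag:Textual@Name", "Ringneck"),
    ("/Geochem3/zig:Location->zag:Textual@Method", "1:100 000 topographic map (AMG66)"),
])
split = build_feature([
    ("/Geochem3/zig:Location->zag:Textual", "Ringneck"),
    ("/Geochem3/zig:Location->zag:Spatial", "1:100 000 topographic map (AMG66)"),
])
print(ET.tostring(attrs, encoding="unicode"))
print(ET.tostring(split, encoding="unicode"))
```

The first call yields a single zig:Location holding one zag:Textual with both attributes; the second yields two zig:Location wrappers, matching the limitation described above.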

Important Note

In the current implementation, within a nesting level, there can only be one element that may repeat. This element must follow the non-repeating leaf elements of the same parent in schema.xml. Schemas that do not follow this sequence cannot be supported.

For example, the following is not possible

<sco:Geochem3>
    <sco:SiteNo>85761</sco:SiteNo>
    <sco:Location>
        <sco:PlaceName>CSD120,tray 5,84.05-84.90m (drill hole)</sco:PlaceName>
    </sco:Location>
    <sco:Result gml:id="241827"/>
    <sco:Result gml:id="241828"/>
    <sco:Result gml:id="241829"/>
    <sco:Result gml:id="241830"/>  
    <sco:LocationMethod>surveyed from ground control</sco:LocationMethod>
    <sco:Latitude>28.813022</sco:Latitude>
    <sco:Longitude>122.423331</sco:Longitude>
    <sco:Datum>UNKNOWN</sco:Datum>
    <gml:location/>
</sco:Geochem3>

It can only be supported as

<sco:Geochem3>
    <sco:SiteNo>85761</sco:SiteNo>
    <sco:Location>
        <sco:PlaceName>CSD120,tray 5,84.05-84.90m (drill hole)</sco:PlaceName>
    </sco:Location>
    <sco:LocationMethod>surveyed from ground control</sco:LocationMethod>
    <sco:Latitude>28.813022</sco:Latitude>
    <sco:Longitude>122.423331</sco:Longitude>
    <sco:Datum>UNKNOWN</sco:Datum>
    <gml:location/>
    <sco:Result gml:id="241827"/>
    <sco:Result gml:id="241828"/>
    <sco:Result gml:id="241829"/>
    <sco:Result gml:id="241830"/>  
</sco:Geochem3>
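
The constraint in the note can be expressed as a simple check over the children of one parent, in schema.xml order. The sketch below is a hypothetical helper, not something Geoserver provides; each child is given as a (name, repeats) pair.

```python
def check_nesting_level(children):
    """Check one nesting level against the constraint described above.

    `children` is a list of (element_name, repeats) pairs in schema.xml
    order. Returns a list of problems; an empty list means the level fits.
    """
    problems = []
    repeating = [i for i, (_, repeats) in enumerate(children) if repeats]
    if len(repeating) > 1:
        names = ", ".join(children[i][0] for i in repeating)
        problems.append(f"more than one repeating element: {names}")
    elif repeating and repeating[0] != len(children) - 1:
        name = children[repeating[0]][0]
        problems.append(f"{name} must come after all non-repeating elements")
    return problems


unsupported = [("SiteNo", False), ("Location", False), ("Result", True),
               ("LocationMethod", False), ("Latitude", False)]
supported = [("SiteNo", False), ("Location", False),
             ("LocationMethod", False), ("Latitude", False), ("Result", True)]
```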

Results as of 28th October at GA

Attached is an XML file which I think conforms to the Geochem Schema GeochemSM.xml.

Also here are the info.xml and schema.xml files used by Geoserver to do it.

-- StuartGirvan - 28 Oct 2004

Great - I just see the following minor errors:

  1. <xmml:place-><gml:LocationString> should be just <xmml:place> (no dash)
  2. <gml:location ... /> is deprecated in GML 3.1 and so is not used in XMML - please use <gml:position ... />
  3. quantities that are ratios (xmml:resultMeasureExact, xmml:lowerDetectionLimit) should be transformed to a "pure number" - i.e. the units of measure should be "1"
  4. xmml:Responsible should be xmml:responsible (character case)
  5. xmml:Analyte (2 places) should be xmml:analyte (character case)
  6. xmml:analyte should be after xmml:procedureUsed
    • while this is unimportant for this project, XML Schema validation requires that the order of items in a sequence be preserved (this is partially a hangover from the document origins of XML, and also due to the practicalities of validatable data structures)
  7. if possible, coordinates should be changed to pos, though I understand that this is a Geoserver GML2/GML3 thing

-- SimonCox - 04 Nov 2004

  1. Done
  2. Done
  3. Still working on this
  4. Done
  5. Done
  6. Understood but see Pete/Rob's note above about possible nesting and orders.
  7. Don't think it's going to happen as part of this project

Here's the latest example attempt GeochemSM2.xml.

-- StuartGirvan - 09 Nov 2004

Questions from PIRSA

TerryHannant and MarkJolly have been looking at your example output and working out how to adapt it to PIRSA's data (also the corresponding info.xml and schema.xml)

It would help if you could clarify the following questions:

1 Have you dealt with the issue of downhole samples at all? The only examples I have seen of sample depth have been added to the location PlaceName - e.g. <sco:PlaceName>CSD120,tray 5,84.05-84.90m (drill hole)</sco:PlaceName>

2 Why is the element name "related" used but not displayed? Is this a requirement of the model - or a peculiarity of your data?

3 Why is the element xmml:position xlinked from within GeochemMeasurement - the actual coordinates do not seem to be present in this document. Is this structure the preferred/normal way to do this in XMML?

4 What does this refer to: <xmml:subject xlink:href="../../.." /> This does not seem to refer to an element.

5 In an earlier teleconference, it was indicated that fully normalised chem values were preferred (ie. a number from 0 to 1 indicating concentration). The original unit of measure would then be kept as a "quality indicator". Do you intend to move to this approach? If so, where would the unit of measure be placed?

6 What is the meaning of this: <xmml:procedureTime xlink:href="urn:ogc:items:exceptions:unknown" /> Is this just a more complete way of saying that the date/time that the assay procedure was undertaken is unknown?

7 Does your data always indicate a lower detection limit? Our data generally only indicates the lower detection limit, when the actual value detected is less than this limit. The detection limit is not always directly linked to the method used, as this may change for the same technique used with different instruments at different labs.

8 Our data includes the person who collected the sample, and the date the sample was collected. Where in your model would this data fit?

I am well aware that some of these questions indicate a lower level of understanding on my part than is desirable, but any help you can provide would be gladly accepted.

Responses

1 All measurements are related to a specimen, so it is the specimen description that will capture the downhole aspects. In particular, note that gml:pos (and the GML2 gml:coordinates) support 3-D positions. So if the "z" value is smaller than the z value of the local ground surface, it was sampled from the sub-surface. -- SimonCox - 09 Nov 2004

I haven't come across any examples with the third dimension yet and I'm not sure if Geoserver handles them. It would be worth checking out the Geoserver documentation. I'll see if I can find an example in our database and get back to you. -- StuartGirvan - 09 Nov 2004

2 In a couple of places I've had to select out the same field twice in the info.xml file to be able to generate the following types of relationships in XML:
Eg 'Related Observation'||ar.resultno related, ar.resultno ResultNo, are used for ....
<gml:location />
  <xmml:relatedObservation>
    <xmml:GeochemMeasurement gml:id="101232">
.....
m.methodno procedureUsed, m.methodno AssayProcedure, for .......
<xmml:procedureUsed>
  <xmml:AssayProcedure gml:id="2">
    <xmml:name>AAS</xmml:name>
.......
and ap.analysis_id AnalysisID, ap.analysis_id AnalysisSensitivity, for ......
<xmml:analyteDetails>
  <xmml:AnalyteSensitivity>
.......
You need to do it this way because it needs to be able to "break" (nest?) twice or start two child branches based on a new occurrence of one of those values to conform with the examples. -- StuartGirvan - 09 Nov 2004

3 When the positional information is available it should be used. When the positional information is not available then the gml:position will have a value that indicates this. Not sure why Stuart does not have a value. -- SimonCox - 09 Nov 2004

For some reason the first specimen didn't have a location, but if you look at the second specimen in the example it has a location and the xmml:position xlink directs you to the location details. -- StuartGirvan - 09 Nov 2004

4 <xmml:subject xlink:href="../../.." /> points to the great-grand-parent of the subject element - i.e. the xmml:GeochemSpecimen. -- SimonCox - 09 Nov 2004

Or in slightly more plain English it points to the subject of the Measurement (i.e. the Specimen). -- StuartGirvan - 09 Nov 2004
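
A relative reference of this kind can be followed by walking up the element tree, one parent per ".." step. The sketch below assumes the nesting GeochemSpecimen > relatedObservation > GeochemMeasurement > subject, which matches the answers above; the namespace URI `urn:example:xmml` is a placeholder, not the real XMML namespace.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
XMML = "urn:example:xmml"  # placeholder namespace URI, not the real XMML one

DOC = f"""<x:GeochemSpecimen xmlns:x="{XMML}" xmlns:xlink="http://www.w3.org/1999/xlink">
  <x:relatedObservation>
    <x:GeochemMeasurement>
      <x:subject xlink:href="../../.."/>
    </x:GeochemMeasurement>
  </x:relatedObservation>
</x:GeochemSpecimen>"""


def resolve_relative_href(root, element):
    """Follow an href made only of '..' steps, starting from `element`."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    target = element
    for step in element.get(XLINK_HREF).split("/"):
        if step != "..":
            raise ValueError(f"unsupported step: {step!r}")
        target = parent_of[target]
    return target


root = ET.fromstring(DOC)
subject = next(root.iter(f"{{{XMML}}}subject"))
target = resolve_relative_href(root, subject)
print(target.tag)
```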

5 See note to Stuart above. -- SimonCox - 09 Nov 2004

6 Yes. The URN is a "typed exception". -- SimonCox - 09 Nov 2004

7 Where should detection thresholds be recorded? In the model expressed by the schema I had proposed, the detection threshold is a property of the "assay procedure" - see XmmlSchemaRepository:trunk/XMML/geochemistry.xsd. Note that the description of the assay procedure binds a method (such as ICPMS, XRF, etc) to a set of analyteDetails, each of which binds an analyte to a detection threshold. This was intended to support the (I thought common ...) situation where a standard procedure is used by a lab, which has varying detection thresholds for different analytes. This can be recorded in one place and linked to many measurements. I'm now hearing that for you guys these details may be unique for every measurement. That is OK - you can define a specific AssayProcedure for each measurement, and probably record it inline, as Stuart shows (though in the example from Stuart the description clearly does not match the method ...). Note that, as per the schema, the analyteDetails block is optional. But I would assert that detection threshold is logically part of the assay procedure. So I think the encoding is not wrong. -- SimonCox - 09 Nov 2004

Our detection limits are linked to a combination of analytes, methods and laboratories in quite a complex (and not necessarily correct) way, but basically our data will always have a lower detection limit. The detection limits I'm showing for analytes are thus a combination of detection limits on a given species from a given laboratory and a given method (but not at a particular time, which is a shortcoming). -- StuartGirvan - 09 Nov 2004

8 This is part of the specimen description - see XmmlSchemaRepository:trunk/XMML/geochemistry.xsd and XmmlSchemaRepository:trunk/XMML/commonFeatures.xsd. A Specimen can have any number of processingStep properties, each containing, or pointing to, a ProcedureEvent - see XmmlSchemaRepository:trunk/XMML/0.9/procedures.xsd. A ProcedureEvent could record the "sampling" event. The ProcedureEvent has an eventTime and a responsible (Party), which I think can be used to capture the information that you have.

Now, the next question is whether that kind of information should be published on the WFS interface ... just because it is recorded in your database, and the XMML does support it, does not mean that you have to emit it in your data product. That is a policy decision. -- SimonCox - 09 Nov 2004

And my gut feeling about that is not to bother for the purposes of this demonstration project as we have so many other bits and pieces to sort out. Also who collected the sample and when are secondary data in the context of the actual geochemistry measurements. -- StuartGirvan - 09 Nov 2004

Comments from PIRSA largely in response to Simon's comments:

1. The approach for outputting measurements less than detection limit had the following response:

Instead of <resultMeasureExact>, use the <resultMeasureLessThan> element to report the result. These are both in the <abstractResultMeasure> substitution group.

There appears to be no mechanism in info.xml to allow for one element in one case, and another in another, depending on the data (although Terry Hannant is looking at possibilities).

For the present we will retain the existing approach - as per Stuart Girvan's XMML.

Technically I suspect this is a very tricky problem, but one to keep on our wishlist. -- StuartGirvan - 19 Nov 2004

This formulation was put up as a proposal. Would something like <resultMeasure indeterminateValue="lessThan" uom="ppm{wt}">1.0</resultMeasure> make life easier? -- SimonCox - 22 Nov 2004
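
The data-dependent choice between the two elements can be sketched outside Geoserver as follows. This assumes that a missing value, or a value below the detection limit, should be reported as "less than the detection limit"; that rule and the function name are assumptions for illustration, and info.xml itself offers no such switch according to the discussion above.

```python
import xml.etree.ElementTree as ET


def result_measure(value, detection_limit, uom):
    """Build the result element for one measurement.

    Assumption: a value of None, or one below the detection limit, is
    reported as "less than the detection limit".
    """
    if value is None or value < detection_limit:
        element = ET.Element("resultMeasureLessThan", uom=uom)
        element.text = str(detection_limit)
    else:
        element = ET.Element("resultMeasureExact", uom=uom)
        element.text = str(value)
    return element


below = result_measure(0.4, 1.0, "ppm")
exact = result_measure(3.2, 1.0, "ppm")
```

With the single-element form proposed by Simon, the same test would set an indeterminateValue attribute instead of switching element names.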

2. The comments on the mapping of PIRSA Sample_no follows Stuart Girvan's approach:

e.g. In Stuart's output: r.rockno is mapped to GeochemSpecimen in info.xml, which in turn is mapped to xpath /xmml:GeochemSpecimen@gml:id

Our sample_no is an internal primary key, as well as being the user identifier for samples at PIRSA. Note that we also have a company/original sample ID: sm_sample.other_sample_no.

Other_sample_no is implemented in gml:name, together with collected_by and collection_date.

Given the current overloading of gml:name, it would seem better to keep sample_no separate in gml:id.

Rockno is our internal primary key and our sampleid sounds like it is equivalent to your other_sample_no. I've used the rockno as the gml:id of the specimen because it is the unique identifier. So yep sounds good to me. -- StuartGirvan - 19 Nov 2004

gml:id corresponds approximately with a primary key. gml:name corresponds with an identifier that may be used externally. If your system makes the value of these identical, then it would be good practice in the XMML to put it in both places. -- SimonCox - 22 Nov 2004

3. Lookup tables

There are a number of lookup tables under https://www.seegrid.csiro.au/subversion/xmml/trunk/enumerations/

There are many items that may need to be maintained on an ongoing basis, and/or be maintained locally for each state.

e.g. LUTgeochemistry.xsd contains a list of elements and oxides etc.

How are these tables managed? - e.g. as different surveys encounter additional values. Is it intended that these tables will be managed centrally on an ongoing basis?

How are these tables referenced?

Until these issues are resolved, our best option is to include code and description in the values we use, unless these are very clearly exactly the same list as one of the SEEGrid lookup tables.

The expectation is that these schema lists would be managed centrally on an ongoing basis. That is part of the idea of SEEGRID, i.e. to provide a centrally managed set of services and standards (a registry in the web services sense) for interoperability for the Geoscience community. The corollary of this is that if you want to have input, you have to be prepared to contribute your time if Simon or whoever is managing the schema has queries.
You don't actually need to reference these schemas as such (although you can if you really want to); just make sure that your values match the values in the lookup tables ("schema enumeration lists" is technically a better way of putting it), because (in theory) they are supposed to be what our XML output will validate against. What you actually manage locally on an ongoing basis is the mapping (if required) from your lookup lists to those supplied on SEEGRID.
For the purposes of the demonstrator though all you need to do is make sure that the elements and oxides match the enumeration list, which I would imagine for the moment shouldn't require too many mappings (he said hopefully). -- StuartGirvan - 19 Nov 2004
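
A locally maintained mapping of this kind could look like the sketch below. The central list excerpt and the local codes are made up for illustration; the real values live in the enumeration schemas referenced above.

```python
# Hypothetical excerpt of a centrally managed enumeration list.
CENTRAL_ANALYTES = {"Au", "Cu", "SiO2", "Fe2O3"}

# Locally maintained mapping from in-house codes to the central values.
LOCAL_TO_CENTRAL = {"GOLD": "Au", "COPPER": "Cu", "SILICA": "SiO2"}


def to_central(local_code):
    """Translate a local code, or raise if it has no valid central value."""
    # Codes that already match the central list pass through unchanged.
    value = LOCAL_TO_CENTRAL.get(local_code, local_code)
    if value not in CENTRAL_ANALYTES:
        raise KeyError(f"{local_code!r} has no match in the central list")
    return value
```

Raising on an unmatched code, rather than passing it through, makes gaps in the local mapping visible before the output is validated.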

4. Links to Borehole feature for drillhole details

It seems logical to do this, but this would be moving much beyond Stuart Girvan's example - with the associated delay and possibilities of "breaking" the model.

It is still not clear how this would solve the sample depth problem, as this is not really a borehole attribute.

Yep I would think we ignore depth for the moment. If we get a chance to revisit over the next couple of months then we should have a go. -- StuartGirvan - 19 Nov 2004
An xmml:Specimen only has a point-location, though this can be 3-D. If we need a feature-type that allows for specimens with non-point geometry, then this needs to be added to the to-do list. -- SimonCox - 22 Nov 2004

5. Collected by and date as attributes of a sample.

Simon suggests that we use xmml:ProcedureEvent in

https://www.seegrid.csiro.au/subversion/xmml/tags/200501/XMML/procedures.xsd

This would complicate our approach, and once again would not be following Stuart's example.

Again I think we should keep it as simple as possible for the moment. The main thing that we want to be able to demonstrate is that "here is a set of geochemistry data coming from GSWA, PIRSA and GA in real time" and demonstrate all the other benefits. We can always say "and in x months time we intend on extending our published datasets to include x features" and include things like the date the sample was taken and who it was collected by. -- StuartGirvan - 19 Nov 2004

-- GregoryJenkins - 19 Nov 2004

Topic attachments
| Attachment | Size | Date | Who | Comment |
| FeedbackonGeoserverextensions.doc | 42.5 K | 27 Sep 2004 - 08:20 | StuartGirvan | |
| FeedbackonGeoserverextensions2.doc | 61.0 K | 01 Oct 2004 - 11:36 | StuartGirvan | |
| GeochemMeasurement.xml | 17.2 K | 27 Sep 2004 - 08:19 | StuartGirvan | |
| GeochemMeasurement3rd.xml | 24.7 K | 01 Oct 2004 - 11:34 | StuartGirvan | |
| GeochemSM.xml | 102.0 K | 28 Oct 2004 - 12:17 | StuartGirvan | |
| GeochemSM2.xml | 102.0 K | 09 Nov 2004 - 12:57 | StuartGirvan | |
| GeochemSpecimen.xml | 4.1 K | 27 Sep 2004 - 08:19 | StuartGirvan | |
| GeochemSpecimen3rd.xml | 8.2 K | 01 Oct 2004 - 11:35 | StuartGirvan | |
| GeoserverPatchSet2004-09-30.html | 11.8 K | 30 Sep 2004 - 09:51 | RobAtkinson | Release notes for updated build |
| MultipleSiteAgGeneric.xml | 6.4 K | 27 Sep 2004 - 08:20 | StuartGirvan | |
| SingleSiteAllGeneric.xml | 12.9 K | 27 Sep 2004 - 08:20 | StuartGirvan | |
| geoserver-2004-09-30.tgz | 18.9 K | 30 Sep 2004 - 09:57 | RobAtkinson | updates to source code |
| geosfeedback.ppt | 138.5 K | 19 Oct 2004 - 09:47 | StuartGirvan | |
| info-gm.xml | 1.5 K | 27 Sep 2004 - 08:29 | StuartGirvan | |
| info-gs.xml | 1.5 K | 27 Sep 2004 - 08:29 | StuartGirvan | |
| info.xml | 2.3 K | 28 Oct 2004 - 12:17 | StuartGirvan | |
| schema-gm.xml | 2.1 K | 27 Sep 2004 - 08:29 | StuartGirvan | |
| schema-gs.xml | 2.2 K | 27 Sep 2004 - 08:29 | StuartGirvan | |
| schema.xml | 6.6 K | 28 Oct 2004 - 12:19 | StuartGirvan | |
Topic revision: r34 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).