GeoServer Feature Chaining Plan

Introduction

The GeoTools app-schema module provides application schema support for GeoServer. AppSchemaDataAccess (formerly known as ComplexDataStore) maps "simple" features obtained from data stores into a deeply nested object structure that can be used to encode XML described by an application schema.

All the work proposed in this plan will be done on GeoTools trunk, inside the app-schema module.

Integration with GeoServer is not yet working: assumptions about DataStore and SimpleFeature legacy implementations need to be systematically relaxed to support more general types. This is planned as separate work.

Motivation

  • Grouping of a denormalised view was used in GeoServer 1.6.x / GeoTools 2.4.x "community-schemas" to implement a single multi-valued property. Use cases require multiple multi-valued properties in a single feature. Grouping is not suitable for this, so it has been removed. Feature chaining is the proposed replacement.
  • Configuration is monolithic and complicated. For example, if the mapping for type A has nested type B, and the mapping for type C has the same nested type B, the configuration of type B would be duplicated. Feature chaining should simplify this.
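
One plausible illustration of why grouping a denormalised view becomes awkward with more than one multi-valued property: joining a parent row to two independent child tables yields one row per combination of children. The sketch below uses plain Python lists in place of database views. All identifiers and values are made up for illustration and are not taken from the testbed data.

```python
from itertools import product

# Hypothetical child rows for a single GeologicUnit (illustrative values only).
composition_parts = ["cp.1", "cp.2", "cp.3"]
geologic_unit_parts = ["gup.1", "gup.2"]

# A denormalised view joins the parent to both child tables, so every
# composition part is paired with every unit part.
denormalised_rows = [
    {"gu_id": "gu.1", "composition": cp, "part": gup}
    for cp, gup in product(composition_parts, geologic_unit_parts)
]

# Recovering the two original value lists means de-duplicating each
# column separately after grouping on gu_id.
recovered_composition = sorted({row["composition"] for row in denormalised_rows})
recovered_parts = sorted({row["part"] for row in denormalised_rows})

print(len(denormalised_rows))  # 6 rows for 3 + 2 child values
```

The row count grows multiplicatively with each additional multi-valued property, whereas separate queries per property grow additively.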

Example of multiple multi-valued properties

GeoSciML

The GeoSciML Testbed 3 instance document contains
  • gsml:MappedFeature with gml:id="mf.25114"
which has
  • gsml:specification of gsml:GeologicUnit with gml:id="gu.26932914"
and this has
  1. multi-valued gsml:composition/gsml:CompositionPart
  2. multi-valued gsml:part/gsml:GeologicUnitPart
so requires support for multiple multi-valued properties.

Observations and Measurements - timeseries

In time-series observations we create a "SamplingFeature" under the O&M model, then specialise it to create multiple views:

  1. a simple "dot on a map" locator
  2. a container for an archive of relatedObservations
  3. one or more statistical summaries of what has been sampled and how much sampling has taken place

In each of these cases most of the properties of the sampledFeature are common, and a common configuration would not only make this easier but also enforce the contract that gml:names are consistent across the multiple representations of the same samplingFeature.

In addition, Observation features referenced by the samplingFeature/relatedObservation/Observation* property could be directly accessed independently of the sampling feature, to provide an ability, for example, to find all the observations meeting some criteria.

A final twist is the potential to include related sets of observations in a single response. This is a multi-valued property aggregated from multiple sources, which is currently unsupported by the grouping iterator strategy.

Goals

Must have

  • Implement feature chaining, where features nested inside other features are configured separately, and retrieved as a separate query.
    1. Because the query of the nested feature can return multiple features, it can be a multi-valued property.
    2. There can be an arbitrary number of these.
    3. Feature chaining thus supports multiple multi-valued properties.
  • Separate configuration of nested features to simplify configuration, reduce the size and depth of configuration XPaths, and reduce duplication.
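
The "must have" goals can be pictured with a minimal sketch, assuming in-memory lists stand in for separately configured feature sources. The names `query` and `build_geologic_unit`, the `unit_id` link property, and all sample values are hypothetical. This is not the GeoTools API, only an illustration of one separate query per chained property.

```python
# Hypothetical in-memory stand-ins for separately configured feature sources.
COMPOSITION_PARTS = [
    {"id": "cp.1", "unit_id": "gu.1", "lithology": "basalt"},
    {"id": "cp.2", "unit_id": "gu.1", "lithology": "andesite"},
    {"id": "cp.3", "unit_id": "gu.2", "lithology": "granite"},
]
GEOLOGIC_UNIT_PARTS = [
    {"id": "gup.1", "unit_id": "gu.1", "role": "member"},
]
GEOLOGIC_UNITS = [{"id": "gu.1"}, {"id": "gu.2"}]


def query(source, **equals):
    """Return every feature in source whose properties equal the given values."""
    return [f for f in source if all(f.get(k) == v for k, v in equals.items())]


def build_geologic_unit(unit):
    """Assemble one unit, running one separate query per chained property."""
    return {
        "id": unit["id"],
        # Each query may return zero, one, or many features, so each
        # chained property is naturally multi-valued.
        "composition": query(COMPOSITION_PARTS, unit_id=unit["id"]),
        "part": query(GEOLOGIC_UNIT_PARTS, unit_id=unit["id"]),
    }


units = [build_geologic_unit(u) for u in GEOLOGIC_UNITS]
```

Adding a third multi-valued property would only mean adding another entry to the assembled dictionary, and the configuration of each nested source stays independent of its parents.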

Desirable

  • It is desirable that future configuration formats support imports (or even automatic inheritance from base feature types triggered by the schema information), to reduce duplication, and allow extension of templates or profiles.
  • Support for variable expansion (e.g. Java system properties) is desirable.
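
Variable expansion could look roughly like the following sketch. The `${name}` placeholder syntax, the `expand` function, and the property names are all assumptions, since the plan does not specify a format. A dictionary stands in for Java system properties.

```python
import re

# Matches placeholders of the assumed form ${name}.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand(text, properties):
    """Replace each ${name} with properties[name]; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        return properties.get(name, match.group(0))
    return _PLACEHOLDER.sub(substitute, text)


# Hypothetical property names and values.
props = {"db.host": "localhost", "db.port": "5432"}
url = expand("jdbc:postgresql://${db.host}:${db.port}/${db.name}", props)
print(url)  # jdbc:postgresql://localhost:5432/${db.name}
```

Leaving unknown placeholders untouched is one design choice; failing fast on an undefined property would be an equally reasonable alternative.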

Issues

  • GeologicUnit/composition is a CompositionPart that has a proportion that describes its relation to the enclosing GeologicUnit. Can it still be described as a feature? Do we need pseudofeatures (fake features) to support non-feature properties?
  • Any implementation must support filter queries that operate on nested types. For example, we may want MappedFeature where MappedFeature/specification/GeologicUnit/composition/CompositionPart contains basalt, and CompositionPart is chained inside GeologicUnit, which is chained inside MappedFeature.
  • Do chained features have distinct DataStore connections to a database, or can they share?
    • What are the resource exhaustion issues?
    • Does chaining use subqueries or joins? Should this be implemented using a Strategy pattern?
  • How are features (properties) related to an enclosing feature? How is this relationship specified in the mapping file?
    • If feature chaining occurs as a separate query in the application schema language, the relationship will need to refer to a property of the chained feature in the application schema. That is, the chaining would not be able to refer to underlying properties provided by the source DataStore, only those that have been mapped.
  • Can we support multi-valued simple properties? For example, one GeologicUnit might have one gml:name, another might have 15. This would be data driven. We could use a pseudofeature to get the names, and inline them as attributes when building the feature object.
    • This approach might be needed if we are using application-schema identifiers for recording relations between features. For example, we might need to have a magic codeSpace with 17 names by which a GeologicUnit is known, to list the MappedFeature that contain it.
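
To make the nested-filter issue concrete, here is a minimal sketch that evaluates a path filter against already assembled features. The simplified path (property names only, without type names), the `lithology` property, and the helper names are assumptions. A real implementation would probably need to translate such a filter into queries on the chained sources rather than filter in memory.

```python
def values_at(node, path):
    """Yield every value reached by following path, fanning out over lists."""
    if not path:
        yield node
        return
    children = node.get(path[0], []) if isinstance(node, dict) else []
    if not isinstance(children, list):
        children = [children]
    for child in children:
        yield from values_at(child, path[1:])


def matches(feature, path, expected):
    """True if any value at the slash-separated path equals expected."""
    return any(v == expected for v in values_at(feature, path.split("/")))


# Hypothetical assembled features: CompositionPart chained inside
# GeologicUnit, which is chained inside MappedFeature.
mapped_features = [
    {"id": "mf.1", "specification": {
        "id": "gu.1",
        "composition": [{"lithology": "basalt"}, {"lithology": "andesite"}]}},
    {"id": "mf.2", "specification": {
        "id": "gu.2",
        "composition": [{"lithology": "granite"}]}},
]

basalt_ids = [
    mf["id"] for mf in mapped_features
    if matches(mf, "specification/composition/lithology", "basalt")
]
print(basalt_ids)  # ['mf.1']
```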

Work outline

  1. Make a new MappedFeature unit test similar to GeoSciMLTest (could be called FeatureChainingTest).
  2. Make a new GeologicUnit unit test similar to GeoSciMLTest. This ensures we have a working GeologicUnit configuration.
  3. Modify the mapping implementation to support nesting a GeologicUnit as the specification of a MappedFeature. Ensure existing unit tests still pass.
  4. Modify the MappedFeature unit test so that the GeologicUnit is nested within the MappedFeature as its specification. The unit test should test:
    • GeologicUnit properties are present nested in the MappedFeature.
    • Filter query can be made on MappedFeature based on GeologicUnit properties.

Implementation options

Option 1: relations are expressed in the application schema space

  • Each feature type is mapped independently from a database view into a GeoAPI object structure.
  • Nested features are specified by referring to a property that can be used in a query to determine the related features of another type.
  • Because nested features are mapped and accessed as GeoAPI trees, the database view columns of a nested feature are not available to the enclosing feature. This necessitates adding extra properties to the nested feature to allow a query to be constructed to find all the related features. For example, we might need to add a gml:name with a magic codeSpace to a CompositionPart so that it can be located when assembling a GeologicUnit. This might be tricky when we have types that are nested in multiple features.
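
A minimal sketch of Option 1, assuming mapped features are plain dictionaries: the only way to find the CompositionPart features of a GeologicUnit is through a mapped property, here a gml:name carrying a "magic" codeSpace. The codeSpace URNs, the identifiers, and the function name are hypothetical.

```python
# Hypothetical codeSpace reserved for linking parts to their GeologicUnit.
LINK_CODESPACE = "urn:example:link:GeologicUnit"

# Mapped CompositionPart features. The source view columns are no longer
# visible, so the link has to travel as an extra gml:name.
composition_parts = [
    {"gml:id": "cp.1",
     "gml:name": [{"codeSpace": LINK_CODESPACE, "value": "gu.1"}]},
    {"gml:id": "cp.2",
     "gml:name": [{"codeSpace": LINK_CODESPACE, "value": "gu.2"},
                  {"codeSpace": "urn:example:display", "value": "Upper flow"}]},
    {"gml:id": "cp.3",
     "gml:name": [{"codeSpace": LINK_CODESPACE, "value": "gu.1"}]},
]


def parts_for_unit(unit_id):
    """Find parts whose linking gml:name refers to the given unit."""
    return [
        part for part in composition_parts
        if any(name["codeSpace"] == LINK_CODESPACE and name["value"] == unit_id
               for name in part["gml:name"])
    ]


print([p["gml:id"] for p in parts_for_unit("gu.1")])  # ['cp.1', 'cp.3']
```

A type nested in several different parents would need one such linking name per parent type, which is where the approach may become tricky.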

Option 2: relations are expressed in the database view space

  • Target feature type is mapped from a database view, and nested feature types are mapped in the same file from other database views.
  • Nested features are related by specifying database view column names.
  • Because nested features are mapped in the same file as the target feature, queries can be constructed on the view column names to assemble the target feature. There is no need to add extra properties to features.
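
Option 2 can be sketched the same way, assuming the rows of two database views are available as dictionaries. The view contents, the column names, and the `LINK` declaration are hypothetical. The point is that the relation is resolved on view columns before mapping, so the mapped features carry no extra linking property.

```python
# Rows as they might come from two database views (hypothetical columns).
geologic_unit_view = [
    {"GU_ID": 1, "GU_NAME": "Unit A"},
    {"GU_ID": 2, "GU_NAME": "Unit B"},
]
composition_view = [
    {"CP_ID": 10, "GU_ID": 1, "LITHOLOGY": "basalt"},
    {"CP_ID": 11, "GU_ID": 1, "LITHOLOGY": "andesite"},
    {"CP_ID": 12, "GU_ID": 2, "LITHOLOGY": "granite"},
]

# Hypothetical link declaration, as a mapping file might express it.
LINK = {"parent_column": "GU_ID", "child_column": "GU_ID"}


def map_composition_part(row):
    """Map one view row to a nested feature; the link column is not exposed."""
    return {"gml:id": f"cp.{row['CP_ID']}", "lithology": row["LITHOLOGY"]}


def map_geologic_unit(row):
    """Map one parent row, selecting child rows by view column value."""
    key = row[LINK["parent_column"]]
    children = [r for r in composition_view if r[LINK["child_column"]] == key]
    return {
        "gml:id": f"gu.{row['GU_ID']}",
        "gml:name": row["GU_NAME"],
        "composition": [map_composition_part(r) for r in children],
    }


units = [map_geologic_unit(row) for row in geologic_unit_view]
```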

Drawing of feature chaining options

Comments

Please see InfosrvicesGeoserverFeatureChainingUserGuide for option 1 implementation details.

-- RiniAngreani - 09 Feb 2009
 
Topic attachments
  • FeatureChainingOptions.svg (21.9 K, 21 Nov 2008 - 16:09, BenCaradocDavies): Drawing of feature chaining options
Topic revision: r11 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).