Geospatial information services: interoperability considerations
Interoperable services deal in standard information types encoded in standard formats
The case for community schemas
More than one organisation may have a database for information which is conceptually the same (e.g. "ore-deposits"). However, differences in organisational requirements, or merely arbitrary historical reasons, may mean that each organisation's private RDBMS schema is different. The "automatic" XML representations of data from these sources will therefore also differ, even though the information is logically the same. (They may even all be valid GML, but not use the same application schema.)
Thus, despite the use of XML as an explicit, neutral interchange format, we will not achieve interoperability in the sense that one data source can be replaced by another with no change to the consuming application.
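
The problem described above can be illustrated with a minimal sketch. It assumes a naive "automatic" mapping in which the table name becomes the element name and each column becomes a child element. The table and column names (`ORE_DEPOSIT`, `DEP_NAME`, `mineral_occurrence`, and so on) are hypothetical and stand in for two organisations' private schemas.

```python
import xml.etree.ElementTree as ET

def automatic_xml(table_name, row):
    """Naive mapping: table name -> element, column name -> child element."""
    elem = ET.Element(table_name)
    for column, value in row.items():
        ET.SubElement(elem, column).text = str(value)
    return elem

# Two hypothetical private schemas describing the same ore deposit.
org_a = automatic_xml("ORE_DEPOSIT", {"DEP_NAME": "Mount Example", "COMMOD": "Au"})
org_b = automatic_xml("mineral_occurrence",
                      {"title": "Mount Example", "commodity_code": "Au"})

def consumer_read_name(doc):
    """A consumer written against organisation A's element names."""
    node = doc.find("DEP_NAME")
    return node.text if node is not None else None

print(consumer_read_name(org_a))  # Mount Example
print(consumer_read_name(org_b))  # None
```

Both documents are well-formed XML carrying the same information, yet the consumer written for the first source finds nothing in the second.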
This situation motivates the case for decoupling the definition of the XML interchange format from any one specific source. Rather, there must be agreement, within a particular information community, on a language to be used for information exchange within that community. For geospatial data, where the XML document is generated as the response to a WFS GetFeature request, it should use a GML application language such as XMML, defined by a GML application schema.
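
For orientation, a GetFeature request in key-value-pair form might be built as in the sketch below. The endpoint `http://example.org/wfs` and the feature type name `ex:OreDeposit` are hypothetical placeholders; the parameter names follow the WFS 1.1.0 key-value encoding. The point of a community schema is that the same request sent to any conformant service would return features in the same GML application language.

```python
from urllib.parse import urlencode

ENDPOINT = "http://example.org/wfs"  # hypothetical service endpoint

def get_feature_url(type_name, max_features=10):
    """Build a WFS 1.1.0 key-value-pair GetFeature request URL."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,        # hypothetical community feature type
        "maxFeatures": max_features,
    }
    return f"{ENDPOINT}?{urlencode(params)}"

url = get_feature_url("ex:OreDeposit")
print(url)
```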
Implications of using a community-defined GML Application Language
However, this also means that (in general) it is not possible to use XML generated using the simple mapping from a SQL source to implement a WFS that uses a community-specified GML language. Rather, the information from the SQL source must be converted into GML following an explicit analysis of the mapping between the RDBMS schema and the GML application schema.
At first glance this is rather disappointing, since it clearly increases the burden of WFS implementation. However, short of requiring all information providers to use an identical RDBMS schema on the back-end, it is the price that must be paid for interoperability.
Note that this approach also has significant security advantages.
Public and private data models
In general, the mapping between the public (GML) and private (e.g. a table) representations is not trivial. The table schema will normally have been designed with the custodian's business requirements in mind (e.g. workflow, security, performance), which will often be more complex than is required by the public view. Furthermore, the tables will normally not be organised to match the complexContent models found in the feature type, and even the columns in the tables will not necessarily have the same labels as Feature properties. Some Feature properties may not be directly represented in the private representation (e.g. bounding box) and may need to be computed from the raw data at the time of request.
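
A minimal sketch of such a mapping follows. It assumes a hypothetical private row with columns `DEP_NAME`, `COMMOD` and `WORKFLOW_STATUS`, a hypothetical application namespace `http://example.org/ore`, and a geometry supplied as a list of (x, y) vertices. Column labels are renamed to feature property names, the internal workflow column is left out of the public view, and the bounding box is computed at request time rather than read from storage.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
APP = "http://example.org/ore"  # hypothetical application-schema namespace

# Hypothetical private column -> public feature property mapping.
COLUMN_TO_PROPERTY = {"DEP_NAME": "name", "COMMOD": "commodity"}

def row_to_feature(row, vertices):
    """Convert one private row plus its geometry vertices into a public feature."""
    feature = ET.Element(f"{{{APP}}}OreDeposit")
    # The bounding box is not stored privately, so compute it from the vertices.
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    bounded = ET.SubElement(feature, f"{{{GML}}}boundedBy")
    envelope = ET.SubElement(bounded, f"{{{GML}}}Envelope")
    ET.SubElement(envelope, f"{{{GML}}}lowerCorner").text = f"{min(xs)} {min(ys)}"
    ET.SubElement(envelope, f"{{{GML}}}upperCorner").text = f"{max(xs)} {max(ys)}"
    # Only mapped columns are published; unmapped ones stay private.
    for column, prop in COLUMN_TO_PROPERTY.items():
        ET.SubElement(feature, f"{{{APP}}}{prop}").text = str(row[column])
    return feature

row = {"DEP_NAME": "Mount Example", "COMMOD": "Au", "WORKFLOW_STATUS": "draft"}
feature = row_to_feature(row, [(1.0, 2.0), (3.0, 5.0), (2.0, 4.0)])
print(ET.tostring(feature, encoding="unicode"))
```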
If a single data provider is dominant in the market, then the community information model may be based on that provider's internal schema. However, in general more than one organisation will have a database for information which is "conceptually" the same (e.g. "ore-deposits"), but differences in organisational requirements, or merely arbitrary historical reasons, will mean that a given private (e.g. RDBMS) schema does not correspond directly to the community feature model. In general it is desirable to capture a broader, community-based consensus concerning the Domain of Discourse in a public model and encoding; this is the assumption underlying ISO 19109 "Rules for Application Schema". This will maximise interoperability between many potential information servers and clients.
We define full information-service interoperability as a deployment that allows a non-trivial client application to bind to multiple replaceable data services without re-configuration. This requires that every conformant data service publishes the same data product, defined in syntactic, structural, and semantic terms.
In the case of WFS, this means a service profile which specifies:
- the common GML Application Schema,
- value-space/domain (vocabulary) bindings for all properties having literal values,
- standard query patterns.
On the last point: the generic WFS pattern is that the feature type defines the query structure, expressed as a "filter". For various reasons, queries over all possible combinations of feature properties may not be achievable or desirable. Thus, it is often necessary to define a more limited set of queries that are supported by conformant services.
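
One way to picture a limited query set is as an explicit list of property combinations that a conformant service agrees to support. The sketch below is illustrative only: the property names and the `SUPPORTED_QUERIES` set are hypothetical and not taken from any published profile.

```python
# Hypothetical profile: each entry is one combination of feature
# properties that conformant services agree to support in a filter.
SUPPORTED_QUERIES = {
    frozenset({"name"}),
    frozenset({"commodity"}),
    frozenset({"commodity", "boundedBy"}),
}

def is_supported(filter_properties):
    """Return True if the filter uses an agreed combination of properties."""
    return frozenset(filter_properties) in SUPPORTED_QUERIES

print(is_supported(["commodity", "boundedBy"]))  # True
print(is_supported(["name", "commodity"]))       # False
```

A service could use a check of this kind to reject unsupported filters with a clear error instead of attempting an arbitrary query.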
For a WFS deployed over an existing data-store or other data-source, use of a common schema, vocabularies and queries is likely to require a schema translation step to convert between the private representation and the public data product, for both the Request (incoming) and the Response (outgoing).
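
The two directions of translation can be sketched as follows, using an in-memory SQLite table as a stand-in for the private store. The table, the column names and the property mapping are hypothetical. The incoming request is expressed in public property names and is rewritten against private columns; the outgoing rows are relabelled with the public names.

```python
import sqlite3

# Hypothetical mapping between public properties and private columns.
PUBLIC_TO_PRIVATE = {"name": "DEP_NAME", "commodity": "COMMOD"}
PRIVATE_TO_PUBLIC = {v: k for k, v in PUBLIC_TO_PRIVATE.items()}

def translate_request(public_filter):
    """Turn {public property: value} into a parameterised WHERE clause."""
    clauses = [f"{PUBLIC_TO_PRIVATE[prop]} = ?" for prop in public_filter]
    return " AND ".join(clauses), list(public_filter.values())

def translate_response(columns, row):
    """Relabel one private row with public names, dropping unmapped columns."""
    return {PRIVATE_TO_PUBLIC[c]: v
            for c, v in zip(columns, row) if c in PRIVATE_TO_PUBLIC}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORE_DEPOSIT (DEP_NAME TEXT, COMMOD TEXT, WORKFLOW_STATUS TEXT)")
conn.execute("INSERT INTO ORE_DEPOSIT VALUES ('Mount Example', 'Au', 'draft')")

where, params = translate_request({"commodity": "Au"})
cursor = conn.execute(f"SELECT * FROM ORE_DEPOSIT WHERE {where}", params)
columns = [d[0] for d in cursor.description]
features = [translate_response(columns, row) for row in cursor]
print(features)  # [{'name': 'Mount Example', 'commodity': 'Au'}]
```

Only column names from the fixed mapping are interpolated into the SQL text; values are passed as parameters. A filter on a property that is not in the mapping raises `KeyError` in this sketch.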
Locating the schema translator
Schema translation can be accomplished at several positions within an architecture or processing chain. Most of these have been tried.
- on the client using data received from a service structured according to a private model, but using a standard syntax (e.g. CSV, GML)
  - i.e. the traditional import approach seen in most desktop clients
  - however, interoperability is not automated, since import is followed by a "semantic" interpretation step when a human user inspects the field names and maps them to their target data structure or portrayal
- do the mapping in a proxy-WFS, which acts as a client to the base service, but presents a standards-conformant service-interface outwards
  - two example implementations are known
    - Cocoon (XSLT) over ArcIMS (used by most contributors to TestBed2 and TestBed3)
    - Ionic proxy-WFS (Java) over a raw WFS
  - this is an example of a mediator architecture
  - the mediator may be installed either
    - directly on top of the service, at the server so the composite component presents a conformant public interface, or
    - deployed as a separate service
- on the server, prior to the WFS server
  - three variations are common:
    - replicate the source data into a physical data-store (e.g. an ESRI GeoDatabase) in which the table structure matches the public schema - i.e. schema-translation into tables prior to the WFS layer, either one-off or using a cron-job
    - a variation on this is to replicate the source data into a physical data-store in the form of an XML database with the public schema (this is/was the strategy used by Galdos in their Cartalinea/Cartegna products)
    - replicate the source data into a "virtual" data-store in which the table structure matches the public schema - Oracle "materialized views" is a well-known solution that can be used for this
  - this fulfills the requirements of most "commercial" WFS implementations, where the GML Schema for responses is derived directly from the structure of the storage layer
  - this is a kind of forward-caching strategy.
- dynamically, in the server software
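
As an illustration of the "on the server, prior to the WFS server" option, the sketch below replicates a private table into a second table whose structure matches the public schema. It uses SQLite purely as a stand-in, since SQLite has no materialized views; the table names, column names and the rule that only approved rows are published are all assumptions made for the example. A scheduled job could call `replicate` to refresh the copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ORE_DEPOSIT (DEP_NAME TEXT, COMMOD TEXT, WORKFLOW_STATUS TEXT);
INSERT INTO ORE_DEPOSIT VALUES ('Mount Example', 'Au', 'approved');
INSERT INTO ORE_DEPOSIT VALUES ('Hidden Creek', 'Cu', 'draft');
""")

def replicate(conn):
    """Rebuild the public-schema table from the private table."""
    conn.executescript("""
    DROP TABLE IF EXISTS public_ore_deposit;
    CREATE TABLE public_ore_deposit AS
        SELECT DEP_NAME AS name, COMMOD AS commodity
        FROM ORE_DEPOSIT
        WHERE WORKFLOW_STATUS = 'approved';  -- assumed publication rule
    """)

replicate(conn)
rows = conn.execute("SELECT name, commodity FROM public_ore_deposit").fetchall()
print(rows)  # [('Mount Example', 'Au')]
```

A WFS that derives its response schema from the storage layer could then be pointed at `public_ore_deposit` instead of the private table.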
We need to flesh out the analysis of these strategies and write up the pros-and-cons of all.
- 25 Sep 2006, 2009-08-14