Web Feature Service
OGC-WFS provides a standard http request syntax for fine-grained data access.
The main data-request operation is GetFeature. It is parameterized by:
- feature-type (choose from a list in the GetCapabilities response)
- a projection model, being the feature-properties that should be reported (usually a subset of the properties of the feature-type described in an XML Schema obtained through the DescribeFeatureType operation)
- selection criteria (expressed as comparison operations against the feature properties)
The response is a collection of feature instances, in an XML document conforming to a GML Application schema.
Note that spatial selection is expressed as a filter operation with respect to any spatial property of the requested feature-type.
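As a rough illustration of the three parameters above, the following sketch assembles a key-value-pair GetFeature URL. The parameter names (`service`, `version`, `request`, `typeName`, `propertyName`, `filter`) follow the WFS 1.1.0 KVP encoding; the endpoint `http://example.org/wfs`, the feature type `ex:Borehole` and its properties are hypothetical, and the helper `build_getfeature_url` is not part of any library.

```python
from urllib.parse import urlencode


def build_getfeature_url(endpoint, type_name, properties=None, filter_xml=None,
                         version="1.1.0"):
    """Assemble a KVP GetFeature request URL (illustrative helper only)."""
    params = {
        "service": "WFS",
        "version": version,
        "request": "GetFeature",
        "typeName": type_name,                           # the feature-type
    }
    if properties:
        params["propertyName"] = ",".join(properties)    # the projection
    if filter_xml:
        params["filter"] = filter_xml                    # the selection criteria
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint, feature type and properties.
url = build_getfeature_url(
    "http://example.org/wfs",
    "ex:Borehole",
    properties=["ex:name", "ex:depth"],
)
print(url)
```

Omitting `properties` asks for the default set of properties of the feature type, and omitting `filter_xml` asks for all instances.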
An OGC-WFS may optionally also support Transaction, which allows insertion/update/deletion of feature data.
The feature-types served by a particular OGC-WFS may be unique to that OGC-WFS. However, for interoperability purposes it is recommended that an OGC-WFS publish its data using standard feature-type definitions defined in a community application schema. That may require a transformation from a private data model used in the data source.
The OGC-WFS specification is available from http://www.opengeospatial.org/standards/wfs. For a more detailed discussion, see WebFeatureService.
Scope of WFS
The scope of WFS is features.
It provides a method for addressing information objects packaged at this level of granularity.
Requests on a WFS are restricted to features of the type(s) listed in its capabilities statement.
The result of a GetFeature request is packaged as an XML document.
The content model is primarily defined by an XML Schema describing the feature type requested, and secondarily by the properties selected (from amongst the optional properties) in the <Query> element within the GetFeature request.
The feature instances that are reported are selected by tests on values within their properties, described using a <Filter> element within the GetFeature request.
These may include spatial queries.
Effectively, the WFS instance could be thought of as a (set of) virtual XML document(s) (one for each feature type).
A successful query simply results in a valid extract, consisting of a FeatureCollection containing features of the requested type that satisfy the filter.
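To make the roles of the `<Query>` and `<Filter>` elements concrete, here is a minimal sketch that builds the XML body of a GetFeature request with Python's standard `xml.etree.ElementTree`. The element layout follows the WFS 1.1.0 and Filter 1.1.0 schemas as far as we understand them; the feature type `Borehole` and its properties are hypothetical, and a real request would also declare the namespace of the feature type.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"
OGC_NS = "http://www.opengis.net/ogc"


def build_getfeature_xml(type_name, properties, filter_property, filter_value):
    """Build a GetFeature body with one Query and one equality Filter."""
    ET.register_namespace("wfs", WFS_NS)
    ET.register_namespace("ogc", OGC_NS)
    root = ET.Element(f"{{{WFS_NS}}}GetFeature",
                      {"service": "WFS", "version": "1.1.0"})
    # The Query names the feature type and the properties to report.
    query = ET.SubElement(root, f"{{{WFS_NS}}}Query", {"typeName": type_name})
    for name in properties:
        ET.SubElement(query, f"{{{WFS_NS}}}PropertyName").text = name
    # The Filter selects instances by testing a property value.
    flt = ET.SubElement(query, f"{{{OGC_NS}}}Filter")
    comparison = ET.SubElement(flt, f"{{{OGC_NS}}}PropertyIsEqualTo")
    ET.SubElement(comparison, f"{{{OGC_NS}}}PropertyName").text = filter_property
    ET.SubElement(comparison, f"{{{OGC_NS}}}Literal").text = filter_value
    return ET.tostring(root, encoding="unicode")


# Hypothetical feature type and property names.
print(build_getfeature_xml("Borehole", ["name", "depth"], "lithology", "granite"))
```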
What WFS is not
WFS is not a general purpose database access method. Ad-hoc queries, data-mining, reports and summaries are not supported on the server side.
Ergo: WFS is the safe way to expose geographic data on the Web
The implication of the above two perspectives is that a WFS which correctly implements a known community GML schema can be used to provide a safe facade to a database, where the database may privately contain other information not intended for end users. This may be sensitive fields, types of queries or even proprietary database schema designs. There is always a temptation to expose proprietary out-of-the-box interface protocols. However these raise significant issues regarding how much you really want to expose to the world. A well configured WFS is a safe, limited service.
More work is required to formally describe additional query limitations. However, there is no reason these cannot be enforced at the WFS and described in service metadata and schema annotation.
The WFS protocol
Concept of operations
The OGC Web Feature Service (WFS) interface is a collection of operations, implemented as messages carried over http, which enable access to features.
There are four basic "operations" in WFS, as shown in the sequence diagram below.
- Protocol diagram for WFS:
Note that although referred to in OGC discussions as "operations", each is actually implemented as a pair of messages, corresponding to a request and response.
- GetCapabilities requests a basic description of the service instance. The service responds with a Capabilities document that identifies the service instance, summarises the requests that this service can handle, and lists the feature types that it can report.
- DescribeFeatureType requests a detailed description of specific feature types. The service responds with XML Schema documents specifying the XML encoding for the feature type. In principle, this allows the client to infer the format of an XML encoded representation of the feature, which it may use (for example) to initialise storage, or to construct the details of a valid GetFeature request.
- GetFeature requests the digital representation of specific feature instances. The request specifies
- the feature type(s) of interest,
- conditions to select the set of instances,
- the subset of properties that should be included in the response.
The service responds with a FeatureCollection containing the requested features.
- Transaction allows the client to perform insertion, update and deletion operations on the data source. In the context of the pmd*CRC DataBases project we are not currently interested in this and it will not be discussed further here.
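The first step of this sequence can be sketched as follows: a client reads the list of feature types out of a Capabilities document before issuing DescribeFeatureType or GetFeature. The sample document is a heavily trimmed, hypothetical response laid out in the style of WFS 1.1.0 (`FeatureTypeList/FeatureType/Name`); a real Capabilities document carries much more metadata, and the helper name is ours.

```python
import xml.etree.ElementTree as ET

WFS = "{http://www.opengis.net/wfs}"

# Trimmed, hypothetical Capabilities response.
SAMPLE_CAPABILITIES = """
<wfs:WFS_Capabilities xmlns:wfs="http://www.opengis.net/wfs" version="1.1.0">
  <wfs:FeatureTypeList>
    <wfs:FeatureType><wfs:Name>ex:Borehole</wfs:Name></wfs:FeatureType>
    <wfs:FeatureType><wfs:Name>ex:OreDeposit</wfs:Name></wfs:FeatureType>
  </wfs:FeatureTypeList>
</wfs:WFS_Capabilities>
"""


def feature_type_names(capabilities_xml):
    """Return the feature type names advertised in a Capabilities document."""
    root = ET.fromstring(capabilities_xml)
    path = f"{WFS}FeatureTypeList/{WFS}FeatureType/{WFS}Name"
    return [name.text for name in root.findall(path)]


print(feature_type_names(SAMPLE_CAPABILITIES))
```

Each returned name can then be used as the feature type in a DescribeFeatureType or GetFeature request.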
Note that the GetFeature operation makes use of the OGC Filter encoding. It does not use any of the more general proposed/standard XML query syntaxes. This is primarily for historical reasons (WFS was under development before XQuery etc.), but also because the OGC Filter has support for certain specialised spatial operators (the so-called Egenhofer operators). Consideration is being given to migrating to XQuery, but this would be a big change to the spec.
The purpose of WFS is to provide a web (http) "façade" or wrapper around a data service, with a standardised request and response syntax.
Behind the façade, the source of the data may take many forms:
- a GIS (e.g. ArcGIS)
- a relational database (e.g. MySQL, Oracle, SQL Server)
- an XML database (e.g. Oracle 9i, Virtuoso)
- another XML document store (e.g. the file system)
- an Object database (e.g. FracSIS)
- a live source, such as a sensor or instrument
The details of the data source are, however, assumed not to be of immediate interest to the client, who merely wants to request or operate on features.
The main tasks of the WFS software, therefore, are to:
- translate the WFS request (in particular, GetFeature) into suitable operations to access the data source (e.g. SQL), and then
- retrieve the required information and serialise this in a valid WFS response document; in particular, the result of a GetFeature operation is a GML FeatureCollection.
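The first of these tasks might look roughly like the sketch below, which turns a Filter containing a single comparison into a parameterised SQL `WHERE` clause. This is an illustration under strong assumptions, not how any particular WFS server does it: the property-to-column mapping and the column names are hypothetical, only three comparison operators are handled, and logical and spatial operators are ignored.

```python
import xml.etree.ElementTree as ET

OGC = "{http://www.opengis.net/ogc}"

# Hypothetical mapping from public feature properties to private column names.
COLUMN_FOR_PROPERTY = {"name": "dep_name", "commodity": "commod_cd"}

# The small subset of Filter comparison operators handled by this sketch.
SQL_OPERATORS = {
    "PropertyIsEqualTo": "=",
    "PropertyIsLessThan": "<",
    "PropertyIsGreaterThan": ">",
}


def filter_to_where(filter_xml):
    """Translate a single-comparison Filter into (where_clause, parameters)."""
    comparison = ET.fromstring(filter_xml)[0]      # first child of <Filter>
    op_name = comparison.tag.replace(OGC, "")
    if op_name not in SQL_OPERATORS:
        raise ValueError(f"unsupported operator: {op_name}")
    prop = comparison.find(OGC + "PropertyName").text
    literal = comparison.find(OGC + "Literal").text
    # A KeyError here rejects properties that are not part of the public view.
    column = COLUMN_FOR_PROPERTY[prop]
    # The literal is passed as a bound parameter, never spliced into the SQL.
    return f"{column} {SQL_OPERATORS[op_name]} ?", [literal]


example = (
    '<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">'
    "<ogc:PropertyIsEqualTo>"
    "<ogc:PropertyName>commodity</ogc:PropertyName>"
    "<ogc:Literal>Au</ogc:Literal>"
    "</ogc:PropertyIsEqualTo>"
    "</ogc:Filter>"
)
print(filter_to_where(example))
```

Because only mapped properties and listed operators are accepted, the client can never reach columns or query forms outside the public view.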
The response is in terms of a public view of the information, according to a GML application language which is a representation of a community model.
It is assumed that there is a straightforward mapping from the custodian's private data model, but the latter is of no direct interest to the user.
The intention is that the same XML representation of semantically equivalent objects should be available from a variety of sources, each having potentially different private datamodels.
Alternatively, a single data source might publish views according to a variety of GML Application Languages, each of which might emphasize a different aspect of the same information.
For a discussion of relationships between private and public (GML) views of data, see SchemaMapping.
Open- vs. closed-world assumptions
Within a GML instance, the value of a property may be given using a reference to a resource, denoted by a URI (see GmlImplementation).
This may be a network-accessible resource or an intra-document link.
The fact that the relationships described in a document can cross the boundary of the source datastore is a key difference between web-hosted information and conventional closed-world databases.
Scalability requires that the referring application is tolerant of the possibility that the target resource is not available, or uses an unknown encoding.
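A minimal sketch of such tolerance, assuming the client keeps an index of the features in the current document keyed by identifier and is given some function for fetching remote resources (both `local_features` and `fetch` are hypothetical names introduced here):

```python
def resolve_reference(href, local_features, fetch):
    """Return the target of a property reference, or None if it is unavailable.

    href           -- the URI given as the property value
    local_features -- dict of features in the current document, keyed by id
    fetch          -- caller-supplied function that retrieves a remote resource
    """
    if href.startswith("#"):
        # Intra-document link: look the target up locally.
        return local_features.get(href[1:])
    try:
        # Network-accessible resource: it may be missing or unreadable.
        return fetch(href)
    except (OSError, ValueError):
        # Treat an unreachable or undecodable target as "unknown", not as fatal.
        return None


def unavailable(url):
    """Stand-in fetcher that simulates a remote resource being offline."""
    raise OSError("resource not reachable: " + url)


features = {"f1": {"name": "Borehole 1"}}
print(resolve_reference("#f1", features, unavailable))
print(resolve_reference("http://example.org/feature/42", features, unavailable))
```

The caller then decides how to present a property whose target came back as `None`, rather than failing the whole document.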
The case for community schemas
More than one organisation may have a database for information which is conceptually the same (e.g. "ore-deposits"). However, differences in the organisational requirements, or maybe just arbitrary historical reasons, may mean that the private RDBMS schema is different. The "automatic" XML representations of data from these sources will therefore be different, even though the information is logically the same. (They may even all be valid GML, but not using the same application schema.)
Thus, despite the use of XML as an explicit, neutral interchange format, we will not achieve interoperability, such that one data source can be replaced by another with no change to the consuming application.
This situation motivates the case for decoupling the definition of the XML interchange format from any one specific source. Rather, there must be agreement, within a particular information community, on a language to be used for information exchange within that community. For geospatial data, where the XML document is generated as a response to a WFS GetFeature request, it should use a GML Application language such as XMML, defined by a GML application schema.
Implications of using a community-defined GML Application Language
However, this also means that (in general) it is not possible to use XML generated using the simple mapping from a SQL source to implement a WFS that uses a community specified GML language. Rather, the information from the SQL source must be converted into GML following an explicit analysis of the mapping between the RDBMS schema and the GML application schema.
At first glance this is rather disappointing, since it clearly increases the burden of WFS implementation. However, short of requiring all information providers to use an identical RDBMS schema on the back-end, it is the price that must be paid for interoperability.
Note that it also has significant security advantages.
Public and private data models
In general, the mapping between the public (GML) and private (e.g. a table) representations is not trivial. The table schema will normally have been designed with the custodian's business requirements in mind (e.g. workflow, security, performance), which will often be more complex than is required by the public view. Furthermore, the tables will normally not be organised to match the complexContent models found in the feature type, and even the columns in the tables will not necessarily have the same labels as Feature properties. Some Feature properties may not be directly represented in the private representation (e.g. bounding box) and may need to be computed from the raw data at the time of request.
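The kind of mapping described here can be sketched as follows. The column names, property names and the use of a simple coordinate list are all hypothetical; the point is only that columns are renamed, private columns are dropped, and a derived property such as the bounding box is computed at request time.

```python
# Hypothetical mapping from private column names to public feature properties.
# Columns absent from this mapping (e.g. audit fields) are never published.
PROPERTY_FOR_COLUMN = {"dep_name": "name", "commod_cd": "commodity"}


def row_to_feature(row, coordinates):
    """Convert one private table row into a dict of public feature properties.

    row         -- dict of column name to value, as read from the private store
    coordinates -- list of (x, y) tuples describing the feature geometry
    """
    properties = {
        PROPERTY_FOR_COLUMN[column]: value
        for column, value in row.items()
        if column in PROPERTY_FOR_COLUMN
    }
    # The bounding box is not stored; it is derived from the raw geometry.
    xs = [x for x, _ in coordinates]
    ys = [y for _, y in coordinates]
    properties["boundedBy"] = (min(xs), min(ys), max(xs), max(ys))
    return properties


row = {"dep_name": "Mount X", "commod_cd": "Au", "audit_user": "jsmith"}
print(row_to_feature(row, [(1.0, 5.0), (3.0, 2.0)]))
```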
If a single data provider is dominant in the market, then the community information model may be based on their internal schema. However, in general more than one organisation will have a database for information which is "conceptually" the same (e.g. "ore-deposits"), but differences in the organisational requirements, or maybe just arbitrary historical reasons, will mean that the (e.g.) RDBMS schema does not correspond directly to the community feature model. In general it is desirable to capture a broader, community-based consensus concerning the Domain of Discourse in a public model and encoding. This is the assumption underlying ISO 19109 "Rules for Application Schema". This will maximise interoperability between many potential information servers and clients.
We define full information-service interoperability as a deployment that allows a non-trivial client application to bind to multiple replaceable data services without re-configuration. This requires that every conformant data service publishes the same data product, defined in syntactic, structural, and semantic terms.
In the case of WFS, this means a service profile which specifies
- the common GML Application Schema,
- value-space/domain (vocabularies) bindings for all properties having literal values,
- standard query patterns.
On the latter: the generic WFS pattern is that the feature-type defines the query structure expressed as a "filter". For various reasons queries over all possible combinations of feature properties may not be achievable or desirable. Thus, it is often necessary to define a more limited set of queries that are supported by conformant services.
For a WFS deployed over an existing data-store or other data-source, use of a common schema, vocabularies and queries is likely to require a schema translation step to convert the private representation to the public data product, for both the Request (incoming) and Response (outgoing).
Locating the schema translator
Schema translation can be accomplished at several positions within an architecture or processing chain. Most of these have been tried.
- on the client using data received from a service structured according to a private model, but using a standard syntax (e.g. CSV, GML)
- i.e. the traditional import approach seen in most desktop clients
- however, interoperability is not automated, since import is followed by a "semantic" interpretation step when a human user inspects the field names and maps them to their target data structure or portrayal
- do the mapping in a proxy-WFS, which acts as a client to the base service, but presents a standards-conformant service-interface outwards
- two example implementations are known
- Cocoon (XSLT) over ArcIMS (used by most contributors to TestBed2 and TestBed3)
- Ionic proxy-WFS (Java) over a raw WFS
- this is an example of a mediator architecture
- the mediator may be installed either
- directly on top of the service, at the server so the composite component presents a conformant public interface, or
- deployed as a separate service
- on the server, prior to the WFS server
- three variations are common:
- replicate the source data into a physical data-store (e.g. an ESRI GeoDatabase) in which the table structure matches the public schema - i.e. schema-translation into tables prior to the WFS layer, either one-off or using a cron-job
- a variation on this is to replicate the source data into a physical data-store in the form of an XML database with the public schema (this is/was the strategy used by Galdos in their Cartalinea/Cartegna products)
- replicate the source data into a "virtual" data-store in which the table structure matches the public schema - Oracle "materialized views" is a well-known solution that can be used for this
- this fulfills the requirements of most "commercial" WFS implementations, where the GML Schema for responses is derived directly from the structure of the storage layer
- this is a kind of forward-caching strategy.
- dynamically, in the server software
We need to flesh out the analysis of these strategies and write up the pros-and-cons of all.
- 25 Sep 2006, 2009-08-14