SEEgrid Roadmap - Information Viewpoint
An information model underlies every data access and processing system.
The model defines what object types may be operated on, and constrains what operations are available.
In the ideal case the model captures the essence of the technical language of the users.
Depending on how wide or narrow the user base is, the information model may be more or less specialised.
In turn this constrains the level of specialisation of processing applications within the system.
In general, information with richer semantics provides a basis for richer processing with less user intervention.
Distributed processing requires that the information is transferred between system components.
The semantics of the information transferred constrains the use that can be made of it by other components.
This is particularly important when using processing chains to accomplish complex operations.
The type of information relayed by each link must make sense in context.
The discussion is kept mostly at a conceptual level, but for illustrative purposes we present some implementation-level examples, primarily using XML and XML Schema Language.
Data models and semantics
In a computational system involving information transfer, the information model is realised in file or stream formats.
In many cases the description of the file format is the only formal documentation, so it may be necessary to analyse this as a proxy for the data model.
Some formats are targeted at generic processing applications, and only explicitly capture low-level semantics.
For example: HTML documents are for rendering in a web browser; spreadsheets are for loading in an application that manipulates data stored in table cells.
These are "generic", in the sense that the organisation of the information is independent of the application domain.
Higher-level semantics may be given in metadata or may be implied by layout conventions (e.g. column headings in tables), but this information does not directly affect processing.
Richer data models are based on the conceptual significance of the data, not just its structure.
For example, to support decision making in their domain, geoscientists usually talk about "faults", "plutons", "boreholes" and "measurements", not "points", "lines" and "polygons", and certainly not “tables”, “tuples”, “lists” or “pages” etc.
The latter are geometry-centric and representation-centric abstractions, which are necessary at an implementation level, but are not used when information is being conveyed between practitioners.
Similarly, when information is being transferred between components of a distributed processing system, an effective encoding will capture its meaning, and not just its geometric abstraction.
The aim is for the digital representation to reflect the language used by practitioners in the application domain.
In the context of web-services, the exchange format is usually based on XML, in an application language described using the W3C XML Schema Language.
XML documents provide direct serialisation of tree-based data structures.
However, the tree may reflect models at various conceptual levels, from page-layout, through tables, to fully structured conceptual data models.
Information models for geographic information systems
Most conventional GIS require the user to work with a geometry-centric data model (points, lines, polygons) loosely coupled to attribute tables.
Useful maps can be produced from these using colour, symbolisation and overlay.
The technology is mature and broadly available.
But these data models give only limited guidance as to the meaning of the data, since the same structures are shared by data types that are semantically distinct.
For example, it is necessary to explicitly inform an application that the layer called "ROADS" is a valid basis for routing analysis, while "FENCES" isn't (unless they are electric fences), even though they are both formatted as sets of curves.
More sophisticated processing typically starts by requiring a human to interpret layer or column names.
The converse also occurs, where information with the same underlying meaning is delivered in varying structures.
For example, a physical field, such as gravity or temperature, may be represented either as a raster or as a set of point samples.
Much of the meaning of the information in geometry-centric representations is typically captured in words accompanying the data.
Interoperability can be achieved through use of standard words in layer names, attribute tables, and authority tables for attribute values.
These are often established by a dominant data supplier, but the approach might be consolidated by a clearly articulated governance framework to establish and maintain the model and terminology for the community of interest.
However, there are other limitations in the geometry-centric model.
In particular, an object is tied to a single geometry at a single scale.
While this has certain advantages in software implementations, it biases the representation of real-world objects towards simple models using a limited set of geometries.
The feature model
Newer systems have moved away from geometry-centric modelling, in favour of models pitched at a higher semantic level, using a model for information whose central concept is the geographic Feature.
A feature instance is an identifiable object in the world, or the digital representation of it.
As a rule-of-thumb, if an object
- has a name, serial number, or can otherwise be assigned a unique identifier
- is interesting enough that a description of it might be transferred from one party to another
then it is a candidate "feature".
This may include things from the following general categories:
- physical objects created by people, such as mines and boreholes
- natural objects, such as islands, faults, beds
- objects defined by a combination of natural and social/economic factors, such as ore-bodies
- transient objects such as events, including observation events
- "coverages" - objects that encapsulate the variation of a property in space &| time
Note also that the same information set may be considered for different purposes, and therefore expressed as different feature types.
For example, given a set of observations made on a set of specimens collected in a borehole, we might
- describe each observation separately, with its metadata describing the procedures applied, date and operator etc, as an "observation" feature
- bundle observations on a specimen-by-specimen basis, with a set of properties for each "specimen" feature
- bundle results to show the variation of a single property along the borehole, as a "coverage" feature
These are all legitimate views of the information, and thus sensible feature types to use.
See more discussion at https://www.seegrid.csiro.au/twiki/bin/view/Xmml/InformationViews
Feature types and feature properties
Following classical category theory, features are classified into feature types on the basis of common sets of characteristics or properties (see the General Feature Model (GFM) defined in ISO 19109).
The properties may be:
- attributes - static characteristics of the feature
- operations - behaviours exhibited by the feature
- associations - relationships with other features
In the REST architecture which underlies the Open GIS Consortium Web Services model, interactions are stateless.
The practical impact of this is that, while services exhibit behaviours, only static data may be transferred.
Consistent with this, an XML "document" is a static data representation, so only attributes can be described on the wire.
For consistency with the GFM, the term "properties" is used in discussions of GML.
In languages based on GML, each feature instance is described using an XML element, whose name indicates the feature-type.
Sub-elements and (occasionally) XML attributes record the properties.
See the GML documentation for more detail on GML patterns and their relationship with other information model approaches.
Property values, value-spaces
We distinguish between the property-type (its semantic value) (indicated in GML by the property XML element name), and the property-value (given by the element content).
The value may be a literal (WXS simple type), or may have explicit substructure (usually instantiated in GML using sub-elements whose structure is described in WXS complex-type).
- all features have an (optional) property description whose value is some text describing the feature using natural language;
- many features have a property position which has a Point as its value. Point is itself an "object", and has the property pos which contains the coordinates, and another property srsName gives the coordinate reference system.
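These patterns can be sketched in a feature instance. In this hypothetical fragment the feature-type name (xmml:Borehole), its namespace URI, and the collarLocation property are illustrative assumptions, while gml:description, gml:Point, gml:pos and srsName follow the GML core schemas:

```xml
<!-- Hypothetical feature instance: element name gives the feature type,
     sub-elements give the properties -->
<xmml:Borehole xmlns:xmml="http://example.org/xmml"
               xmlns:gml="http://www.opengis.net/gml">
  <gml:description>Diamond drill hole collared in 1998</gml:description>
  <xmml:collarLocation>
    <gml:Point srsName="urn:ogc:def:crs:EPSG:4326">
      <gml:pos>-31.95 115.86</gml:pos>
    </gml:Point>
  </xmml:collarLocation>
</xmml:Borehole>
```

Note how the property element (collarLocation) wraps the value object (Point), in the alternating object-property pattern characteristic of GML.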
For many properties in a domain-specific language, their values are required to be members of a specific value-space.
Text values might be selected from a code-list or vocabulary, or be required to follow a certain pattern.
(Boolean can be seen as a special case, where the value-space only has two members.)
Numeric values must have a unit-of-measure and may be limited to a certain interval or precision.
It is essential that the model provide a means to either
- constrain the property values to the relevant value-space, or
- indicate the value-space being used in the current instance.
A key part of establishing a community schema is the selection and prescription of the vocabularies, units, reference systems etc that describe the relevant value-spaces.
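Both approaches can be sketched in WXS. The type names and enumeration values below are hypothetical; the second fragment follows the GML convention of attaching a unit-of-measure to a numeric value:

```xml
<!-- Constrain a text property to a vocabulary (hypothetical terms) -->
<xs:simpleType name="LithologyTermType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="granite"/>
    <xs:enumeration value="basalt"/>
    <xs:enumeration value="sandstone"/>
  </xs:restriction>
</xs:simpleType>

<!-- A numeric value carrying an explicit unit-of-measure reference -->
<xs:complexType name="MeasureType">
  <xs:simpleContent>
    <xs:extension base="xs:double">
      <xs:attribute name="uom" type="xs:anyURI" use="required"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
```

The first fragment constrains values to the value-space; the second indicates, in the instance, which value-space (unit) is being used.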
Although "features" as used here almost always have a spatial context, the feature model does not consider geometric and spatial properties of features to be different to other properties.
A feature may have multiple geometries, each labelled with a role such as “centroid”, “boundary”, “trace”, “shape-at-1:25000-scale”, etc.
Note that even in a feature-oriented system, a geometry-centric model will often still be used behind the scenes.
The goal of the feature-centric approach is to at least insulate the user from this abstraction, in favour of a model that operates at a level which is more natural for problem solving in the application domain.
These alternatives thus correspond to different levels of abstraction, both of which may be realised in different layers in an implementation:
- the feature model corresponding to domain concepts will be shown on the interface
- a lower-level abstraction, such as the geometry-centric model, may be used internally and for storage.
But if a feature-oriented view of the information can be provided on the user-interface, it is a small extra step to make this available through software interfaces.
This then supports the deployment of a semantically-aware service architecture.
The principal alternative to the feature view is the geographic coverage, which focusses on the variation of a property within the spatio-temporal domain of interest.
This is described in ISO 19123.
Coverages based on a grid (e.g. imagery) are the most commonly encountered, though a geometry complex of any dimensionality may be the domain of a discrete coverage.
Note that the term coverage is used in a subtly different way in some common GIS software.
Coverages are often typed on the basis of their domain geometry; an alternative is semantic typing, based on the phenomenon being described.
Information models and interoperability
A vocabulary is always tied to a community.
The community may quite reasonably be scoped in a variety of ways, such as:
- a single work-group or enterprise
- a cartel or group of enterprises that have common transactional relationships
- an industry, at a local, state, national or international level
- a technical discipline, sub-discipline or group of disciplines
The size of the community that agrees on a data model fixes the boundaries of interoperability.
In order to be a member of a particular community you must agree to speak and listen to the community language.
Local or private models
Most existing data models are schemas developed within a single organisation, within which software tools will be available to manage information according to the model.
The interoperable community is effectively limited to the organisation.
The organisation may also choose to publish a description of the model - perhaps even a "GML View" - in order to make information products available.
However, the client is left with the task of dealing with information provided in a model that is foreign to them.
So if information from two different sources but covering the same topic is to be reconciled, the client must convert one or both datasets.
A client that wishes to access many information services will need to understand all of their models.
The proposition of the SEEgrid architecture is that the relevant community for many purposes is larger than a single organisation, and communication between the parties within the community should use a common language.
The language is primarily composed of the feature-types of interest, and information services
should therefore provide data products that are based on the community feature types.
Publishing data in this way will usually involve a mapping from a private model (the storage schema) to a public model (the community schema), but the burden of translation is pushed back to the server.
This is entirely appropriate, since the information service owner will have the best understanding of their internal data model and is in the best position to map it to the community model.
The community model is the lingua franca, but having this common point means that interoperability requires order N mappings (to and from each private model to the community model) rather than order N².
It is expected that this reduced complexity will eventually be seen to be the driver for significant cost/benefit advantages.
(This is a topic covered in the EnterpriseViewpoint; however, the metrics for assessing some costs derive from the information design issues raised here.)
Note that it is entirely possible, and indeed quite reasonable, for feature types with the same name to be defined independently by different communities, resulting in definitions with different models.
For example the communities may have different interests in a feature type (e.g. use vs. maintenance), leading them to focus on different properties of the same feature.
Effectively, these definitions are of different feature types, and the communities will be unable to share instances in any meaningful way.
Practically this is managed by explicitly identifying the authority for the definition in the serialisation, e.g. using XML namespaces, where the name "use:Road" is completely distinct from "dig:Road".
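The distinction can be sketched as follows; the namespace URIs and property names here are hypothetical placeholders for whatever the two communities would actually govern:

```xml
<!-- A "Road" as defined by the road-user community -->
<use:Road xmlns:use="http://example.org/ns/road-use">
  <use:routeNumber>H1</use:routeNumber>
</use:Road>

<!-- A "Road" as defined by the maintenance community: same name,
     different namespace, hence a completely distinct feature type -->
<dig:Road xmlns:dig="http://example.org/ns/road-maintenance">
  <dig:pavementType>bitumen</dig:pavementType>
</dig:Road>
```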
The Feature-type catalogue (FTC) is the primary vocabulary for the community, defining the nouns in the application language.
A complete FTC provides
- a list of feature types
- relationships between feature types
- definitions of the feature types, in terms of their properties.
An FTC may be implemented using a number of different technologies.
Use of a formal notation, such as UML, Express, or W3C
XML Schema (WXS) is essential to remove ambiguity, and also for software production.
Within the OWS context, the default implementation is as a GML application schema.
This conforms to the requirements of the Web Feature Service (WFS) interface, in which the response to a DescribeFeatureType request is the XML schema for the feature type.
Organising the Feature Type Catalogue
The FTC may be instantiated as a simple list
of feature type definitions: this is shown in ISO 19109 and ISO 19110.
Thus, the complete XML Schema
describing the GML representation of a set of feature types and property definitions - i.e. the GML Application Schema - can act as a formal FTC.
The WXS definitions in the application schema provide
- a description of the structure of the feature types of interest, in terms of their content-model (i.e. the XML Schema type definition), leading directly to the associated syntax for encoding this, as an XML document
- implications of some semantic relationships between feature types, using WXS "substitution groups" (and the supporting type derivation chains).
The semantic relationships between feature types are usually quite important.
For both data discovery and publishing, it is necessary to be able to explore the set of available feature types, in terms of their definitions, but also in terms of the relationships between types.
For example, for some operations a "fault" may be needed, while for other purposes the more generalised "geological boundary" may be adequate.
However, faults should be included when requesting boundaries.
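This example can be sketched as a WXS substitution group; the element and type names are illustrative, and gml:_Feature is the GML 3.1 abstract head element:

```xml
<!-- "Fault" is substitutable for the more general "GeologicalBoundary",
     so a request for boundaries also returns faults -->
<xs:element name="GeologicalBoundary" type="ex:GeologicalBoundaryType"
            substitutionGroup="gml:_Feature"/>
<xs:element name="Fault" type="ex:FaultType"
            substitutionGroup="ex:GeologicalBoundary"/>

<!-- WXS requires the member's type to derive from the head's type -->
<xs:complexType name="FaultType">
  <xs:complexContent>
    <xs:extension base="ex:GeologicalBoundaryType">
      <xs:sequence>
        <xs:element name="slipVector" type="gml:VectorType" minOccurs="0"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
```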
However, in this area WXS is limited.
Elements may only be assigned to a single substitution group, and membership is constrained by the requirement that the WXS type of the member is derived from the WXS type of the head - effectively a single-inheritance model, supporting a single "semantic" hierarchy.
In many cases semantic relationships are
underpinned by structural relationships, so inheritance of properties is appropriate.
But this does not always match the conceptual model.
Furthermore the peculiarities of WXS sometimes get in the way of developing the required derivation chains.
Finally, multiple independent classifications of feature-types may be required, for example in a "faceted" classification system.
Thus, it may be useful to assert the semantic relationships between feature-types independently of the WXS definitions, and to provide multiple interfaces to the feature type catalogue in support of multiple hierarchies.
There are a few methods that might be used to support this.
The OASIS Registry Information Model
addresses the issue directly by supporting multiple classification views of the same objects.
The Web Ontology Language
(OWL) is an RDF-based serialisation for semantic relationships, in the form of assertions linking one resource to another.
The nature of the relationships between resources can be defined explicitly, and includes but is not limited to "subtype-of" type relationships.
Ontologies provide for formalisation of complex semantics.
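For illustration, OWL can assert relationships between feature types independently of any schema derivation; the class URIs below are hypothetical, while the rdf, rdfs and owl namespaces are the standard W3C ones:

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/ft#Fault">
    <!-- unlike a WXS substitution group, a class may sit in more than
         one hierarchy, supporting faceted classification -->
    <rdfs:subClassOf rdf:resource="http://example.org/ft#GeologicalBoundary"/>
    <rdfs:subClassOf rdf:resource="http://example.org/ft#PlanarStructure"/>
  </owl:Class>
</rdf:RDF>
```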
SEEGrid will use ontologies as its basic unit of semantic agreement (i.e. common understandings and classifications will be treated as ontologies, not just vocabularies).
This means that "word lists" will be instantiated as objects that can, for example, be further described, cross-referenced, versioned etc.
This is consistent with the current state-of-the-art within the ASDI context; for example, the National Oceans Office portal design requirements stress the role of ontologies.
It is also consistent with mainstream IT developments, in particular the ebXML standards framework has a registry model (RIM) that supports interrelationships in what is effectively a registry view of an ontology.
The set of feature types and the relationships between them defines a feature-type ontology
for the application.
Interfaces based on this are a key to
- discovery of suitable data by a potential data consumer, and
- assignment by a data provider of data to types from a community language
The technologies used for strong- and soft-typing come into play as follows:
- Relationships between strongly-typed feature types are primarily expressed in the UML context by class-inheritance, or in the XML context by substitution group chains that depend in turn on type derivation by extension and restriction. XML Schema derivation only supports single-inheritance explicitly, and there are certain other restrictions on derivation emerging from the peculiarities of the XML Schema language. Overall, the set of relationships that can be described in either UML inheritance or XML Schema derivation is usually incomplete relative to the complete set of conceptual relationships that exist in the application domain.
- Relationships between weak-typing classifiers can be asserted directly, for example by using OWL (Web Ontology Language). Relationships can be added arbitrarily, with no requirement for any underlying relationship between the feature-type definitions (i.e. the property sets can be completely disjoint). This supports more powerful and flexible discovery mechanisms, but it also does not ensure any conceptual integrity to the ontology.
These mechanisms need not be exclusive.
For example, in order to gain the processing benefits of strong-typing, a feature type catalogue may be provided as an XML Schema defining the feature-type structures, but an associated OWL model may be provided describing the relationships between the feature types.
The latter provides a highly structured "index" of the catalogue, supporting a richer "discovery" view with additional relationships that are not possible from the XML Schema.
Duality of GML Feature Catalogue (strongly typed) and Ontology (used in weak typing slot)
The SEEGrid Roadmap introduces a novel, but practical, strategy for dealing with the dilemmas posed by different drivers for strong vs weak typing: a mechanism for formal equivalence.
Whilst some experimentation is still required, strongly and weakly typed schemas can be derived from an "ontology of feature types" view, where the ontology supports abstraction relationships as well as lower-level property data types.
This is in fact a formalisation of the ISO 19110 Feature Type Catalogue, and promises the following advantages; such a catalogue can:
- manage a richer feature type description than is possible in UML or XML Schema alone
- derive multiple mappings from the feature types to different weakly typed template objects
- provide vocabulary and ontology views naturally, to support classification of external objects
- manage multiple representations in a single "code base"
- be extended easily without compromising a self-contained schema or model
- fit easily into a community registry governance model
- be implemented in standard XML technologies (OWL, XML Schema, XSL, GML)
Developing an Information model
A general methodology for developing a community application schema is outlined in ISO 19109, from which the feature catalogue emerges. It involves four steps:
- surveying requirements from the application field
- making a conceptual model using the concepts from the GFM
- describing the application schema (feature types) in a formal modelling language (e.g. UML + OCL)
- integrating the application schema with other standardised schemas (spatial schema, temporal schema, etc)
These steps are not consecutive, but provide a framework, upon which we can examine the status of the application domain of interest: Australian geoscience.
Development and maintenance of a feature-type catalogue will involve various levels of consultation and consensus (ISO 19135).
However, in order to be acceptable and useful within the community, it must capture established usage at an appropriate level of detail.
Exploration geoscience is a relatively mature application domain, with a history of information sharing driven by an important mining industry, an active statutory sector, and a vigorous software-development sector.
Thus many data models are already available.
Primarily these are comprised of database schemas and export file formats from application software.
A particularly significant Australian initiative was AMIRA P431, which developed a consistent model for exploration data.
Another important contemporary initiative is the North American Data Model (NADM) developed by USGS, GSC and the state and provincial surveys for mapping data and interpretations.
A significant fraction of the information for step 1 is therefore already available, at least implicitly, through reviewing the legacy models and formats.
However, given the abundance of information available, determining the (current) model scope is necessary in order to allow focus on the relevant legacy material.
Given that many of the existing models were developed for limited or specialised purposes, it is important to do regular scope checks, and to take input more broadly than merely reviewing legacy models and formats.
Conceptual model development
In general, the modelling methodology used in geoscience has been inconsistent.
Similar information (e.g. location) is handled differently across tables, even from the same organisation.
In AMIRA P431 the meta-model is provided only implicitly, by its use of E-R notation and a particular CASE tool ("System Architect").
A notable exception was the sample- and site-oriented databases held at GA, which were unified under the "sites" model in the 1990s, though this is now being superseded.
NADM is primarily based around a meta-model for "concept", and a firm distinction between observations
(evidence) and interpretation.
But overall, currently available models need considerable adaptation in order to use the GFM.
A critical part of step 2 is to identify distinct feature-types of interest.
At this point questions of feature-type granularity become important: when to split categories into more specialised types.
Splitting results in simpler and cleaner data instances, fewer optional components in models, lighter-weight processing modules, and more specialisation within the associated authority tables constraining property value-spaces.
However, a profusion of types is difficult to use without an effective index and accessibility mechanism, especially if there are extended derivation and inheritance chains.
More types also impose a model maintenance burden, and clients carry a cost in having to implement more modules.
A model with fewer, more generalised features, may be easier to maintain, and more elegant.
But generalised features also require more flexibility in their model, with many properties "optional" because they are used by only a subset of the type.
Generalisation also requires users to be willing to lump features, or apply abstractions that may not be immediately obvious (e.g. a well-log is merely a 1-D example of a "coverage", most usually encountered as the model for 2-D gridded data and imagery).
Furthermore, in order to capture the complete semantics, it is likely that a "soft-typing" parameter will be required, to indicate the relevant sub-type when a more generalised feature-type is used.
Modelling has to offset the need for specialisation and precision against simplicity, which in turn determines the balance of the processing burden between data provider and data consumer.
Thus, the issue of granularity is strongly related to strong- vs. weak-typing approaches, discussed at some length in StrongWeakTyping.
Formalising the conceptual model
Use of a conceptual schema language
The technology used in both modelling and implementation has an influence on the likelihood of success with each approach.
The ISO 19100 standards formally require that a "model driven architecture" approach is used.
This involves development of a complete information model, using a suitable conceptual schema language ("CSL", usually UML).
Since UML is a graphical modelling and analysis notation, it must then be converted to the desired implementation model - for example, a database schema for persistence, and XML for transfer.
The MDA theory is that this should be generated automatically as far as possible, by application of a set of conversion rules.
The strength of this approach is that multiple implementations can be generated from the same conceptual model, with assurance that they are fully equivalent.
Alternative modelling platforms
This is strictly possible only if each implementation platform has the capability of fully implementing all of the capability of the CSL.
In practice each implementation platform has different strengths and quirks, and indeed so does UML.
So each translation inevitably distorts or dilutes the original model, which means that round-tripping involving anything more than a subset of the capabilities of each language is almost certainly imperfect.
Furthermore, simple application of mechanical translation rules also means that potentially useful or efficient capabilities of the implementation platform are ignored if they do not map to a capability of the CSL.
For example, XML is a static data notation, so UML class operations must be either ignored, or represented by elements indistinguishable from those representing attributes and associations.
When converting from XML to UML there is no way to detect these.
XML Schema has richer rules than UML concerning element cardinality and choice models within content.
Element order is significant in XML documents, so relationships between information items may be indicated by proximity instead of nesting, in a way that is meaningless in UML.
Perhaps the biggest strength built in to XML is hyperlinks, which support associations with remote information items compactly and directly.
Breaking the "closed-world" assumption of conventional information systems is perhaps the biggest innovation of web-based information architecture.
The same capability can be modelled in UML, but is not native so does not emerge naturally from modelling exercises based in UML.
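A remote association is expressed with an XLink attribute on the property element; the property name, namespaces and target URL below are hypothetical:

```xml
<!-- A property given by-reference: the value is a remote feature,
     identified by URL, rather than an inline description -->
<ex:observedAt xmlns:ex="http://example.org/ft"
               xmlns:xlink="http://www.w3.org/1999/xlink"
               xlink:href="http://example.org/wfs?request=GetFeature&amp;featureid=bh.123"/>
```

This breaks the closed-world assumption: the referenced feature may live in a different service, maintained by a different custodian.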
There is no question that UML is the most powerful tool available for describing and analysing object models, of which application schemas based on the GFM are a special case.
So, for step 3, we have found that there is considerable merit in using other notations, such as the W3C XML Schema Language, for modelling in parallel.
Useful "idioms" emerge that can then be reflected back into a UML implementation.
If WXS is preferred for modelling, then UML can serve a very useful function of documentation, since it is a standard graphical notation.
The GML data model (meta-model) is highly regular (see https://www.seegrid.csiro.au/twiki/bin/view/Xmml/GmlFeature
), so provided that the GML Rules for application schema are adhered to, or provided the profile of UML described in ISO 19103 is used, then generation of GML schemas and instances from UML and vice versa is straightforward.
The rules for converting both ways are given formally in an annex to the GML 3.1 recommendation.
Note that NADM does not
follow ISO 19103, though this may be resolved through a collaboration currently underway under the auspices of IUGS.
When formalising the model, the scope issue surfaces particularly in terms of whether the intention is to develop a comprehensive model, or whether incremental, prioritised development, only leading to a limited set of feature types nominated by key stakeholders is acceptable.
The latter approach reduces the risk associated with "big-bang" development, and is more scalable.
But subsequent development of additional feature types inevitably leads to a re-examination of the existing "completed" components, which is likely to require that a versioning mechanism be introduced.
Use of a modular platform, such as XML with namespaces, and UML with packages, provides good support for incremental development.
Integration with standard components
In the context of Open GIS Consortium Web Services (OWS), step 4 is realised by the development of a GML Application Schema.
This describes an XML language for the feature types in the application domain of interest.
A GML application language has the following key characteristics:
- a pattern of element names and nesting that directly instantiates the GFM model for each feature type
- standard capabilities (spatial, temporal, coordinate reference systems, others) are implemented through importing components provided in the core GML schemas
- the normative version is expressed using WXS.
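These characteristics can be illustrated in a skeleton application schema; the target namespace and feature-type definitions are hypothetical, while the import and the gml:_Feature / gml:AbstractFeatureType pattern follow the GML 3.1 core schemas:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:gml="http://www.opengis.net/gml"
           xmlns:ex="http://example.org/ft"
           targetNamespace="http://example.org/ft"
           elementFormDefault="qualified">
  <!-- standard spatial/temporal/CRS capabilities come from the core GML schemas -->
  <xs:import namespace="http://www.opengis.net/gml"
             schemaLocation="http://schemas.opengis.net/gml/3.1.1/base/gml.xsd"/>

  <!-- a domain feature type, substitutable for the abstract GML feature -->
  <xs:element name="Borehole" type="ex:BoreholeType" substitutionGroup="gml:_Feature"/>
  <xs:complexType name="BoreholeType">
    <xs:complexContent>
      <xs:extension base="gml:AbstractFeatureType">
        <xs:sequence>
          <xs:element name="collarLocation" type="gml:PointPropertyType"/>
          <xs:element name="totalDepth" type="gml:MeasureType" minOccurs="0"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```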
Development of a conformant GML application language is the subject of a major clause in the GML Recommendation paper.
A paper on Developing and Managing GML Application Schemas, compiled by Galdos Inc, is available from GeoConnections Canada.
XMML and related languages
The eXploration and Mining Markup Language
(XMML) project has been developing a GML Application schema for exploration geoscience in collaboration with several of the sponsors of this project.
For a summary of the project goals and current capabilities provided by XMML, see https://www.seegrid.csiro.au/twiki/bin/view/Xmml/ProjectSummary
The Feature Types developed for XMML v 1.0 are primarily artefacts of exploration activities (boreholes, observations, procedures, etc.).
Other relevant feature types are under development in complementary projects as follows:
- Geology features are the subject of projects underway in some of the Australian surveys, in British Geological Survey, and most notably in the long-running North American Data Model from USGS/GSC, which focusses on types found on geological maps. Harmonisation of NADM with the OGC/ISO Feature model and GML/XMML encodings is underway.
- Geochemistry/Assay data - through the ADX project
- Plate tectonics descriptions - through the GPlates/GPML project
In most cases these will produce Feature Type Catalogues (FTCs) in other namespaces.
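An instance document conforming to such a schema describes individual features. The following fragment is indicative only: the element names and the namespace URI are placeholders, not the normative XMML vocabulary.

```xml
<xmml:Borehole gml:id="bh-001"
    xmlns:xmml="http://www.example.org/xmml"
    xmlns:gml="http://www.opengis.net/gml">
  <gml:name>DDH-001</gml:name>
  <!-- Spatial properties reuse geometry components from core GML -->
  <xmml:collarLocation>
    <gml:Point srsName="urn:x-ogc:def:crs:EPSG:4326">
      <gml:pos>-30.95 121.45</gml:pos>
    </gml:Point>
  </xmml:collarLocation>
  <xmml:totalLength uom="m">350.0</xmml:totalLength>
</xmml:Borehole>
```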
Governance of the FTC: how do I get my feature type included?
We need to establish syntactical and governance mechanisms so that interested members of the community can choose, per feature type, as appropriate:
- to reuse a feature type already defined and registered in the community FTC
- to extend a feature type, whether abstract or a similar concrete type of object (e.g. "Hazard")
- to create a new feature type, and document it well enough to allow reuse by others (including conforming to the community baseline standards and governance models)
- to register a feature type as "foreign", with a mapping to a feature type from the community FTC for information purposes
- to create private feature types not intended for re-use by others.
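In WXS terms, the extension option can be expressed by deriving a new complex type in a separate namespace from a registered community type. The fragment below is a hypothetical sketch: the comm: and your: namespaces, and the Hazard and MineShaft types, are invented for illustration; it assumes a schema document that imports the community schema.

```xml
<!-- In the your: namespace, building on a registered community feature type -->
<element name="MineShaft" type="your:MineShaftType"
         substitutionGroup="comm:Hazard"/>
<complexType name="MineShaftType">
  <complexContent>
    <extension base="comm:HazardType">
      <sequence>
        <!-- Properties added beyond those inherited from the community type -->
        <element name="depth" type="gml:MeasureType"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>
```

Because the new element is in the substitution group of the community element, instances of MineShaft remain valid wherever a comm:Hazard is expected.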
Service interface to Features: who translates the data?
The OGC's Web Feature Service (WFS) is the canonical interface through which a data provider publishes descriptions of feature instances.
However, the WFS specification is neutral regarding the design of the GML language exposed by a WFS service: a service using a GML-ised private model is conformant, and there is no requirement to use a model defined by a community.
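For example, a client retrieves feature instances through a GetFeature request such as the following; the feature type name is a placeholder, and the schema of the GML returned is whatever language the service chooses to expose.

```xml
<wfs:GetFeature service="WFS" version="1.1.0"
    xmlns:wfs="http://www.opengis.net/wfs"
    xmlns:xmml="http://www.example.org/xmml">
  <!-- Request all instances of one advertised feature type -->
  <wfs:Query typeName="xmml:Borehole"/>
</wfs:GetFeature>
```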
Thus perhaps the biggest information-viewpoint issue is whether services publish their offerings merely using a version of their corporate data model, or alternatively map their data to a community schema at the interface.
The former is easier for the service provider to implement, and is effectively what most of the COTS WFS systems support now.
But it pushes out to the client the processing burden to reconcile data sourced from multiple services and expressed in different models.
Note that the community approach is at least implied, and arguably quite explicit, in ISO 19109 and ISO 19110. These standards do not concern the WFS interface directly, however, and there are plenty of vendors who resist the notion that a useful WFS (as opposed to a merely conformant WFS) includes a model/schema mapping layer.
This means that a degree of coherence between different service interfaces can be established using externally defined feature types.
Relationship between Feature Types and Service Interfaces
However, most existing WFS software takes the "lazy" approach, in which the GML is generated by a direct mapping from the table structure of the source, so the feature-type definitions are tied directly to the storage model.
But if you want to compare information coming from different sources, then it has to be made commensurate somehow.
This can be done either by the server or by the client. We suggest that, since the organisation hosting the server understands the data best, interoperability is best accomplished by the server accepting responsibility for mapping to a community data model.
This supports the deployment of lighter-weight clients that can be pre-configured to the model.
But it requires some governance process for the "community".
Using software designed to support the "lazy" approach, a service provider wishing to publish using the community model must convert the storage accessed by the WFS to a schema corresponding to the public model.
For most organisations this will result in replication of their data: once in their private model that serves most of their business purposes (and probably also includes private fields), and once in a cache to support the community WFS view.
Synchronisation of the two data stores then becomes an issue, particularly if the WFS is transactional - i.e. an upload as well as a publishing service.
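A transactional WFS accepts uploads through Transaction requests. In a sketch like the following (element names and namespace again placeholders), an insert expressed in the community model must be mapped back into the private store, which is where the synchronisation burden arises.

```xml
<wfs:Transaction service="WFS" version="1.1.0"
    xmlns:wfs="http://www.opengis.net/wfs"
    xmlns:xmml="http://www.example.org/xmml"
    xmlns:gml="http://www.opengis.net/gml">
  <wfs:Insert>
    <!-- New feature supplied by the client in the community model -->
    <xmml:Borehole gml:id="bh-002">
      <xmml:collarLocation>
        <gml:Point srsName="urn:x-ogc:def:crs:EPSG:4326">
          <gml:pos>-30.96 121.47</gml:pos>
        </gml:Point>
      </xmml:collarLocation>
      <xmml:totalLength uom="m">120.0</xmml:totalLength>
    </xmml:Borehole>
  </wfs:Insert>
</wfs:Transaction>
```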
For more discussion of mappings between the feature model/GML and conventional tables schemas, see https://www.seegrid.csiro.au/twiki/bin/view/Xmml/InformationModels