
SEEgrid Roadmap - Enterprise Viewpoint



Review of Business Drivers

This section reviews some of the core business drivers that impact on the architecture of a SEEGrid.

Economic factors

GRID computing and SDI share a common rationale - more opportunity and less cost. The key to this is the "commodification of access arrangements" - i.e. the establishment of a regime (aka a DataGrid) where access arrangements to resources (data, computing, network) are agreed beforehand, so that workflows can be established by users (or programmers) without the typical overhead of specialist skills and mandates such as accountants, business development, lawyers, policy development, etc.

A comparison of the "*Total Cost of Business Outcome*" [I made that up but we do need a good buzzword to sheet the concept home] between the SEEGrid concept and typical current practices can be made by fully accounting for all the activities required to deliver an information product, with and without a DataGrid. TCBO estimate example to be developed by TimMackey + PaulTreloar ??

The purpose of including this comparison within the SEEGrid Roadmap is to highlight the nature of real world interoperability considerations and ensure that all aspects are addressed within the scope of the governance framework and architecture.

External Influences

In addition to technology and standards bodies, and the direct influences of relevant policy frameworks, the SEEGrid framework must provide for effective collaboration with other significant drivers within its subject domain and technical environment.

GEON (US) - "A cyberinfrastructure for the geosciences"

The GEON project addresses many of the same issues within a broadly equivalent sector within the US. Much can be gained from closely monitoring its organisational and technical evolution. The approach of the GEON project may not, however, be directly transferable to the SEEGrid context, because the institutional environment and the level of resources available are different. The table below compares the GEON approach with SEEGrid requirements. Of particular interest is the effort to establish common ontologies within the GEON community.

| GEON Approach | Commonalities with SEEGrid | Differences with SEEGrid |
| Development of common ontologies through stakeholder | Deployment of shared ontologies as an interoperability principle; GEON common ontologies as useful resources | Resources required out of scope; SEEGrid domain is broader; Stakeholders will be engaged over time |
| Single vendor GIS technology base (ESRI as industry sponsor) | GIS is a relevant technology; ESRI products in use by many stakeholders | Traditional GIS not most important technology focus; No resources or mandate to enforce a single vendor deployment |
| NSF funding | Analogous to the AEON community (see below) seen as SEEGrid stakeholders | GEON is research oriented (NSF), but with funds to develop infrastructure |

NERC Data Grid (UK)

The NERC Data Grid (NDG) is a UK e-science project to establish a coherent access capability across the data repositories managed by NERC. This activity is particularly significant because:

  • It has a strong focus on custodial arrangements for data
  • It provides an exemplar for the bridge between government and research communities
  • It has undertaken significant research into the practical issues of managing stakeholder access to distributed data sources
  • It has identified that information community semantics and interoperability via Web Services is the next key development, and that a collaboration with SEEGrid is a fruitful way to undertake this exercise

This appears to be nicely complementary to the current project.

AEON - Australian Earth and Oceans Network

AEON is an emerging network of researchers in the "Earth and Ocean" domain in Australia, in the process of applying for funding to ARC for establishment as an ARC Network. The AEON community is comparable with GEON, but it will be resourced only to provide a coordination function. AEON will look to independently funded external activities (such as SEEGrid) for the establishment of the data access and computational infrastructures.

Close collaboration exists between the AEON and SEEGrid teams. The two activities are naturally complementary, potentially providing technical interoperability brokering and communication channel functions between two large, diverse stakeholder communities.

ASDI - Australian Spatial Data Infrastructure

The ASDI is not an operational program at this stage, nor has it yet specified, adopted or identified critical interoperability standards for transferred information and service metadata. Rather, SEEGrid is seen as a driver for the ASDI, which will be created through such initiatives adopting compatible frameworks. At this stage the ASDI has only deployed the Australian Spatial Data Directory - a Z39.50 based catalog of metadata about data sets. This will be useful as a classification vocabulary in the medium term. SEEGrid should provide insight into the future developments and requirements for such catalogs within a service oriented architecture.

There are expected to be a number of key activities that will contribute deployed capabilities or reusable standards to the emerging ASDI, with cross-jurisdictional activities being by far the most important. To date these include:
  • National Land and Water Resource Audit
  • ICSM common data model
  • National Oceans Office Portal
  • ASBIA Interoperability Demonstrator Pilot

SEEGrid will need to resource ongoing liaison with these activities to ensure cost-effective implementation.

Technical Standards

This section explores the fundamentals of why and how to adopt technical standards and which standards to adopt. Technical standards will allow the SEEGrid vision to be decomposed into specific components that can be safely developed, deployed and updated independently by multiple stakeholders.

A brief outline of relevant technical standards is provided in StandardsFramework.

Layered standards

A key concept in this Road Map is that standards are layered. They build on each other, and each standard has a particular role to play. We must also distinguish between the marketing view of standards - which tends to focus on the potential benefit (or buzzword conformance) within a few of these roles - and a robust architecture, which must fully explore how the relevant standards fit together. We assume that certain layers can be taken for granted, while others - primarily the more local or domain-specific elements - need to have specified policies, technical development or deployment strategies applied.

For example, there is a plethora of technical standards underpinning the humble electrical power socket, and a small number of basic patterns for these. Nevertheless, when designing an electrical appliance, it is only necessary to conform to the well-known interface.

When developing web resources it is not usually necessary to know the specifications for TCP/IP, DNS, or even much about HTTP. But at this early stage in the development of the Web Services paradigm, some understanding of lower-level or generic standards is required when designing a geoscience data grid.

For example, WSDL - Web Services Description Language - is a means of describing the syntax of an interface with a web service. Interoperability, however, also requires that the nature of the service and the data to be transferred have a common meaning to both the provider and the client. WSDL alone does not guarantee this: OpenGIS Web Services may be described in WSDL, but so also may undocumented proprietary processing functions.

Multiple interfaces

Furthermore, a single business function may have multiple interfaces, each conforming to a different technology platform. Thus a single map rendering service can support a proprietary API and the OpenGIS Web Map Server interface. Of perhaps more direct relevance is that a single database can support multiple representations (Feature Types) through WFS interfaces, equivalent coverages through WCS interfaces, catalog functions through Z39.50, directory functions through LDAP and an ontology view through OWL (an XML language). All are standards, yet each has specific advantages or functions.
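
The idea of one data holding exposed through several interfaces can be sketched roughly as follows. This is a minimal illustration only: `BoreholeStore`, `FeatureView` and `CatalogView` are hypothetical names invented for the example and do not correspond to any real WFS, WCS or Z39.50 implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Borehole:
    """One record in the (hypothetical) underlying data store."""
    identifier: str
    longitude: float
    latitude: float
    depth_m: float


class BoreholeStore:
    """Single source of data; every interface below reads from it."""

    def __init__(self, records):
        self._records = list(records)

    def all(self):
        return list(self._records)


class FeatureView:
    """Feature-style interface: returns full records as dictionaries."""

    def __init__(self, store):
        self._store = store

    def get_features(self, min_depth_m=0.0):
        return [
            {"id": b.identifier, "geometry": (b.longitude, b.latitude), "depth_m": b.depth_m}
            for b in self._store.all()
            if b.depth_m >= min_depth_m
        ]


class CatalogView:
    """Catalog-style interface: returns summary metadata only."""

    def __init__(self, store):
        self._store = store

    def describe(self):
        records = self._store.all()
        return {
            "record_count": len(records),
            "identifiers": sorted(b.identifier for b in records),
        }


# Illustrative data only.
store = BoreholeStore([
    Borehole("BH-2", 121.5, -30.7, 350.0),
    Borehole("BH-1", 121.4, -30.8, 120.0),
])
features = FeatureView(store).get_features(min_depth_m=200.0)
summary = CatalogView(store).describe()
```

Both views share the same store, so adding a further interface does not require duplicating or re-engineering the data itself.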

Criteria for relevance

A requirement of this Roadmap is to articulate the broad decomposition of the infrastructure into components, where the abstract interfaces define the roles of standards in delivering interoperability. From this perspective it will be possible to identify an initial set of specific standards that meet broad criteria for adoption.

The Engineering Viewpoint within this RoadMap will provide specific guidance as to the relevant standards layers and a baseline set for initial implementation phases.

Stakeholder relationship/system diagram (to be attached)

Deployment strategy

The immediate challenge facing SEEGrid is the establishment of deployed services that can act as a nucleus for future projects to enhance the capabilities available. Of particular importance are the data access services that establish common information models. From this base it becomes reasonable to expect the evolution of visualisation services and business applications (basic SDI issues), and also the capability to generate additional data by promulgating modelling services and disseminating modelled results.

Thus, each project within the emerging SEEGrid infrastructure must provide for a legacy of discoverable information products.

Semantic resources

There will be a diverse set of initial capabilities, information models and business applications. It is not expected that all activities must fit into a common semantic framework, but it is necessary that each deployed component provides for the publication of relevant metadata in a machine readable form - including:

  • data structures
  • vocabularies (or more formal ontologies) used
  • service locations

This will allow the development of either "wrappers" or "crosswalks" between different components as the need arises, without special agreements or expensive re-engineering. The components involved may provide complementary service functions, or may even be from different application domains. Nevertheless, use of existing semantic components (e.g. catalog and data standards, both de facto and de jure) is to be encouraged, and must be promoted as a means to reduce the cost and effort of designing and building new components.
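
As a rough sketch of what machine readable publication might look like, the example below assembles a descriptor covering the three items listed above (data structures, vocabularies, service locations) and checks which vocabularies two components have in common. The descriptor layout, the function names and the `example.org` addresses are all assumptions made for illustration; the Roadmap does not prescribe a format.

```python
import json

REQUIRED_KEYS = ("data_structures", "vocabularies", "service_locations")


def build_descriptor(name, data_structures, vocabularies, service_locations):
    """Assemble a machine-readable description of one deployed component."""
    descriptor = {
        "name": name,
        "data_structures": data_structures,
        "vocabularies": sorted(vocabularies),
        "service_locations": sorted(service_locations),
    }
    # Refuse to publish a descriptor with any of the three items left empty.
    missing = [key for key in REQUIRED_KEYS if not descriptor[key]]
    if missing:
        raise ValueError(f"descriptor for {name!r} is missing: {', '.join(missing)}")
    return descriptor


def shared_vocabularies(first, second):
    """Vocabularies used by both components; an empty list suggests a crosswalk is needed."""
    return sorted(set(first["vocabularies"]) & set(second["vocabularies"]))


# Hypothetical components and placeholder addresses.
assays = build_descriptor(
    "assay-service",
    {"Assay": ["sampleId", "element", "ppm"]},
    ["http://example.org/vocab/elements", "http://example.org/vocab/units"],
    ["http://example.org/assays/wfs"],
)
samples = build_descriptor(
    "sample-service",
    {"Sample": ["sampleId", "lithology"]},
    ["http://example.org/vocab/lithology", "http://example.org/vocab/units"],
    ["http://example.org/samples/wfs"],
)

published = json.dumps(assays, indent=2, sort_keys=True)
common = shared_vocabularies(assays, samples)
```

Because each descriptor is plain data, a third party can compare two components and decide whether a crosswalk is required without contacting either provider.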

Technology deployment

Technology deployment strategy should prioritise the approaches required to deploy persistent data access services against:

  • key existing data sets
  • key existing data management technologies
  • open source options

The crunch

The main dilemma to be addressed is: An organisation with installed technology X is willing to serve its data, but the technology can only support an automatically generated view, published according to schemas that directly reflect the storage schema, and not the community information model. Options available are:

  1. support the registration of such services - thus requiring clients to undertake programming to interoperate
  2. resource the construction and deployment of an appropriate "wrapper", either installed locally around each service, or as a mediator service
  3. harvest the data in its entirety, using the proprietary schema, into a conformant repository (i.e. a "forward cache")
  4. develop supported profiles of community schemas that can be implemented through configuration of internal storage schemas

None of these is ideal.
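
Option 2 (a "wrapper" or mediator) can be sketched as below. This is a minimal illustration under stated assumptions: `NativeService` stands in for technology X, and the column names, community field names and unit conversion are invented for the example rather than taken from any real community schema.

```python
FEET_TO_METRES = 0.3048


class NativeService:
    """Stand-in for technology X: returns rows shaped like its storage schema."""

    def __init__(self, rows):
        self._rows = rows

    def query(self):
        return [dict(row) for row in self._rows]


# Community field -> (storage column, conversion applied to the stored value).
MAPPING = {
    "boreholeId": ("HOLE_ID", str),
    "depthMetres": ("DEPTH_FT", lambda feet: round(feet * FEET_TO_METRES, 2)),
}


class CommunitySchemaWrapper:
    """Presents a native service through the (hypothetical) community model."""

    def __init__(self, service, mapping):
        self._service = service
        self._mapping = mapping

    def query(self):
        # Only mapped fields are exposed; internal columns are dropped.
        return [
            {
                target: convert(row[source])
                for target, (source, convert) in self._mapping.items()
            }
            for row in self._service.query()
        ]


native = NativeService([{"HOLE_ID": 17, "DEPTH_FT": 1000.0, "INTERNAL_FLAG": "x"}])
wrapped = CommunitySchemaWrapper(native, MAPPING).query()
```

The same mapping table could drive a locally installed wrapper or a separate mediator service; the difference lies in where the code is deployed and who maintains it, not in the translation logic.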

Consideration must be given first to the testing of technology options against support for real community schemas, and the resourcing of technology options that are acceptable within the most common or strategic business contexts. This could be accomplished by specifying conformance to the community schema within procurement activities, so that vendors become motivated to establish suitable capabilities.

It is recommended that a register be kept of successful implementations of community schema with available technologies, allowing download of configuration files to assist new data suppliers to quickly meet a common standard. This requirement has minor implications for the system architecture, since such a register must be seen as part of the community schema-management process.
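
A register of this kind could be as simple as a lookup from community schema and technology to a downloadable configuration file. The sketch below is illustrative only: the class, the schema and product names, and the `example.org` addresses are hypothetical.

```python
from collections import defaultdict


class ImplementationRegister:
    """Records which technologies have successfully implemented which community schema."""

    def __init__(self):
        # schema name -> {technology name -> configuration file address}
        self._entries = defaultdict(dict)

    def record(self, schema, technology, config_url):
        self._entries[schema][technology] = config_url

    def technologies_for(self, schema):
        """Technologies with a known working configuration for this schema."""
        return sorted(self._entries.get(schema, {}))

    def config_for(self, schema, technology):
        """Address of the configuration file, or None if nothing is registered."""
        return self._entries.get(schema, {}).get(technology)


register = ImplementationRegister()
register.record("ExampleBoreholeSchema/1.0", "ServerProductA",
                "http://example.org/configs/borehole-product-a.xml")
register.record("ExampleBoreholeSchema/1.0", "ServerProductB",
                "http://example.org/configs/borehole-product-b.xml")

available = register.technologies_for("ExampleBoreholeSchema/1.0")
unknown = register.config_for("ExampleBoreholeSchema/1.0", "ServerProductC")
```

A new data supplier would query the register for its installed technology and, if an entry exists, start from the registered configuration rather than from scratch.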

Governance Framework

Each component within SEEGrid must be available for discovery and use by some set of the SEEGrid stakeholders. This implies a lot more than mere technical connectivity and protocol conformance. The governance framework for establishing common registries (including at least data models, vocabularies, service registries and authorisation profiles) defines the information community ("SEEGrid"), acts as the reference point for common semantics, and provides for service accessibility (access, authorisation and accounting).

A goal of SEEGrid is to evolve the most effective and least-cost governance framework that allows sub-communities to converge on interoperable patterns. The governance model must also be capable of scaling to include variable access privileges and accounting.

The SEEGrid governance framework is intended to be established in the following initial phases:

| Phase | Governance requirements | Responsibilities |
| 1 - Demonstration | Common information model and a sample processing chain; Vocabularies of data services published as network resources by data providers; Registries managed by project team for duration of project | SEEGrid Roadmap project team (Geoscience Australia, PMD*CRC, CSIRO, Social Change Online) |
| 2 - Initial Capability | Registries established under a Service Level Agreement; Registries managed according to ISO 19135 principles; Feature Type Catalogs for all supported domain models; Components must be conformant to a registered domain model OR publish a complete new domain model | TBD - Would be good to be able to flesh this out though |
| 3 - DataGrid | Data repositories established; Virtual Data repository management (ability to store workflow outputs and preconfigured workflows); Access, Authentication, Accounting framework established; Formal interoperability arrangements with international DataGrids | TBD - GRID Computing infrastructure services |

Future phases would be based on specific requirements from emerging business applications, SDI and GRID infrastructures, and academic network capabilities. At some stage the close integration of modelling

Conformance

I made this up - seems like common sense but there may be a better blueprint or we might not want to include it all?

SEEGrid is likely to require a base set of standards to be mandated to ensure that there is a consistent implementation of community semantics across various components. Naturally, however, the SEEGrid framework will be extended to include new data sources, new processing capabilities and new technology opportunities. It is proposed that a set of conformance profiles be established under a versioning scheme so that:

  1. Under Version X, new service interfaces and data models may be added that do not require redefinition of current standards. These may be mandatory (for a given service) if they extend without replacing existing component standards. New interfaces may be added as optional without causing a version migration.
  2. Under Version X+1, some service interfaces and data models from Version X may be deprecated and new ones introduced as mandatory.

Conformance testing facilities for common services are encompassed in the notional architecture. These are a key support mechanism for both the community and the agencies managing SEEGrid infrastructure services.
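
One possible reading of the versioning rules above is sketched below. It is an interpretation for illustration, not an agreed design: the `ConformanceProfile` structure, the two functions and the interface labels ("catalog", "registry") are assumptions, and the rule used is that dropping or deprecating an existing interface forces a new version while additions do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConformanceProfile:
    """Interfaces required, permitted and deprecated under one profile version."""
    version: int
    mandatory: frozenset
    optional: frozenset = frozenset()
    deprecated: frozenset = frozenset()


def requires_new_version(current, proposed):
    """True if the proposed profile drops or deprecates something from the current one."""
    dropped = current.mandatory - proposed.mandatory
    newly_deprecated = proposed.deprecated - current.deprecated
    return bool(dropped or newly_deprecated)


def check_conformance(implemented, profile):
    """Compare the interfaces a service implements against a profile."""
    implemented = set(implemented)
    return {
        "conformant": profile.mandatory <= implemented,
        "missing": sorted(profile.mandatory - implemented),
        "deprecated_in_use": sorted(implemented & profile.deprecated),
    }


# Illustrative profiles only.
v1 = ConformanceProfile(1, frozenset({"WFS", "catalog"}))
v1_extended = ConformanceProfile(1, frozenset({"WFS", "catalog"}),
                                 optional=frozenset({"WCS"}))
v2 = ConformanceProfile(2, frozenset({"WFS", "registry"}),
                        deprecated=frozenset({"catalog"}))

same_version_ok = not requires_new_version(v1, v1_extended)
migration_needed = requires_new_version(v1, v2)
report = check_conformance({"WFS", "catalog"}, v2)
```

A conformance testing facility could publish such a report for each registered service, so that providers can see what a version migration would require before it becomes mandatory.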


Topic revision: r2 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).