Document Refinement

The Big Question

From what we've learned in this project so far, could Geoserver or any commercial WFS and the WFS spec itself support production work of geologists in Australia?

Answer - a qualified Yes.

What this paper needs to do:
  • explain the qualifications to that Yes
  • present ideas on how we remove those qualifications & the next step

from a conversation between RobertWoodcock and AndyDent - 20 Dec 2004

Refining the Doc

StuartGirvan has produced the initial draft of the attached document.

I'd like to thrash out some feedback here rather than in the document because of my relative ignorance (which probably makes me a very good reviewer) and to avoid cluttering it with the discussion as a change history.

Would it be fair to say that a goal of the document is to make it possible for any reasonably competent person to take the document and use it as a specification to get the relevant software configured into a working system? I think SimonCox has several contacts interested from this angle.

From my point of view this isn't a user guide for the advanced bits of Geoserver. That should be in a separate place and should be rolled into the Geoserver documentation when the updated code is rolled into Geoserver. Until then it can sit somewhere on SEEGRID, but it will require a whole extra exercise to get those bits together. My primary goal with this document was to highlight what we learned about the software, the technical interface and the architecture, what we achieved, what we'd still like to be able to do, and the big red flags for anyone else who was going to try to do a similar thing. -- StuartGirvan - 20 Dec 2004

If so, what else do we need to include, can we for example include a copy of a complete set of configuration files?

-- AndyDent - 17 Dec 2004

I also discussed this with RobertWoodcock and would like to make sure the Twiki hosts copies of the relevant site-specific configuration files, to serve as an example so that anyone reasonably savvy can pick up your narrative and work from there. Agreed this is not a Geoserver advanced guide! -- AndyDent - 20 Dec 2004

We certainly could do with an exemplar page for other users. I'm not volunteering for this one, however, as I've been spending way too much time on this project recently. -- StuartGirvan - 20 Dec 2004

In the intro paragraphs to the Specifications and Technology section, need to specifically note that the project has exposed/illustrated issues with the WFS specification as well as with the Geoserver implementation. -- SimonCox - 22 Dec 2004

Specific Issues to Clarify

Oracle Hacks

Oracle Hacks...SRID of 8311 hard coded into Oracle spatial query

Is this a hack into an Oracle product itself, into a mapping layer in Geoserver or elsewhere? -- AndyDent - 17 Dec 2004

This was a hack in the Geotools code - will add to document. -- StuartGirvan - 20 Dec 2004
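
To make the concern concrete, here is a minimal sketch of the alternative to a hard-coded SRID: the SRID is passed in from per-feature-type configuration and bound as a parameter. This is not the Geotools code (which is Java); the function name, the BBox type and the table and column names are hypothetical, and the Oracle Spatial-style SQL is only assembled as text here, not run against a database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Bounding box in the coordinate system identified by the SRID."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def bbox_filter_sql(table, geom_column, bbox, srid):
    """Return (sql, binds) for a bounding-box query.

    The SRID is a bind value supplied by configuration, not a literal
    baked into the query text. `table` and `geom_column` are assumed to
    come from trusted configuration, never from user input.
    """
    sql = (
        f"SELECT * FROM {table} t "
        f"WHERE SDO_FILTER(t.{geom_column}, "
        "SDO_GEOMETRY(2003, :srid, NULL, "
        "SDO_ELEM_INFO_ARRAY(1, 1003, 3), "
        "SDO_ORDINATE_ARRAY(:min_x, :min_y, :max_x, :max_y)), "
        "'querytype=WINDOW') = 'TRUE'"
    )
    binds = {
        "srid": srid,
        "min_x": bbox.min_x,
        "min_y": bbox.min_y,
        "max_x": bbox.max_x,
        "max_y": bbox.max_y,
    }
    return sql, binds


# Hypothetical feature type configuration: the SRID lives with the data source.
sql, binds = bbox_filter_sql(
    "BOREHOLES", "GEOM", BBox(110.0, -45.0, 155.0, -10.0), srid=8311
)
```

With this shape, serving data held in a different coordinate system becomes a configuration change rather than a code change.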

Denormalisation

Currently available software tools implementing the WFS require that the underlying database structures be denormalised into a flattened structure before they can be accessed. This defeats the idea that web service interfaces can be used to exchange data without having to change corporate systems that have been developed to meet existing business requirements.

I don't understand the point being made here.

An interchange layer may be required to map a corporate system to a WFS but that's not implying changing the corporate system.

Maybe I'm not sufficiently savvy in what you can do with database views but I was under the impression that a denormalisation exercise like this would normally be done via a view to a flat virtual table, so you then have your one-one mapping with the XML interchange?

-- AndyDent - 17 Dec 2004
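
A minimal sketch of the view-based approach described above, using SQLite from the Python standard library as a stand-in for the corporate database. The tables, columns and view name are invented for illustration. The normalised tables are left untouched; the view presents the flat, one-row-per-feature shape that a single-table WFS mapping expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalised "corporate" tables (hypothetical schema).
CREATE TABLE site (site_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    site_id   INTEGER REFERENCES site(site_id),
    element   TEXT,
    ppm       REAL
);
INSERT INTO site VALUES (1, 'Site A'), (2, 'Site B');
INSERT INTO sample VALUES
    (10, 1, 'Au', 0.4), (11, 1, 'Cu', 120.0), (12, 2, 'Au', 1.1);

-- The added view: a flat virtual table, one row per feature.
CREATE VIEW sample_feature AS
SELECT s.sample_id, t.name AS site_name, s.element, s.ppm
FROM sample s JOIN site t ON t.site_id = s.site_id;
""")


def flat_rows(connection):
    """Read the view as if it were a single table, one dict per feature."""
    cur = connection.execute("SELECT * FROM sample_feature ORDER BY sample_id")
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur]


rows = flat_rows(conn)
```

Each dictionary in rows then maps one-to-one onto the properties of a single feature in the XML interchange.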

Yep, you could always use views, but there are drawbacks. If you do use views then it's another object you have to maintain and manage in the database, which is a change to the corporate system. If you have a number of different feature types you want to support then you'd have to set up a number of views. It may be semantics, but allowing flexibility in Geoserver means less maintenance and mucking about at the back end. That's not to say you won't ever need to change the back end, just that the less dependent you are on having to change it the better. Andy, do you think this is still splitting hairs, or can you think of more appropriate words? -- StuartGirvan - 20 Dec 2004

Hmmm.

I have several competing feelings about this:
  1. I'm extremely uneasy about doing too much joining and filtering in application layers (ie: Geoserver) rather than in a honking-great big database engine because it destroys the opportunity for smart query processing and caching - if you're gonna pay for an Oracle server then it should give you the money's worth and I don't see the Geoserver people replacing it.
  2. Anything specific to a given database backend is not terribly useful to the community.
  3. We will always have tradeoffs between robust performance and flexibility. I don't believe in single solutions.
  4. It feels like this issue is closely related to the issue of returning too much data which SimonCox and I had a long discussion about this morning, leaving me less happy about the maturity of WFS (if possible) but more informed.

Sort-of conclusion: flexibility in Geoserver is a good thing but we should also be able to migrate some cases into smarter database use, possibly using views or other techniques (I am not an Oracle guru) to combine data.

Observations:
  1. A View added for a specific purpose is not changing any part of the corporate system on which other applications depend - it's an addition rather than change (splitting hairs fairly finely).
  2. If you're adding an external layer to perform complex mapping vs adding a view and simple mapping layer, there's still some degree of work.
  3. Database views provide a nice testing seam (see Working Effectively with Legacy Code) at which you can test the data being returned, rather than hitting your WFS.

-- AndyDent - 20 Dec 2004

Andy, all useful feedback. I'll add in something like the conclusion you've got. I should point out that in the document I was coming from the point of view that Geoserver (and I suspect other WFS software) only allows you to use a single table (or potentially view) at a time. So you're somewhat hamstrung in what you could do on the database side anyway. From an Oracle POV views are often not particularly brilliant performance wise. The most efficient solution in terms of speed would be materialised views (but that's Oracle specific) or simply in new tables to hold the data, but again you start getting into issues of database maintenance. -- StuartGirvan - 20 Dec 2004
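
The maintenance trade-off with holding the flattened data in a new table can be sketched as follows, again with SQLite standing in for the real database and with invented table names. SQLite has no materialised views, so the snapshot table here is only an analogy for one: it is quick to read but goes stale until someone refreshes it.

```python
import sqlite3


def refresh_snapshot(conn):
    """Rebuild the flat table from the normalised tables (manual 'materialisation')."""
    with conn:
        conn.execute("DROP TABLE IF EXISTS sample_flat")
        conn.execute(
            "CREATE TABLE sample_flat AS "
            "SELECT s.sample_id, t.name AS site_name, s.ppm "
            "FROM sample s JOIN site t ON t.site_id = s.site_id"
        )


def snapshot_count(conn):
    """Number of rows currently held in the flat snapshot table."""
    return conn.execute("SELECT COUNT(*) FROM sample_flat").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE site (site_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sample (sample_id INTEGER PRIMARY KEY, site_id INTEGER, ppm REAL);
INSERT INTO site VALUES (1, 'Site A');
INSERT INTO sample VALUES (10, 1, 0.4);
""")

refresh_snapshot(conn)

# A new reading arrives in the corporate table...
conn.execute("INSERT INTO sample VALUES (11, 1, 2.5)")
stale_count = snapshot_count(conn)   # snapshot does not see it yet

# ...and only appears once the snapshot is rebuilt.
refresh_snapshot(conn)
fresh_count = snapshot_count(conn)
```

The refresh step is the extra piece of database maintenance: something has to own it, schedule it, and document it.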

A few additional perspectives here, that are particularly relevant to the nature of data access services:

  1. The geotools codebase doesn't yet know how to handle views (though we could have fixed this).
  2. The project's implementation would allow us to exploit views with a trivial mapping anyway.
  3. There is a real need to be able to restrict possible queries to create robust implementations.
  4. Backends should be able to optimise SQL prepared statements in the same way they would any other statement, including one in a view.
  5. The longer-term view would definitely support sequences of statements, not just a single one, so a view would soon become limiting. Denormalisation is only part of the process - we want to be able to handle highly normalised solutions.
  6. Not all databases support views. It's possible to communicate via JDBC/ODBC bridges to flat files, XLS documents and other weird stuff.

From my own experience of analysing large amounts of monitoring data in SQL (10 years worth of water quality readings from ten different agencies, tested against a range of criteria including sampling frequency, missing gaps in time series, percentile compliance against sliding thresholds ) I found it necessary to carefully construct specific views that were simple enough I could predict/test how they would be optimised, and judiciously use intermediate results tables where an optimisation was too weak.

It's quite possible that an organisation might want to test out "views" as SQL statements before committing them to the corporate repository, since a view does impact on the system: it increases the number of artefacts that must be documented, requires change control processes since multiple applications may be using it, and increases the complexity of the discovery process (which of these 3000 views is the one I want?).

-- RobAtkinson - 21 Dec 2004

Suggested Additions

Diagrams

Some architecture diagrams explaining how everything fits together, eg:

  • Overall architecture:
    MCAOverallStructure.png

Good idea. Will add in. -- StuartGirvan - 20 Dec 2004

Fragile Areas

I'm not sure I understand the entire architecture but it seems we have a Java-based XSLT transform occurring in the WFS which is vulnerable to query result size?

||| WFS uses a SAX-based stream to map the JDBC result set into the final XML - this shouldn't be too fragile. Currently the client is being a bit dumb - it's using a DOM-based XSLT transform - but this shouldn't be an issue since it will handle any amount of data meaningful to the end user, which is constrained by at least three factors:
  1. how long they are prepared to wait
  2. how much data can meaningfully be displayed in a map
  3. to what extent can individual records meaningfully be discriminated

So, there should be no inherent engineering limitations that matter - other than perhaps query speed and network speed if the information architecture is not suited to the application workflow.

Reports, service chaining and data ordering processes all need to be asynchronous and volume-safe.

-- RobAtkinson - 21 Dec 2004 |||
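
The difference between the two styles can be sketched with the Python standard library (the project code itself is Java, and the element names and the fake result set below are made up). The writer emits SAX-style events one row at a time, so its memory use does not grow with the size of the result set; the tree-based reader at the end plays the role of the DOM-style client, which holds the whole document in memory.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import XMLGenerator


def stream_rows_as_xml(rows, out, root="results", record="record"):
    """Write each row as XML events as it arrives; never build a full tree."""
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement(root, {})
    count = 0
    for row in rows:
        gen.startElement(record, {})
        for name, value in row.items():
            gen.startElement(name, {})
            gen.characters(str(value))
            gen.endElement(name)
        gen.endElement(record)
        count += 1
    gen.endElement(root)
    gen.endDocument()
    return count


def fake_result_set(n):
    """Stand-in for a JDBC result set: yields rows lazily, one at a time."""
    for i in range(n):
        yield {"id": i, "name": f"sample-{i}"}


buffer = io.StringIO()
written = stream_rows_as_xml(fake_result_set(1000), buffer)

# DOM-style client: parses the entire response into an in-memory tree.
tree = ET.fromstring(buffer.getvalue().encode("utf-8"))
records_seen = len(tree.findall("record"))
```

In a real service the output stream would be the HTTP response rather than an in-memory buffer, which is what keeps the server side volume-safe.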

Client Restrictions

Currently, using SVG to render map layers with some interactivity restricts clients to the MS Windows version of Internet Explorer with the Adobe SVG plugin.

Will add in. -- StuartGirvan - 20 Dec 2004

Roadmap for Rolling Changes back into Community Tools

Geoserver Tools

Very difficult for me to comment on this and I'd be surprised (but delighted) if SCO have a plan to do this. Rob? -- StuartGirvan - 20 Dec 2004

General XMML Schema Issues

Choice of Patterns

I didn't think there was a clear statement of the issue that alternative patterns of nesting are equally valid but require different XSLT processing, as discussed on XsltProcessingGml.

That applies to all clients who want to use the data. I did mention this in the XMML section, but if you've got more/better words put them here and I'll be happy to incorporate them. I've already tried to do a little bit to draw this out. -- StuartGirvan - 20 Dec 2004

Upper vs Lower Case first letter of Elements

There's a well-defined pattern for the initial letter of an XML element being upper or lower case, based on the role of the element in the schema as a Feature or Property.

This is generally quite obvious when you see enough context of the schema in a UML diagram.

Getting the case right has proven to be a source of considerable confusion when people are working at the database mapping level and there's not necessarily a clear pattern that can be used other than knowing your schemas in fine detail or stepping back and looking at the overall diagram.

The most useful debugging technique for catching these was having XSLT stylesheets written to the independent inline examples in the XMML repository and then applying those stylesheets to the WFS output. This technique is unlikely to scale well and a more automated checker is a desirable goal.

Added a slightly modified version of this into the document. -- StuartGirvan - 20 Dec 2004

( AndyDent thinks the minor semantic clues given by the different case aren't worth the cost of the confusion! )

XML is case sensitive, and different DBs have different ideas, so this issue is the same as any other issue - the WFS as a bridge between the database and the XML must support the mappings, and it must be configured correctly.
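
As a rough illustration of the more automated checker wished for above, the sketch below walks a WFS response and flags elements whose initial letter does not fit the alternating pattern. It rests on a simplifying assumption that is not taken from the schemas themselves: that nesting strictly alternates Feature (upper case) and Property (lower case) starting from an upper-case root. The element names in the sample are invented.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def case_violations(xml_text):
    """Return paths of elements whose initial case breaks the alternation.

    Assumption: even depths (root = 0) hold Feature-like elements
    (UpperCamelCase) and odd depths hold properties (lowerCamelCase).
    """
    violations = []

    def walk(element, depth, path):
        name = local_name(element.tag)
        here = f"{path}/{name}"
        expect_upper = depth % 2 == 0
        if name[0].isupper() != expect_upper:
            violations.append(here)
        for child in element:
            walk(child, depth + 1, here)

    walk(ET.fromstring(xml_text), 0, "")
    return violations


# Hypothetical WFS output with one wrongly cased property.
sample = """
<FeatureCollection>
  <featureMember>
    <Borehole>
      <collarLocation>somewhere</collarLocation>
      <DrillingMethod>diamond</DrillingMethod>
    </Borehole>
  </featureMember>
</FeatureCollection>
"""

problems = case_violations(sample)
```

A check like this only catches breaks in the pattern; it cannot tell whether a correctly cased name is actually the one the schema defines, so it complements rather than replaces validation against the schema.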

Topic attachments

  • MCAOverallStructure.dot (0.2 K, 17 Dec 2004 - 16:10, AndyDent): DOT source file for GraphViz to generate diagram
  • MCAOverallStructure.png (16.1 K, 17 Dec 2004 - 16:12, AndyDent): Overall architecture
  • TechnicalLearningfromtheMCA.doc (181.5 K, 04 Jan 2005 - 05:53, TerryHannant): Additions by Terry Hannant

Topic revision: r13 - 15 Oct 2010, UnknownUser