
CKAN Spatial Extensions Setup Guide


Overview

This setup guide walks you through installing two CKAN extensions that add geospatial and remote harvesting capabilities to CKAN. We cover the two separate extensions in one guide because they are closely related: in particular, the CSW server (cswserver) plugin of the CKAN spatial extension requires the CKAN remote harvesting extension to be installed and enabled.

Before we install the two extensions (ckanext-spatial and ckanext-harvest), let's briefly explain what they are. In a nutshell, ckanext-spatial is a geospatial extension for CKAN that contains a number of plugins (spatial harvesters, spatial search, a CSW server, etc.) that add geospatial capabilities to CKAN (see the official Geospatial extension for CKAN website for further details). ckanext-harvest, on the other hand, is a harvesting framework for CKAN that provides a command line interface (CLI) and a web user interface (WUI) for managing harvesting sources and jobs (see the official Remote harvesting extension for CKAN website for further details).

Disclaimer: This setup guide is prepared for Geological Surveys in Australia and does not replace the documentation of the respective CKAN extensions. Should you encounter an issue not mentioned in this guide, refer to the official CKAN documentation for help.

Install Geospatial extension for CKAN

Install the Extension and its Required Dependencies

a. Install required dependencies or packages:

$ sudo apt-get install -y python-dev git libxslt1-dev make

b. Activate CKAN's Python Virtual Environment:

$ . /usr/lib/ckan/default/bin/activate

Note: Once the above activate command has been executed, you should see the name of CKAN's Virtual Environment (i.e. default) added to the front of your shell prompt. Make sure this environment name is present; otherwise you will encounter errors (Python modules not found) later in the installation process.
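
If you script parts of this installation, a small guard can confirm that the Virtual Environment is active before pip is run. The following is a minimal sketch that relies on the VIRTUAL_ENV variable exported by virtualenv's activate script. The function name ckan_venv_active is our own (it is not part of CKAN), and the check assumes the virtualenv path used in this guide.

```shell
# Hypothetical helper (not part of CKAN): succeeds only when the active
# virtualenv matches the expected path. Assumes the path used in this guide.
ckan_venv_active() {
    expected="${1:-/usr/lib/ckan/default}"
    [ "${VIRTUAL_ENV:-}" = "$expected" ]
}

if ckan_venv_active; then
    echo "CKAN virtualenv is active"
else
    echo "CKAN virtualenv is not active; run: . /usr/lib/ckan/default/bin/activate"
fi
```

If your virtualenv was created under a different path, pass that path as the first argument.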

c. Install ckanext-spatial extension into your CKAN's Python Virtual Environment (virtualenv):

(default) $ pip install -e git+https://github.com/okfn/ckanext-spatial.git@release-v2.0#egg=ckanext-spatial

Note: As of this writing, it is important that you install release 2.0 of this extension for CKAN 2.0, which we installed previously. We tried the latest extension release (2.1) with CKAN 2.0 and encountered an issue with its spatial harvesters plugin. That issue may be fixed in a future release of ckanext-spatial. However, for the purpose of this setup guide, we will stick to release 2.0.

d. Install Python modules required by the extension.

(default) $ pip install -r /usr/lib/ckan/default/src/ckanext-spatial/pip-requirements.txt

Setting up PostGIS

e. Install the version of PostGIS that matches your PostgreSQL version (run "psql --version" to find your PostgreSQL version):

(default) $ sudo apt-get install -y postgresql-9.1-postgis

or

(default) $ sudo apt-get install -y postgresql-8.4-postgis
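
The choice between the two packages above can be sketched as a small function. This is only an illustration: the function name postgis_package_for is hypothetical, it handles just the two PostgreSQL versions named in this guide, and it assumes the "psql (PostgreSQL) X.Y.Z" output format.

```shell
# Hypothetical helper: map the output of `psql --version` to the PostGIS
# package name. Only the two versions named in this guide are handled.
postgis_package_for() {
    # Pull out major.minor, e.g. "9.1" from "psql (PostgreSQL) 9.1.9".
    ver=$(printf '%s\n' "$1" |
        sed -n 's/.* \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1)
    case "$ver" in
        9.1|8.4) echo "postgresql-$ver-postgis" ;;
        *) echo "no package listed in this guide for: ${ver:-unknown}" >&2
           return 1 ;;
    esac
}

postgis_package_for "psql (PostgreSQL) 9.1.9"   # prints postgresql-9.1-postgis

# Possible use on a real system:
#   sudo apt-get install -y "$(postgis_package_for "$(psql --version)")"
```

For any other PostgreSQL version the function fails rather than guessing a package name.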

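f. Enable the PL/pgSQL language in your ckan_default database (which we created in the CKAN Setup Guide). This is required because many of the PostGIS functions are written in PL/pgSQL. On PostgreSQL 9.0 and later, PL/pgSQL is installed by default, so this command may report that the language is already installed; you can safely continue in that case.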

(default) $ sudo -u postgres createlang plpgsql ckan_default

g. Run the following two commands to create the necessary tables and functions in the database and populate the spatial reference table:

(default) $ sudo -u postgres psql -d ckan_default -f /usr/share/postgresql/9.1/contrib/postgis-1.5/postgis.sql

(default) $ sudo -u postgres psql -d ckan_default -f /usr/share/postgresql/9.1/contrib/postgis-1.5/spatial_ref_sys.sql

Note: Depending on your versions of PostgreSQL and PostGIS, the SQL scripts may be in a different location. The two commands above create two tables, geometry_columns and spatial_ref_sys, and a view called geography_columns.
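
If the scripts are not at the paths shown above, you can search for them. The following is a minimal sketch, assuming the scripts live somewhere under /usr/share/postgresql; the function name find_postgis_scripts is our own.

```shell
# Hypothetical helper: list postgis.sql and spatial_ref_sys.sql under a
# root directory (defaults to /usr/share/postgresql, an assumption).
find_postgis_scripts() {
    root="${1:-/usr/share/postgresql}"
    find "$root" -type f \( -name postgis.sql -o -name spatial_ref_sys.sql \) \
        2>/dev/null | sort
}

# Prints nothing if the directory or the scripts do not exist.
find_postgis_scripts
```

Use the printed paths in place of the ones passed to psql -f in step g.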

h. Check to see if PostGIS was properly installed:

(default) $ sudo -u postgres psql -d ckan_default -c "SELECT postgis_full_version()"

You should see output similar to the following:

                                         postgis_full_version
-------------------------------------------------------------------------------------------------------
 POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009" LIBXML="2.7.8" USE_STATS
(1 row)

i. Change the owner of the two tables and the view above to the CKAN database user (ckan_default) that we created in the CKAN Setup Guide:

(default) $ sudo -u postgres psql -d ckan_default -c "ALTER TABLE spatial_ref_sys OWNER TO ckan_default"
(default) $ sudo -u postgres psql -d ckan_default -c "ALTER TABLE geometry_columns OWNER TO ckan_default"
(default) $ sudo -u postgres psql -d ckan_default -c "ALTER VIEW geography_columns OWNER TO ckan_default"

Installing libxml2

libxml2 version 2.9 is required for the ISO19139 XSD validation. The CKAN installation has probably installed an older version of libxml2, so we need to update it.

j. Locate the libxml2.so file:

(default) $ find /usr -name "libxml2.so"

Note: You should find the libxml2.so file in the /usr/lib/x86_64-linux-gnu/ directory. Take note of that directory, as it will be passed as a parameter to the configure command later.

k. Download the libxml2 source, store it in your user's home directory and extract it:

(default) $ cd ~

(default) $ wget ftp://xmlsoft.org/libxml2/libxml2-2.9.0.tar.gz

(default) $ tar zxvf libxml2-2.9.0.tar.gz

l. Configure the build with the directory of the libxml2.so file that we located earlier:

(default) $ cd libxml2-2.9.0/

(default) $ ./configure --libdir=/usr/lib/x86_64-linux-gnu
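
Steps j and l can be combined so that the directory is derived rather than typed by hand. The following is a minimal sketch; the function name libxml2_libdir is hypothetical, and it simply takes the first libxml2.so that find reports, which may not be the right one if several copies exist.

```shell
# Hypothetical helper: print the directory containing libxml2.so,
# searching under the given root (defaults to /usr as in step j).
libxml2_libdir() {
    root="${1:-/usr}"
    so=$(find "$root" -name 'libxml2.so' 2>/dev/null | head -n 1)
    [ -n "$so" ] || return 1      # fail if no libxml2.so was found
    dirname "$so"
}

# Possible use from inside the libxml2-2.9.0 directory:
#   ./configure --libdir="$(libxml2_libdir)"
```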

m. Build and install it:

(default) $ make

(default) $ sudo make install

n. Check that version 2.9 of libxml2 is installed:

(default) $ xmllint --version
xmllint: using libxml version 20900
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib
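
The version number printed by xmllint encodes 2.9.0 as 20900, so the check in step n can be automated by comparing that number. The following is a minimal sketch; the function name libxml_version_at_least is our own. Note that xmllint writes its version text to standard error, so it has to be captured with 2>&1.

```shell
# Hypothetical helper: succeed if the libxml version found in the text
# printed by `xmllint --version` ($1) is at least the number in $2.
libxml_version_at_least() {
    num=$(printf '%s\n' "$1" |
        sed -n 's/.*using libxml version \([0-9][0-9]*\).*/\1/p' | head -n 1)
    [ -n "$num" ] && [ "$num" -ge "$2" ]
}

if libxml_version_at_least "xmllint: using libxml version 20900" 20900; then
    echo "libxml2 is 2.9.0 or newer"
fi

# Possible use on a real system:
#   libxml_version_at_least "$(xmllint --version 2>&1)" 20900
```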

Configure Geospatial Extension for CKAN

o. Create the database tables required by the Geospatial extension. If the following command runs successfully, you should see "DB tables created" as below:

(default) $ paster --plugin=ckanext-spatial spatial initdb --config=/etc/ckan/default/production.ini

DB tables created

Note: Should you encounter any errors at this point, check the troubleshooting section of the official CKAN Geospatial Extension documentation.

p. Enable the ckanext-spatial plugins by adding each plugin name to the ckan.plugins property in CKAN's ini file. You can find a list of the ckanext-spatial plugin names on the official CKAN Geospatial Extension website. The default CKAN installation comes with three plugins enabled: stats, json_preview and recline_preview. For the purpose of this setup guide, we will leave those plugins as they are, append the ckanext-spatial plugins shown below (spatial_metadata through cswserver) to ckan.plugins in your CKAN ini file (/etc/ckan/default/production.ini), and add the ckan.spatial.srid option as follows:

ckan.plugins = stats json_preview recline_preview spatial_metadata spatial_query csw_harvester doc_harvester waf_harvester wms_preview spatial_harvest_metadata_api cswserver

## ckanext-spatial Settings
ckan.spatial.srid = 4326

Note: Some plugins depend on other plugins. For example, the spatial_query plugin requires the spatial_metadata plugin to be enabled. If you want to use additional ckanext-spatial plugins not listed above, check the official CKAN Geospatial Extension website. Also note that, with the spatial_metadata plugin, the ckan.spatial.srid option in CKAN's ini file defines the projection (an EPSG code such as 4326 or 4258) in which extents are stored in the database. If ckan.spatial.srid is not specified, the default of 4326 is used.
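
The plugin dependency mentioned in the note can be checked against your ini file before restarting CKAN. The following is a minimal sketch, assuming ckan.plugins is written on a single line as in this guide; all three function names are hypothetical and only the spatial_query / spatial_metadata dependency is covered.

```shell
# Hypothetical helpers for inspecting ckan.plugins in a CKAN ini file.

# Print the value of ckan.plugins (first match only).
ckan_plugins() {
    sed -n 's/^[[:space:]]*ckan\.plugins[[:space:]]*=[[:space:]]*//p' "$1" |
        head -n 1
}

# Succeed if plugin $2 is listed in ini file $1.
plugin_enabled() {
    for p in $(ckan_plugins "$1"); do
        if [ "$p" = "$2" ]; then return 0; fi
    done
    return 1
}

# Fail if spatial_query is enabled without spatial_metadata.
check_spatial_query_dependency() {
    if plugin_enabled "$1" spatial_query &&
        ! plugin_enabled "$1" spatial_metadata; then
        echo "spatial_query is enabled but spatial_metadata is not" >&2
        return 1
    fi
    return 0
}

ini=/etc/ckan/default/production.ini
if [ -f "$ini" ]; then
    check_spatial_query_dependency "$ini" && echo "plugin dependencies look fine"
fi
```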

q. Optionally, you can configure the CSW Server (cswserver) plugin with the following options in your CKAN ini file:

## CSW Server (cswserver) Settings

cswservice.title = CKAN 2.0 demo - set cswservice.title in config
cswservice.abstract = Unspecified service description - set cswservice.abstract in config
cswservice.keywords =
cswservice.keyword_type = theme
cswservice.provider_name = Unnamed provider - set cswservice.provider_name in config
cswservice.contact_name = No contact - set cswservice.contact_name in config
cswservice.contact_position =
cswservice.contact_voice =
cswservice.contact_fax =
cswservice.contact_address =
cswservice.contact_city =
cswservice.contact_region =
cswservice.contact_pcode =
cswservice.contact_country =
cswservice.contact_email =
cswservice.contact_hours =
cswservice.contact_instructions =
cswservice.contact_role =
cswservice.rndlog_threshold = 0.01
cswservice.log_xml_length = 1000

r. When the spatial_query plugin is enabled, it is recommended that you increase the limit on the number of datasets searchable with a spatial value by raising maxBooleanClauses in /etc/solr/conf/solrconfig.xml (for a single Solr instance setup). For a multiple Solr core setup, several solrconfig.xml files need to be updated; they can be found several levels below /etc/solr.

<maxBooleanClauses>16384</maxBooleanClauses>
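
To see which solrconfig.xml files exist and what limit each one currently has, you can list them. The following is a minimal sketch, assuming the files live under /etc/solr and that the maxBooleanClauses element sits on a single line; the function name report_max_boolean_clauses is our own.

```shell
# Hypothetical helper: print each solrconfig.xml under a root directory
# together with its current maxBooleanClauses value, if any.
report_max_boolean_clauses() {
    root="${1:-/etc/solr}"
    find "$root" -name solrconfig.xml 2>/dev/null | sort |
    while IFS= read -r f; do
        val=$(sed -n 's/.*<maxBooleanClauses>\([0-9][0-9]*\)<\/maxBooleanClauses>.*/\1/p' "$f" |
            head -n 1)
        echo "$f: ${val:-not set}"
    done
}

# Prints nothing if no solrconfig.xml is found.
report_max_boolean_clauses
```

Any file that reports a value below 16384, or no value at all, is a candidate for the change above.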

s. Let's also modify the default behaviour of the Spatial Harvester plugins (csw_harvester, waf_harvester, etc.). By default, the import stage of harvesting stops if validation of the harvested document fails. As a consequence, not all metadata will be imported into your CKAN catalogue. The default behaviour may be acceptable depending on your business use cases. However, for our demonstration of the csw_harvester plugin later in CKANHarvestingGuide, add the following ckanext.spatial.harvest.continue_on_validation_errors option to CKAN's ini file. Please refer to the official Spatial Harvester plugin website for further details about the harvester's default behaviour.

## Spatial Harvesters (csw_harvester) Settings
ckanext.spatial.harvest.continue_on_validation_errors = True

t. Well done! You should now have CKAN's Geospatial extension set up. However, the spatial extensions setup is not finished yet. We need to install one more extension, the Remote harvesting extension (ckanext-harvest), and enable its plugins; otherwise the cswserver plugin of the Geospatial extension will not work.

Install Remote harvesting extension for CKAN

Install the Extension and its Required Dependencies

Before you begin this section, ensure that you still have your CKAN's Python Virtual Environment activated.

a. The ckanext-harvest extension can use two different backends, RabbitMQ and Redis. We will use the RabbitMQ backend (the default) for our harvest extension:

(default) $ sudo apt-get install rabbitmq-server

b. Install the ckanext-harvest extension into your CKAN's Python Virtual Environment (virtualenv):

(default) $ pip install -e git+https://github.com/okfn/ckanext-harvest.git@release-v2.0#egg=ckanext-harvest

Note: For the same reason as for ckanext-spatial (mentioned above), you need to install release 2.0 of ckanext-harvest; otherwise you will encounter issues with remote harvesting later on.

c. Install Python modules required by the extension.

(default) $ pip install -r /usr/lib/ckan/default/src/ckanext-harvest/pip-requirements.txt

d. Enable the two remote harvesting plugins (harvest and ckan_harvester) by appending them to ckan.plugins in CKAN's ini file (/etc/ckan/default/production.ini):

ckan.plugins = stats json_preview recline_preview spatial_metadata spatial_query csw_harvester doc_harvester waf_harvester wms_preview spatial_harvest_metadata_api cswserver harvest ckan_harvester

Configure Remote harvesting extension for CKAN

e. Create the database tables required by the Remote harvesting extension:

(default) $ paster --plugin=ckanext-harvest harvester initdb --config=/etc/ckan/default/production.ini

f. Restart Jetty and Apache Web Servers:

(default) $ sudo service jetty restart

(default) $ sudo service apache2 restart

g. Check that the harvest page is up in your CKAN instance (replace localhost with your server address):

http://localhost/harvest

You should see the following CKAN harvest page if you have successfully installed the Remote harvesting extension:

ckan-harvest-page.png

h. Check that the CSW (Catalogue Service for the Web) server is up (replace localhost with your server address):

http://localhost/csw?request=GetCapabilities&service=CSW

ckan-cswserver-page.JPG
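
The same check can be made from the command line instead of a browser. The following is a minimal sketch, assuming a CSW 2.0.2 style response whose root element is named Capabilities (with or without a namespace prefix); the function name is_csw_capabilities is our own, and the pattern match is deliberately crude rather than a real XML parse.

```shell
# Hypothetical helper: read a response on standard input and succeed if
# it contains an opening Capabilities element, e.g. <csw:Capabilities ...>.
is_csw_capabilities() {
    grep -q '<\([A-Za-z0-9_]*:\)\{0,1\}Capabilities[[:space:]>]'
}

# Possible use on a real system (replace localhost with your server):
#   curl -s "http://localhost/csw?request=GetCapabilities&service=CSW" |
#       is_csw_capabilities && echo "CSW server is up"
```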

What's next?

In summary, you should now have both the Geospatial extension and the Remote harvesting extension for CKAN installed.

Next, in CKANHarvestingGuide, we will show you how to harvest metadata from GeoNetwork into your CKAN instance using the csw_harvester plugin, and how to harvest metadata from your CKAN instance into a GeoNetwork instance via the cswserver plugin.

 
Topic revision: r8 - 10 Feb 2014, RichardGoh
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).