
Data Access Protocol / Web Coverage Services


Introduction

Data sub-setting is the process of extracting a cut-down version of the data from a larger dataset. The Data Access Protocol (DAP) is a simple request-response protocol that uses HTTP to move scientific data from server to client. The datasets covered here are netCDF, BSQ, BIP and BIL datasets.

OPeNDAP is a non-profit corporation that steers the growth of the DAP through a software framework that simplifies all aspects of scientific data networking, allowing simple access to remote data.

This page illustrates findings on WCS gateways (clients & servers) for netCDF datasets based on:

Datasets

NetCDF Files

A netCDF file can be described as "self-documenting data". But NetCDF is also:
  • A data model for multidimensional & structured scientific data: variables, dimensions, attributes, coordinates
  • A set of APIs (C, Java, Fortran, C++, Python) for data access
  • A reference implementation for the APIs

BSQ, BIL & BIP Files

The initials stand for band-sequential, band-interleaved-by-line, and band-interleaved-by-pixel, respectively. BIL, BIP, and BSQ files are binary files. They are not image formats in themselves but schemes for storing the actual pixel values of an image in a file.
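The three schemes differ only in the order in which the row, column and band indices vary. As a minimal sketch, the position of a single value in each layout could be computed as follows. The function name `pixel_offset` and the sample image dimensions are illustrative assumptions; offsets are counted in samples rather than bytes, and the file is assumed to have no header.

```python
def pixel_offset(layout, row, col, band, rows, cols, bands):
    """Return the 0-based sample index of (row, col, band) in a headerless raster.

    Multiply the result by the sample size in bytes to get a byte offset.
    """
    if layout == "BSQ":   # one complete band after another
        return band * rows * cols + row * cols + col
    if layout == "BIL":   # for each row, one line per band
        return row * bands * cols + band * cols + col
    if layout == "BIP":   # all band values of a pixel stored together
        return (row * cols + col) * bands + band
    raise ValueError(f"unknown layout: {layout}")


# Hypothetical 2-row, 3-column, 2-band image; locate row 0, column 1, band 1.
for layout in ("BSQ", "BIL", "BIP"):
    print(layout, pixel_offset(layout, row=0, col=1, band=1, rows=2, cols=3, bands=2))
```

The same pixel lands at a different position in each layout, which is why a reader must know the scheme (and the image dimensions) before it can extract a subset.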

Some of these files are quite large. There is a need to access small subsets of large datasets remotely, which can be done by connecting an OPeNDAP client to an OPeNDAP data server.

DAP Servers and Services

Several DAP servers are available; this page covers Hyrax and the THREDDS Data Server. A more comprehensive list can be found under Available DAP Servers.

Hyrax

  • Supports multiple protocols
    • Data: DAP using HTTP/GET and HTTP/SOAP; Direct access (via HTTP); WCS/WFS funded, in development
    • Catalog: THREDDS; HTML directories
  • Data formats: In binary distribution: NetCDF; HDF4; HDF5; FreeForm; many more available as source code.
  • Includes ASCII data dump, HTML data access form, Info metadata page

Hyrax Architecture

  • Two (or more) cooperating processes:
    • Front-end (OLFS): Java, processes requests, DAP interface
    • Back-end (BES): C++, builds responses, reads data
  • Both parts can be customized
    • Front-end: different network protocols
    • Back-end: different data formats/systems
  • N-Tier design is flexible, secure

Subsetting data with Hyrax

OPeNDAP has sophisticated methods for data subsetting. The first step is to get information about the data.

A user may, however, choose to sample the dataset simply by modifying the submitted URL. This is done with a constraint expression appended to the URL after a "?".

Examples of subsetting using constraint expressions:
http://oceans.univ.edu/cgi-bin/nc/expl/buoys.nc?temp
http://oceans.univ.edu/cgi-bin/nc/expl/buoys.nc?temp[1:5:100]
http://oceans.univ.edu/cgi-bin/nc/expl/buoys.nc?u&lat>15.0

See the OPeNDAP constraint expression user guide for the full syntax.
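Constraint expressions like these can also be assembled programmatically. The following is a minimal sketch, not part of any OPeNDAP client library: the helper names `hyperslab` and `constrain` are hypothetical, and the array range is written in the `[start:stride:stop]` form that DAP2 servers are assumed to expect. Some HTTP clients may additionally require the brackets and comparison operators to be percent-encoded.

```python
def hyperslab(start, stop, stride=1):
    """Format an index range as a DAP2-style hyperslab (assumed form [start:stride:stop])."""
    if stride == 1:
        return f"[{start}:{stop}]"
    return f"[{start}:{stride}:{stop}]"


def constrain(dataset_url, projection=(), selection=()):
    """Append a constraint expression: comma-separated projection, then &-prefixed selections."""
    expression = ",".join(projection)
    for clause in selection:
        expression += "&" + clause
    return f"{dataset_url}?{expression}" if expression else dataset_url


BASE = "http://oceans.univ.edu/cgi-bin/nc/expl/buoys.nc"

print(constrain(BASE, ["temp"]))                         # a whole variable
print(constrain(BASE, ["temp" + hyperslab(1, 100, 5)]))  # every 5th value from index 1 to 100
print(constrain(BASE, ["u"], ["lat>15.0"]))              # a variable filtered by a selection
```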

An easier way to sample data without writing a constraint expression is to append .html to the dataset URL, which opens the HTML data access form:

http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz.html
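The `.html` suffix is one of several response suffixes a DAP server may answer to. The sketch below builds the URLs for the responses mentioned on this page (data access form, ASCII dump, info page) plus the structure and attribute descriptions. The suffix table reflects common DAP2 conventions and is an assumption here, so check what your server actually supports; `response_url` is a hypothetical helper.

```python
# Response suffixes commonly offered by DAP2 servers (assumed; verify against your server).
DAP_SUFFIXES = {
    "structure": ".dds",    # variable names, types and dimensions
    "attributes": ".das",   # metadata attached to the variables
    "ascii": ".ascii",      # ASCII data dump
    "form": ".html",        # HTML data access form
    "info": ".info",        # human-readable metadata page
}


def response_url(dataset_url, kind, constraint=""):
    """Return the URL for one response type, optionally with a constraint expression."""
    url = dataset_url + DAP_SUFFIXES[kind]
    return f"{url}?{constraint}" if constraint else url


DATASET = "http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz"

for kind in DAP_SUFFIXES:
    print(kind, response_url(DATASET, kind))
```

Requesting the structure and attribute responses first is a convenient way to find the variable names and array sizes needed to write a constraint expression.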

THREDDS Data Server

  • Java Servlet network interface
  • Supports multiple protocols
    • Data: DAP; WCS; NetCDF Subset; Direct access (via HTTP)
    • Catalog: THREDDS
  • Data formats: NetCDF; HDF5; GRIB-1,2; NEXRAD; DORADE; BUFR; DMSP; GINI; more in development
  • Can also read from any other DAP server
  • Can serve aggregations

THREDDS Architecture

THREDDS Data Services

Bulk file transfer
  • HTTP Server (any file)

Remote access, subsetting CDM files
  • OPeNDAP (any CDM file)
  • Web Coverage Server (grids)
  • NetCDF Subset Service (grids)
  • Web Map Server (grids)

THREDDS Web Coverage Services

NetCDF Subset Service Reference

The NetCDF Subset Service is an experimental REST web service for subsetting CDM scientific datasets. The subsetting is specified using earth coordinates, such as lat/lon bounding boxes and date ranges. The data arrays are subsetted but not resampled or reprojected, so the original data values are preserved.

Summary of Subsetting Parameters

  • Specify variables
    • var=name of variables, separated by ',' (comma)
    • Example: var=QC,LZT,PQ

  • Specify lat/lon bounding box
    • Example: north=17.3&south=12.088&west=140.2&east=160.0

  • Specify lat/lon point
    • Example: latitude=17.3&longitude=140.2

  • Specify station list
    • stn=name of stations, separated by ',' (comma)
    • Example: stn=KDEN,KPAL,SDOL

  • Time range
    • Example: time_start=2007-03-29T12:00:00Z&time_end=2007-03-29T13:00:00Z (between 12 and 1 pm Greenwich time)

  • Time point
    • Example: time=present

  • Return format
    • Specify the return format(s) that you want by using the accept parameter
    • Example: accept=application/x-netcdf requests a netCDF file
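The parameters above combine into a single query string. As a minimal sketch, the helper below (`ncss_query`, a hypothetical name) assembles one from the parameter names listed in this summary using only the Python standard library. The check that north exceeds south is a sanity check added here, not a documented server rule.

```python
from urllib.parse import urlencode


def ncss_query(variables, bbox=None, point=None, time_range=None, accept=None):
    """Build a NetCDF Subset Service query string from the summary's parameters."""
    params = {"var": ",".join(variables)}
    if bbox:
        north, south, west, east = bbox
        if north <= south:
            raise ValueError("north must be greater than south")
        params.update(north=north, south=south, west=west, east=east)
    if point:
        params["latitude"], params["longitude"] = point
    if time_range:
        params["time_start"], params["time_end"] = time_range
    if accept:
        params["accept"] = accept
    # Keep ',', ':' and '/' readable instead of percent-encoding them.
    return urlencode(params, safe=",:/")


print(ncss_query(
    ["QC", "LZT", "PQ"],
    bbox=(17.3, 12.088, 140.2, 160.0),
    time_range=("2007-03-29T12:00:00Z", "2007-03-29T13:00:00Z"),
    accept="application/x-netcdf",
))
```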

Resources

Generally the resource URLs look like:

http://servername:8080/thredds/ncss/{path/dataset}
http://servername:8080/thredds/ncss/grid/{path/dataset}

The view of a resource (subset of dataset):

http://servername:8080/thredds/ncss/{path/dataset}?{subset}

The desired representation of the resource is specified using the accept parameter:

http://servername:8080/thredds/ncss/{path/dataset}?{subset}&accept={mime-type}
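The URL patterns above can be filled in with a small helper. This is a sketch under stated assumptions: `ncss_url` is a hypothetical function, `servername` and `path/dataset` are the placeholders from the patterns rather than a real server, and port 8080 is simply the port shown above.

```python
def ncss_url(server, dataset_path, subset="", accept="", grid=False, port=8080):
    """Fill in the NetCDF Subset Service URL patterns shown above."""
    prefix = "ncss/grid" if grid else "ncss"
    url = f"http://{server}:{port}/thredds/{prefix}/{dataset_path.lstrip('/')}"
    # The subset and the accept parameter are both optional parts of the query.
    query = [part for part in (subset, f"accept={accept}" if accept else "") if part]
    return url + ("?" + "&".join(query) if query else "")


print(ncss_url("servername", "path/dataset"))
print(ncss_url("servername", "path/dataset", grid=True))
print(ncss_url("servername", "path/dataset",
               subset="var=QC,LZT,PQ", accept="application/x-netcdf"))
```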
Topic revision: r18 - 02 Nov 2010, JacquelineGithaiga
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).