Virtual Geophysics Laboratory User Guide

Overview

This guide is tailored for the end user of the Virtual Geophysics Laboratory (VGL). It is split into two main parts. The first part gives a general overview of what a typical "Job" or "Workflow" within VGL looks like and how you interact with it. The second part covers the specifics of a few workflows explicitly supported by VGL.

Part I General Usage

In general, VGL workflows can be broken down into three main phases. The first phase involves discovering and then selecting data sets, the second involves building a script to process that data, and the third involves collecting/publishing the results of the processing.

Data Selection

When VGL is first loaded you will be presented with the data selection page (see Figure 1). It consists of a list of available data sets and a viewport for visualising the spatial components of those data sets. The data sets are presented as a set of 'layers' that can be added to the map for visualisation.

fig1.png
Figure 1 - Overview of Data selection

To get more information about any given layer (see Figure 2) you can either:
  1. Select the plus icon to expand a longer description of the layer;
  2. Select the data/image icon to find out more information about what data services are powering the layer;
  3. Select the magnifying glass to highlight the layer's spatial region on the map; double clicking the icon will pan the viewport to that spatial region.

fig2.png
Figure 2 - Layer Interactions

Once a layer of interest has been identified it can be visualised on the map by selecting the layer and pressing the "Add Layer to Map" button. The layer's data services will be queried and the responses displayed on the map. The form of the visualisation depends entirely on the data service. For example: layers with a Web Map Service (WMS) will have the appropriate WMS layers overlaid on the map.

Some layers, when added to the map, will have additional visualisation options in the form of filters. If a layer has extra filter options they will be presented in the filter window whenever the layer is selected in the active layers panel (see Figures 3a and 3b).

fig3a.png

Figure 3a - Layer filtering window for geophysics data sets grouped by project type

fig3b.png

Figure 3b - Layer filtering window for map imagery

After interrogating a layer (and its data) visually, the next step is to 'select' the data so that it can be made available to your upcoming processing job. The process of data selection varies slightly depending on the type of data; the specific selection procedures are documented below.

Selecting Coverage Data

A coverage is defined as one or more data variables that vary over a continuous spatiotemporal region. Coverages are typically very large data sets that need to be subset so they can be processed in manageable chunks. VGL allows a coverage to be subset spatially by drawing a bounding box on the viewport using a mouse.
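
As a rough illustration of what spatial subsetting by bounding box means, the sketch below filters a handful of samples in plain Python. This is not how VGL performs the subsetting internally (that is handled by the data service); the function name, the sample values and the bounding box are all hypothetical.

```python
def subset_by_bbox(points, west, south, east, north):
    """Keep the (x, y, value) samples that fall inside the bounding box."""
    return [
        (x, y, value)
        for (x, y, value) in points
        if west <= x <= east and south <= y <= north
    ]


# Hypothetical samples as (longitude, latitude, value) tuples.
samples = [
    (130.0, -25.0, 9.78),
    (131.5, -24.0, 9.79),
    (140.0, -30.0, 9.80),
]

# Hypothetical bounding box, as would be drawn on the viewport.
subset = subset_by_bbox(samples, west=129.0, south=-26.0, east=132.0, north=-23.0)
print(subset)
```

Only the samples inside the box are kept, which is why a smaller box gives a smaller, more manageable download.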

Coverage data selection is initiated by clicking on the 'Select Data' button on the viewport (Figure 4a). If the selected spatial bounding box or region contains more than one coverage data set, the coverage data selection window will be displayed with a list of available coverage data sets. There is an edit icon for each data set listed. When the icon is clicked, the metadata for that data set will be displayed (Figure 4b), where you can change details such as the data format you would like to capture the coverage in and where the data should be stored. The data storage location is important because it is how you will access the data from your job script (more on this later). Remember to click on the 'Save Changes' button once changes have been made in the coverage metadata editing window. To make the coverage data sets available to your job script, select one or more data sets using the checkboxes and press the 'Capture Data' button.

fig4a.png
Figure 4a - Coverage data selection

fig4b.png
Figure 4b - Coverage metadata editing window
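
The storage location set in the coverage metadata editing window is the path your job script will later read from. The sketch below shows one way a script might read a captured CSV file. It assumes a file name of "coverage.csv" and three numeric columns, neither of which is defined by VGL, and it writes a small stand-in file first so that it can be run outside VGL.

```python
import csv
import os
import tempfile

# Stand-in for a file VGL would place in the job environment. The name
# "coverage.csv" and its three columns are hypothetical.
work_dir = tempfile.mkdtemp()
input_path = os.path.join(work_dir, "coverage.csv")
with open(input_path, "w") as handle:
    handle.write("130.0,-25.0,9.78\n131.5,-24.0,9.79\n")


def read_samples(path):
    """Read every CSV row into a tuple of floats."""
    samples = []
    with open(path, "r") as handle:
        for row in csv.reader(handle):
            samples.append(tuple(float(field) for field in row))
    return samples


samples = read_samples(input_path)
print(len(samples))
```

Inside a real job you would drop the stand-in file and pass the storage location you configured to read_samples instead.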

Selecting Model Data

Simulation models differ from coverages in that there aren't any data services to subset or query against. Instead the entire model file will need to be downloaded and made available to a job for processing. To select one or more model files you will need to select the spatial region of that model in the viewport. Upon selection you will be shown a popup (Figure 5) containing information about the model, a link back to the library where this model is cataloged and a list of files associated with the model.

fig5.png
Figure 5 - Model Selection

The model files will have an edit icon (Figure 6) that, when clicked, will show the file metadata (Figure 7), where you can change the file's location, name and description. The most important piece of information is where the model file should be stored. The data storage location is important because it is how you will access the data from your job script (more on this later). Remember to click on the 'Save Changes' button once changes have been made in the model file metadata editing window. To make these model files available to your job script, select one or more files using the checkboxes and press the 'Capture selected' button.

fig6.png
Figure 6 - Available model files selection window

fig7.png
Figure 7 - Model file metadata editing window

Job Construction

Once a suitable set of data has been collected, the next step is to build a processing job to actually do something useful with the data you've selected. To access this step select the 'Submit Jobs' link next to the VGL banner. You will be required to authenticate with an OpenID provider before continuing. Please note that all steps in the job construction phase come in the form of a 'Task Wizard', where you will be shown a sequence of forms which can be advanced/reversed by pressing the 'Next'/'Previous' buttons.

The first step in creating a new job is to assign it to a series (Figure 8). A series is a way of organizing like jobs for easier access. You can either create a new series here by selecting the 'New Series' radio button or you can select an existing one from the combo box. Selecting an existing series will show you a list of all jobs that currently belong to that series. Right click an existing job to show a list of actions that can be applied to the selected job. Once you are happy with the selected series, press 'Next'.

fig8.png
Figure 8 - Job series selection

The next step (Figure 9) involves adding a brief description of the job you are creating as well as selecting a compute provider, a storage provider, a toolbox and a resource selection.

A Compute/Storage Provider is a research or commercial entity that provides computing infrastructure (a pool of resources such as compute, storage and networking) for running computationally intensive jobs or workflows. As of release 1.1, VGL provides compute and storage resources from the National Computational Infrastructure (NCI) in Canberra and National eResearch Collaboration Tools and Resources (NeCTAR) in Melbourne.

A Toolbox defines a set of pre-installed software and libraries that will be made available to your processing script (more on this later) at startup. Certain toolboxes will be restricted to authorised users only due to licensing reasons.

The resource selection allows you to choose how much computing power and memory you wish to allocate to this job.

After entering the job details, press 'Next'.

fig9.png
Figure 9 - Job metadata

Now it's time to review the job input files (Figure 10) you selected during the 'Data Selection' phase; they should all be listed on this page. You can also add additional inputs in the form of remote HTTP downloads or files uploaded from your PC. If you plan on processing a large data set, it is recommended that you make it accessible via a public URL instead of uploading it directly via this form.

fig10.png
Figure 10 - Job inputs

Finally, it's time to define a Python script (Figure 11) that will be executed in an environment where it has access to all of the configured input files. There are a few important things you should know about the environment executing the script:
  1. The script will be executed using a Python 2.7 environment.
  2. All input files will be available as soon as the job starts executing.
  3. There will always be a utility program called 'cloud' installed on the PATH for simplifying access to cloud storage. It has the following commands:
    cloud upload [uploadedFileName] [file]
    cloud download [cloudFileName] [outputFile]
    cloud list
  4. You can create as many temporary/output files as you wish, limited only by the available disk space.
  5. As soon as this script finishes executing the entire environment will be destroyed, including all results. To persist any outputs you will need to upload them using the cloud command.

To aid in the construction of your Python script there are also a number of code snippets/templates that can be added to the code window. Most of these snippets are specific to a single workflow and are explained in more detail later in this guide.
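
Because the environment is destroyed when the script finishes, a script will normally end by uploading its outputs with the 'cloud' utility listed above. The sketch below wraps the documented 'cloud upload [uploadedFileName] [file]' form using the standard subprocess module. The helper names, the dry_run flag and the file name "result.txt" are hypothetical and not part of VGL; the dry run lets you check the command without the 'cloud' utility being installed.

```python
import subprocess


def build_upload_command(cloud_name, local_path):
    """Build the argument list for: cloud upload [uploadedFileName] [file]."""
    return ["cloud", "upload", cloud_name, local_path]


def persist_output(cloud_name, local_path, dry_run=False):
    """Upload one output file, or only return the command when dry_run is set."""
    command = build_upload_command(cloud_name, local_path)
    if not dry_run:
        # Raises CalledProcessError if the cloud utility reports a failure.
        subprocess.check_call(command)
    return command


# Hypothetical output file name; dry_run=True avoids calling 'cloud' here.
print(persist_output("result.txt", "result.txt", dry_run=True))
```

Inside a real job you would leave dry_run at its default so that the upload actually runs before the script exits.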

Once the script has been finalised, press 'Next' and you will have the option of reviewing all of the input files (and the script you just created) on the next form. Pressing 'Submit' will start your processing job. If the submission succeeds, you should be redirected to the job monitoring page.

fig11.png
Figure 11 - Script Builder with example script

Job Monitoring

The final piece of the workflow involves monitoring a job's execution and its outputs. Initially you will only be shown a list of series on this page; selecting a series will display the set of jobs belonging to that series. Selecting a job (Figure 12) will allow you to interrogate the input/output files and execution logs for the job along with the names/descriptions configured during job creation. A job whose status is 'Pending' or 'Active' may continue to create output files, so the displayed list of files may NOT be exhaustive.

fig12.png
Figure 12 - Job Monitoring

When a job is 'Pending' or 'Active', you can cancel its execution by first selecting the job and then invoking the 'Cancel job' action from either the Actions dropdown menu of the jobs selection panel (Figure 12b) or the job's right-click context menu (Figure 12a). Once the job is cancelled, you can edit the job and re-submit it for processing. All output files generated by the previous execution will be discarded.

fig12a.png fig12b.png
Figure 12a - Cancelling a 'Pending' job by using the context sensitive menu
Figure 12b - Cancelling a 'Pending' job by using the dropdown menu

If the results of a job are worth keeping, select the 'Register to GeoNetwork' button. A job registration details window (Figure 13b) will then be displayed, where you can enter your contact details and other details associated with the job. These details will be stored in VGL for subsequent use. Once the 'Register' button is clicked, VGL will persist the results and generate an ISO 19115 metadata record that describes the entire process used to generate the job's results. The resulting metadata record will be stored in an instance of GeoNetwork associated with VGL. You can access the record (after registration) by selecting the registered job and inspecting the 'Registered URL' detail under the description tab (Figure 13a).

fig13.png
Figure 13a - Job monitoring with a registered Job

fig13a.png
Figure 13b - Job registration details

Finally, a job can always be deleted or duplicated by right clicking the job and selecting the appropriate action. A job with 'Saved' status cannot be duplicated. Duplicating a job copies all metadata and remote service downloads by default; the remaining input/output files can optionally be copied across into the duplicated job (Figure 14).

fig14.png
Figure 14 - Duplicate job files

Part II Specific Workflows

The following usage notes describe how to use the various script snippets/toolboxes within VGL. Please make sure you read and understand the above guide first.

Magnetic/Gravity Inversions

Using UBC-GIF

For these script snippets you will need to ensure that you have selected the coverage to be captured as CSV and have selected the UBC-GIF toolbox. Please note that the UBC-GIF toolbox is restricted to licensed users only.

When using the UBC-GIF script templates you will be prompted for the input CSV file and the associated spatial bounds using UTM coordinates. If the bounds were selected using VGL these values will be auto populated. The only remaining fields to be filled out are the sizes of the cells to use during the inversion process.
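
To get a feel for how the cell sizes interact with the spatial bounds, the sketch below estimates how many cells a regular grid would need to cover a set of UTM bounds. It is a back-of-the-envelope calculation only, not the mesh construction used by the UBC-GIF templates, and the function name, bounds and cell sizes are all hypothetical.

```python
import math


def cell_counts(min_east, max_east, min_north, max_north, cell_x, cell_y):
    """Number of cells needed to cover the bounds in each horizontal direction."""
    nx = int(math.ceil((max_east - min_east) / float(cell_x)))
    ny = int(math.ceil((max_north - min_north) / float(cell_y)))
    return nx, ny


# Hypothetical UTM bounds (metres): 20 km east-west by 15 km north-south,
# covered with hypothetical 500 m cells.
nx, ny = cell_counts(500000.0, 520000.0, 7000000.0, 7015000.0, 500.0, 500.0)
print("%d x %d = %d cells" % (nx, ny, nx * ny))
```

Halving the cell size in both directions quadruples the number of horizontal cells, so smaller cells can be expected to increase the work the inversion has to do.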

Using eScript (Gravity Inversions Only)

For this script snippet you will need to ensure that you have selected the coverage to be captured as NetCDF and have selected the eScript toolbox.

When using the eScript script template you will be prompted for the input NetCDF file.

Geodynamics Simulations

Using GOCAD Models + Underworld

For this script snippet you will need to have captured a GOCAD model and an associated CSV key describing the various parameters inside the model.

Part III Step-by-Step Tutorials

-- JoshVote - 15 Oct 2012
Topic attachments
Tutorial_on_using_Underworld_in_VGL_(Draft_2).docx (1042.9 K, 12 Nov 2012 - 10:57, RichardGoh) - Underworld tutorial
Topic revision: r16 - 25 Mar 2014, FlorenceTan
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).