"Seegrid will be due for a migration to confluence on the 1st of August. Any update on or after the 1st of August will NOT be migrated"

Performance Test

ELDA Performance Test

Load Testing Against an Elda-Based System - from Brian McBride of Epimorphics :

Server Configuration : An Amazon large instance (2 cores / 8 GB RAM)

Test Scenario : Running a custom Java application that simulated a mix of users performing different tasks on the system.

Result : Supported 1024 simulated users with sub-1-second response times on a single machine, and 2048 users on a dual-machine cluster.
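The shape of such a test harness can be sketched in Python (the original was a custom Java application; the worker function, task mix, and thread count below are illustrative stand-ins, not the actual harness):

```python
import concurrent.futures
import random
import time

def simulated_user_task(task_name):
    """Stand-in for one user action (e.g. an HTTP GET against the API).
    A short sleep mimics server latency; a real harness would issue a request."""
    started = time.perf_counter()
    time.sleep(random.uniform(0.001, 0.005))  # placeholder for the real call
    return task_name, time.perf_counter() - started

# A mix of tasks, weighted the way real users might exercise the system.
TASK_MIX = ["list-items", "item-detail", "search"] * 100

with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(simulated_user_task, TASK_MIX))

latencies = sorted(t for _, t in results)
p95 = latencies[int(len(latencies) * 0.95)]
print(f"{len(results)} requests, 95th-percentile latency {p95 * 1000:.1f} ms")
```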

Note : It is important to bear in mind that the performance of a particular system will depend very much on how the overall system is configured and how the API is designed. The performance of the system above improved considerably with some relatively straightforward adjustments to the configuration.

Observations on How to Configure the System - from Stuart Williams of Epimorphics :

Observed Scenario : Bathing water quality site deployed by the UK Environment Agency. The data is relatively slow changing: the triplestore content is updated at most about once daily, and more usually on a weekly basis.

Architecture : 'Scalable' architecture with multiple publication servers behind a pair of load balancers. Publication servers can be added or removed based on demand.

Observations :

We use an Apache configuration to inject an expiry time of now+1 hour on LDA-generated responses and run a sizeable Apache cache on each of our publication servers. Given the relatively static nature of the content we are serving, this significantly reduces the load reaching ELDA and thence the SPARQL endpoints behind it.
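In the deployment described this is done in the Apache front-end, but the header logic being injected can be illustrated in Python (the function name and one-hour figure here are illustrative):

```python
import time
from email.utils import formatdate

def expiry_headers(max_age_seconds=3600):
    """Build the cache headers the front-end injects: expire at now + 1 hour.

    formatdate(..., usegmt=True) produces an RFC-compliant HTTP date string,
    which is what the Expires header requires.
    """
    return {
        "Cache-Control": f"max-age={max_age_seconds}",
        "Expires": formatdate(time.time() + max_age_seconds, usegmt=True),
    }

headers = expiry_headers()
print(headers["Cache-Control"])  # max-age=3600
```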

As we are running a clustered configuration, it is also important to get any emitted ETags right (i.e. consistent across the cluster). Elda can be configured to generate ETags based on the response graph content and media type of the response [2] - which keeps the ETag consistent across clustered responses (for the same URI), but different for different media types and, with high probability, different following a content change. This is particularly important if you are serving IE clients, because they have very dumb web caches which do not cope well with content-negotiated responses (responses with Vary headers) - see [1].
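A content-derived ETag of this kind can be sketched as a hash over the canonicalised graph plus the media type (this is an illustration of the idea, not Elda's actual implementation):

```python
import hashlib

def graph_etag(graph_triples, media_type):
    """Derive an ETag from response graph content plus media type.

    Canonicalise by sorting the triples, so every node in the cluster
    computes the same tag for the same graph regardless of the order
    in which its SPARQL backend returned results.
    """
    canonical = "\n".join(sorted(graph_triples)) + "|" + media_type
    return '"%s"' % hashlib.sha1(canonical.encode("utf-8")).hexdigest()

triples = ["<a> <p> <b> .", "<b> <p> <c> ."]

# Same graph, same media type -> same ETag, even if triple order differs:
assert graph_etag(triples, "text/turtle") == graph_etag(list(reversed(triples)), "text/turtle")
# Different media type -> different ETag:
assert graph_etag(triples, "text/turtle") != graph_etag(triples, "application/json")
```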

We also use the Apache front-end to limit the number of active requests it allows through to the backend (ELDA), typically to 5. In particular, this limits the number of SPARQL queries that are in flight behind an Elda instance. Given that we have only a single disk volume supporting the SPARQL engine, there is little point in letting it get congested waiting on disk head movements. What is best to do will vary with the SPARQL implementation. The one that we use, Fuseki, runs its indexes as memory-mapped files and uses the OS virtual memory system to pull pages in and out of memory. Given a large enough heap, most if not all of the indexes will become memory resident, and disk movement becomes less of an issue.
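The effect of that front-end cap can be illustrated with a semaphore (in the deployment above it is Apache, not application code, that enforces the limit; the names and request count below are illustrative):

```python
import threading
import time

MAX_INFLIGHT = 5                       # the limit of 5 mentioned above
backend_slots = threading.BoundedSemaphore(MAX_INFLIGHT)
inflight = 0
peak = 0
lock = threading.Lock()

def handle_request(_):
    """Each request must acquire a backend slot before querying SPARQL."""
    global inflight, peak
    with backend_slots:                # block until a backend slot is free
        with lock:
            inflight += 1
            peak = max(peak, inflight)
        time.sleep(0.01)               # stand-in for the SPARQL query
        with lock:
            inflight -= 1

threads = [threading.Thread(target=handle_request, args=(i,)) for i in range(40)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("peak concurrent backend requests:", peak)  # never exceeds MAX_INFLIGHT
```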

We also run timeouts on requests with an upper bound at around 60 seconds - though it is rare (not impossible, but rare) for a request to run that long.

Bear in mind that LDA will fully materialise a result graph in memory (and keep it in a result cache, independent of the Apache front-end 'page' cache) before serialising the body of a response. If you are expecting large responses, you will have to think about their consumption of Tomcat heap space.

HTML responses are slow to generate due to the use of an XSLT transform to create them. Performance can be very dependent on the XSLT engine that you use - both heap/stack usage and throughput.

Lastly, some kinds of API endpoint may be more problematic than others. In the case of the bathing water data, we have a couple of API endpoints that return the nearest bathing waters to some point of interest. The UK national grid is a 1 m grid

http://environment.data.gov.uk/id/nearest-bathing-water/easting/{easting}/northing/{northing}

so, as you can see, small changes in position lead to different URIs without a significant change in response. Such endpoints have a real cache-defeating effect: in the mobile-client case, the likelihood of your clients being sufficiently co-located to generate a cache hit is virtually zero, and each miss will force a query through to the backend. If this becomes a problem for us, we are likely to use some crude URI rewriting to round request URIs to, say, a 1 km grid or even a 10 km grid, and score some cache hits in favour of absolute accuracy with respect to the nearest bathing waters.
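The rounding idea can be sketched as a rewrite that snaps coordinates to a coarser grid before they reach the backend (the 1 km cell size and helper names below are illustrative, not a deployed rewrite rule):

```python
def snap_to_grid(easting, northing, cell_size=1000):
    """Round 1 m national-grid coordinates to the centre of a coarser cell,
    so that nearby clients produce the same request URI (and hence cache hits)."""
    snap = lambda v: (v // cell_size) * cell_size + cell_size // 2
    return snap(easting), snap(northing)

def rewrite_uri(easting, northing):
    e, n = snap_to_grid(easting, northing)
    return (f"http://environment.data.gov.uk/id/nearest-bathing-water"
            f"/easting/{e}/northing/{n}")

# Two clients ~100 m apart now request the same cached URI:
assert rewrite_uri(432117, 101560) == rewrite_uri(432190, 101498)
```

The trade-off is exactly the one described: the answer is the nearest bathing water to the cell centre rather than to the client's true position.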

If your data is mostly static, which would be the case with SISSVoc, your best 'defence' is an Apache web cache, and you should be able to set the expiry time much longer than the 1 hour that we're using.

Load Testing Against SISSVoc Release 3.1 - from Arya Abdian of BOM :

Virtual Machine Specification : 1 x vCPU Xeon 2.93GHz with 1GB RAM

Test Scenario : 20 concurrent wget clients (crawling the entire site), averaging 40 requests per second

Result :
##################################
#  INTERIM LOADGEN STATS REPORT  #
##################################
Test started : 2012-09-17 16:11:32
Current time : 2012-09-17 22:18:22
Co. Instances: 20
Test interval: 293 minutes
Test runtime : 360 minutes
Requests sent: 715517
Requests/min : 2442.04
Requests/sec : 40.70
----------------------------------
HTTP response summary at 293 mins:
365003 (51.01%) - 200 OK
312226 (43.64%) - 302 Found
31951 (4.47%) - 303 See Other
5344 (0.75%) - 404 Not Found
145 (0.02%) - No data received.
848 (0.12%) - Read error (Connection timed out).
715517 (100.01%) - (Total)
##################################
#  End of Report                 #
##################################

[1] http://www.fiddler2.com/fiddler/perf/aboutvary.asp
[2] http://elda.googlecode.com/hg/deliver-elda/src/main/docs/advanced.html
Topic revision: r4 - 12 Oct 2012, FlorenceTan
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).