
APAC Project Requirements


Job Monitoring



Use Cases



  • APAC-JobMon-UseCaseDiagram1.gif




CRCs



| SERVICE | COLLABORATOR |
| Provides info on job status: running, pending, halt, idle, fail | |
| Provides info on usage (%): cpu(s), disk(s) | Hardware Resource |
| Notifies of job completion or failure | Application Service (e.g. SnarkService, EarthByte) |
| Notifies of reaching progress checkpoints | Application Service (e.g. SnarkService, EarthByte) |
| Keeps a record of when a job was submitted and completed (timestamping) | Job Management |
| Provides details about the host that is executing the job | Resource Registry |
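
As a rough illustration only, the responsibilities in the CRC card above could map onto a service interface along the following lines. Every name here (`JobMonitor`, `ResourceUsage`, `JobTimestamps`, and the method names) is hypothetical and not part of the APAC design; the sketch simply assigns one method to each responsibility and notes the collaborator in the docstring.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ResourceUsage:
    cpu_percent: float   # share of CPU in use, 0-100
    disk_percent: float  # share of disk in use, 0-100


@dataclass
class JobTimestamps:
    submitted_at: datetime
    completed_at: Optional[datetime] = None  # None while the job is unfinished


class JobMonitor(ABC):
    """Hypothetical interface: one method per responsibility in the CRC card."""

    @abstractmethod
    def job_status(self, job_id: str) -> str:
        """Return one of: running, pending, halt, idle, fail."""

    @abstractmethod
    def resource_usage(self, job_id: str) -> ResourceUsage:
        """Collaborator: Hardware Resource."""

    @abstractmethod
    def notify(self, job_id: str, event: str) -> None:
        """Report completion, failure, or a checkpoint.

        Collaborator: Application Service.
        """

    @abstractmethod
    def timestamps(self, job_id: str) -> JobTimestamps:
        """Collaborator: Job Management."""

    @abstractmethod
    def host_details(self, job_id: str) -> dict:
        """Collaborator: Resource Registry."""
```

A concrete monitor would subclass `JobMonitor` and delegate each call to the named collaborator.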





Sequence Diagrams




  • APAC-JM004-SeqDiagram.GIF

  • APAC-JM006-SeqDiagram.GIF

  • APAC-JM007-SeqDiagram.GIF


Functional Requirements


  • Users should be able to monitor the status of their jobs online and receive notification when jobs complete or fail. In addition, they should be able to monitor the performance of their jobs.

| Requirement ID | Child Of | Requirement | Comment |
| JM001 | | Check Job Status | |
| JM002 | JM001 | View the job's computational time and computational time remaining | The user should be able to view how much computational time has been used on a job and how much remains. The user selects the job they require the information on, and the system returns the time used and the time remaining. |
| JM003 | JM001 | View the job's resource usage - cpu(s), disk(s) | The user should be able to view a job's resource usage. The user selects a job, and the system returns its CPU and disk usage as percentages. |
| JM004 | JM001 | View the status of the job (running, pending, halt, idle, fail and % of completion) | The user should be able to query the status of their job. The user selects a job to query, and the system returns its status (running, pending, halt, idle or fail) and its % to completion. |
| JM005 | JM001 | View job information (host executing on, time submitted, timestamps etc.) | The user should be able to get information about their job. The user selects a job, and the system returns which host the job is executing on, what time the job was submitted, the IP address it was submitted from, and any other timestamps such as completed at ... |
| JM006 | | List all currently executing jobs for the logged-in user | The user queries the system to view all their currently executing jobs on the system (the Grid). The system returns a list of all current jobs submitted by the user. |
| JM007 | | Receive notification of job completion/failure | The user will want to receive notification of the completion or failure of their job; this could be in the form of an email or an SMS. The notification must include the Job ID, the Job Name, and the reason for the notification (whether the job has completed or failed). |
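
To make JM004 and JM007 a little more concrete, here is a minimal sketch of the data those requirements describe. The class and field names (`JobStatus`, `StatusReport`, `Notification`, `as_text`) are assumptions made for illustration; the requirements only fix the five status values, the % of completion, and the three fields a notification must carry.

```python
from dataclasses import dataclass
from enum import Enum


class JobStatus(Enum):
    # The five states listed in JM004.
    RUNNING = "running"
    PENDING = "pending"
    HALT = "halt"
    IDLE = "idle"
    FAIL = "fail"


@dataclass
class StatusReport:
    """What a JM004 query might return (hypothetical shape)."""
    job_id: str
    status: JobStatus
    percent_complete: float  # 0-100


@dataclass
class Notification:
    """JM007: carries the Job ID, the Job Name, and the reason."""
    job_id: str
    job_name: str
    reason: str  # "completed" or "failed"

    def __post_init__(self):
        if self.reason not in ("completed", "failed"):
            raise ValueError(f"unknown reason: {self.reason!r}")

    def as_text(self) -> str:
        # Plain text so the same body suits an email or an SMS.
        return f"Job {self.job_id} ({self.job_name}) has {self.reason}."
```

For example, `Notification("42", "demo-run", "failed").as_text()` gives `"Job 42 (demo-run) has failed."`. How the text is actually delivered (email, SMS) is left open here, as it is in the requirement.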


  • Users should be able to examine their job history, along with status information, output to stdout and stderr, and performance information. This capability could be enhanced with the ability to archive associated executables, input files, and output files

| Requirement ID | Child Of | Requirement | Comment |
| JM008 | | Job History | The user would like to query the system to retrieve a history of all the jobs that they have submitted. The system will return a list of the jobs submitted, with details including the Job ID and Job Name, whether each job completed or failed (output status), when it was started and completed, etc. |
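
A minimal sketch of the JM008 query, assuming history records are already available as an in-memory list. `HistoryEntry`, its `owner` field, and `job_history` are hypothetical names, and the most-recent-first ordering is an assumption; the requirement does not specify storage or ordering.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class HistoryEntry:
    job_id: str
    job_name: str
    owner: str
    outcome: str                      # "completed" or "failed"
    started_at: datetime
    completed_at: Optional[datetime]  # None if no completion time was recorded


def job_history(entries: List[HistoryEntry], user: str) -> List[HistoryEntry]:
    """Return the given user's jobs, most recently started first."""
    mine = [e for e in entries if e.owner == user]
    return sorted(mine, key=lambda e: e.started_at, reverse=True)
```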





Non-Functional Requirements




Related Works and Documents




Preliminary Discussions

RobertCheung - 20 Apr 2005 Some things to consider -
  • The use case diagram does not state that there could be sub-job monitors that report back to the main job monitor. This tiered approach is necessary due to the fact that existing resources that we want to make use of might already have their own job monitoring mechanisms. (Eg the HPSC people have their own job queuing and monitoring tools).
  • "push" vs "pull" - we need to be more explicit. A "traditional (basic?)" web service tends to be a "pull"-only system, i.e. its tasks are only initiated upon client request.
    • In APAC-JM004-SeqDiagram.GIF it is clearly a pull paradigm.
    • In APAC-JM006-SeqDiagram.GIF, to work around this, you have polling in JMService. JMService then becomes a "push" service. So is JMService both a push and a pull service? Can HardwareResources push? Why/why not? What about the WebPortal? Why does it not push also (via email perhaps)? There are some inconsistencies here, or at the very least some implicit assumptions about the capabilities of the various components in the sequence diagrams. It would be good if this were made explicit.
  • From the point above regarding push/pull, combined with Functional Requirement JM007, "receive notification of job completion/failure", it sounds like you definitely need a push capability in the "web portal" part of the chain. You might also consider "push" capability along the service chain to prevent polling requirements as per JM006-SeqDiagram.
  • Nit picks:
    • None of the Sequence Diagrams actually shows the results going all the way back to the user. (Does the user never get info back? ;>)
    • The diagram's id (eg JM006-Jobcompletion) does not match the list of functional requirements further down (eg JM006 - "list all currently executing jobs")
  • Non-functional requirements to consider -
    • I suggest that security (AAA) of the system be considered a major part of the non-functional requirements.
    • There should be a list of the communication channels that the various components are expected to use (eg https? ssh?). This is important from a practical implementation perspective since some organisational boundaries are tighter than others (eg someone hosting a WebPortal might be able to communicate with the JMService via https, but not ssh, due to organisational constraints). This list does not need to be comprehensive, but by providing a baseline for minimum requirements, more people could see if the design will fulfill their needs.
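
The push/pull distinction raised in the comments above can be sketched as follows. This is an illustration under stated assumptions, not the APAC design: the class name `JMService` is borrowed from the sequence diagrams, but its methods (`get_status`, `subscribe`, `poll_once`) and the `read_status` callable standing in for a pull-only resource are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical callback signature: (job_id, new_status) -> None
Listener = Callable[[str, str], None]


class JMService:
    def __init__(self, read_status: Callable[[str], str]):
        # read_status stands in for a pull-only resource (assumption).
        self._read_status = read_status
        self._last_seen: Dict[str, str] = {}
        self._listeners: List[Listener] = []

    def get_status(self, job_id: str) -> str:
        """Pull: work happens only when a client asks."""
        return self._read_status(job_id)

    def subscribe(self, listener: Listener) -> None:
        """Push: register a callback to be told about status changes."""
        self._listeners.append(listener)

    def poll_once(self, job_ids: List[str]) -> None:
        """Pull from the resource, then push any change to subscribers."""
        for job_id in job_ids:
            status = self._read_status(job_id)
            if self._last_seen.get(job_id) != status:
                self._last_seen[job_id] = status
                for listener in self._listeners:
                    listener(job_id, status)
```

In this sketch the service is both: it pulls from the resource in `poll_once` and pushes to anything registered through `subscribe`. If the resource itself could push, `poll_once` would be unnecessary, which is the trade-off the comments ask to have made explicit.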
Topic attachments
| Attachment | Size | Date | Who | Comment |
| APAC-JM004-SeqDiagram.GIF | 6.7 K | 04 Apr 2005 - 16:25 | RyanFraser | |
| APAC-JM006-SeqDiagram.GIF | 5.6 K | 04 Apr 2005 - 16:25 | RyanFraser | |
| APAC-JM007-SeqDiagram.GIF | 7.1 K | 04 Apr 2005 - 16:26 | RyanFraser | |
| APAC-JobMon-UseCaseDiagram1.gif | 9.6 K | 18 Mar 2005 - 14:44 | RyanFraser | |
Topic revision: r13 - 15 Oct 2010, UnknownUser
 

Current license: All material on this collaboration platform is licensed under a Creative Commons Attribution 3.0 Australia Licence (CC BY 3.0).