Accounting record format used in UNICORE
The OGF UR format is used as a base. The following list specifies which fields are used, with additional notes where needed. RMS stands for Resource Management System (Torque, SGE, ...). If no comment is given, the field is used in its natural, obvious meaning.
Base properties
- RecordIdentity generated, unused
- GlobalJobId Unicore ActionUUID
- LocalJobId RMS job id
- ProcessId unused
- LocalUserId unix login
- GlobalUserName user DN
- JobName Unicore submitted job name
- Charge unused
- Status RMS job status. In case of Torque: D -> aborted; E -> completed; Q -> queued; S -> started. Other statuses are unused.
- WallDuration
- CpuDuration
- EndTime RMS job end
- StartTime RMS job start
- MachineName RMS server hostname
- Host RMS execution hosts. The description attribute gives the number of CPUs used and, if this information is available, lists which slots were used. E.g.
description="CPUS=2;SLOTS=0,4"
This field may be used multiple times.
- SubmitHost unused
- Queue
- ProjectName used to carry information about computing grants, if available.
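The Status mapping and the Host description format above can be sketched in a few lines of Python. This is only an illustration, not the UNICORE implementation: the function names are hypothetical, and it assumes the description always has the `KEY=value;KEY=value` shape shown in the example, with SLOTS being optional.

```python
# Torque status letters and the UR Status values they are mapped to,
# as listed for the Status field. Other letters are unused.
TORQUE_STATUS = {"D": "aborted", "E": "completed", "Q": "queued", "S": "started"}


def map_torque_status(letter):
    """Return the UR status for a Torque status letter, or None if unused."""
    return TORQUE_STATUS.get(letter)


def parse_host_description(description):
    """Parse a Host description such as 'CPUS=2;SLOTS=0,4' into (cpus, slots).

    SLOTS is treated as optional, since slot information is not always
    available. (Hypothetical helper, assumed format.)
    """
    fields = dict(part.split("=", 1) for part in description.split(";") if part)
    cpus = int(fields["CPUS"])
    slots_text = fields.get("SLOTS", "")
    slots = [int(s) for s in slots_text.split(",")] if slots_text else []
    return cpus, slots
```

For the description shown above, `parse_host_description("CPUS=2;SLOTS=0,4")` yields two CPUs and the slot list `[0, 4]`.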
Differentiated Properties
- Network, Disk unused
- Memory used twice, to report vmem and mem from the RMS
- Swap unused
- NodeCount, Processors unused, but these can easily be added (the information can be extracted from the Host fields)
- TimeDuration unused
- TimeInstant from RMS: qtime, ctime, etime (in case of Torque)
- ServiceLevel unused
- Extensions
  - Resource (desc = exit_status) Job exit status.
  - Resource (desc = VO) Virtual organisation name.
  - Resource (desc = attributes) User's SAML attributes.
  - Resource (desc = group) Local user unix group.
  - Resource (desc = infrastructure) What grid middleware or other infrastructure produced the record. Currently we use two values: 'unicore' or 'UNKNOWN'.
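As noted for the node count and processor fields, both values could be derived from the Host entries. A minimal sketch, assuming that the node count is the number of distinct host names and the processor count is the sum of the CPUS values in the descriptions (the function name and the input shape are hypothetical):

```python
def count_nodes_and_processors(hosts):
    """Derive (node_count, processors) from Host entries.

    hosts: iterable of (hostname, description) pairs, where description
    looks like 'CPUS=2;SLOTS=0,4'. Assumed semantics, not part of the UR spec.
    """
    names = set()
    processors = 0
    for name, description in hosts:
        names.add(name)
        for part in description.split(";"):
            key, _, value = part.partition("=")
            if key == "CPUS":
                processors += int(value)
    return len(names), processors


# Hypothetical two-host job used only for illustration.
example_hosts = [("wn1080", "CPUS=2;SLOTS=0,4"), ("wn1081", "CPUS=1;SLOTS=3")]
node_count, processors = count_nodes_and_processors(example_hosts)
```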
Comments
- Exit status, VO and group should be added to the base spec.
- In the case of VO, in UNICORE it can easily happen that the system is only aware of the fact that the user is a member of some VOs. In such a case there are multiple values of the VO element.
- Attributes are rather out of scope: they are used mostly for internal purposes and are not presented anywhere.
- We don't use SubmitHost, though we would like to. However, it is not perfectly clear what to put there: in a grid environment this might be the client's submission machine, the grid job execution service, a grid intermediary service such as a workflow service or broker, or the machine from which the job was submitted to the RMS. Therefore we would like to have a clear description of this element (probably saying that it is the RMS submission machine, which is of little interest to us) and another element for specifying the grid service(s) that submitted the job.
- General note: GridInfrastructure (ARC, gLite, UNICORE, Globus) is another field that we think would be useful. Currently we use an extension for this purpose.
- UR doesn't specify how to provide the information on CPUs used per host; we believe it should.
Example
<ur:Usage xmlns:ur="http://www.gridforum.org/2003/ur-wg">
<ur:RecordIdentity ur:createTime="2011-02-21T10:09:17.027+01:00" ur:recordId="ef4ff33f-7053-484d-8fd3-0d8cc019a4ea"/>
<ur:JobIdentity>
<ur:GlobalJobId>073c26d8-2a30-4ef8-af1b-2a63c06cc344</ur:GlobalJobId>
<ur:LocalJobId>18780</ur:LocalJobId>
</ur:JobIdentity>
<ur:UserIdentity>
<ur:LocalUserId>monitor</ur:LocalUserId>
<ur:GlobalUserName>CN=Monitoring AGENT, OU=PL-Grid Internal, O=ICM, C=PL</ur:GlobalUserName>
</ur:UserIdentity>
<ur:JobName>UNICORE_Job</ur:JobName>
<ur:Status>completed</ur:Status>
<ur:TimeInstant ur:type="etime">2011-02-21T10:09:17+01:00</ur:TimeInstant>
<ur:TimeInstant ur:type="qtime">2011-02-21T10:09:17+01:00</ur:TimeInstant>
<ur:TimeInstant ur:type="ctime">2011-02-21T10:09:17+01:00</ur:TimeInstant>
<ur:Memory ur:storageUnit="KB" ur:type="shared">0</ur:Memory>
<ur:Memory ur:storageUnit="KB" ur:type="physical">0</ur:Memory>
<ur:Resource ur:description="attributes">d24:urn:SAML:voprofile:groupl25:/vo.plgrid.pl/ICM_testbed32:/vo.plgrid.pl/ICM_testbed/agentsee</ur:Resource>
<ur:Resource ur:description="group">users</ur:Resource>
<ur:Resource ur:description="exit_status">0</ur:Resource>
<ur:Resource ur:description="infrastructure">unicore</ur:Resource>
<ur:Resource ur:description="vo">vo.plgrid.pl</ur:Resource>
<ur:MachineName>plgrid-113.icm.edu.pl</ur:MachineName>
<ur:Host ur:description="CPUS=1;SLOTS=0">wn1080</ur:Host>
<ur:Queue>batch</ur:Queue>
<ur:StartTime>2011-02-21T10:09:17+01:00</ur:StartTime>
<ur:EndTime>2011-02-21T10:09:17+01:00</ur:EndTime>
<ur:WallDuration>PT0S</ur:WallDuration>
<ur:CpuDuration>PT0S</ur:CpuDuration>
</ur:Usage>
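A record like the one above can be read with Python's standard `xml.etree.ElementTree`. The sketch below is illustrative only: it works on a trimmed copy of the example, the helper name is hypothetical, and it collects Resource values into lists because an extension such as the VO may appear more than once. Note that the `description` attribute is namespace-qualified in these records.

```python
import xml.etree.ElementTree as ET

UR_NS = "http://www.gridforum.org/2003/ur-wg"
NS = {"ur": UR_NS}

# Trimmed copy of the example record, kept short for illustration.
RECORD = """<ur:Usage xmlns:ur="http://www.gridforum.org/2003/ur-wg">
<ur:JobIdentity>
<ur:GlobalJobId>073c26d8-2a30-4ef8-af1b-2a63c06cc344</ur:GlobalJobId>
<ur:LocalJobId>18780</ur:LocalJobId>
</ur:JobIdentity>
<ur:Status>completed</ur:Status>
<ur:Resource ur:description="group">users</ur:Resource>
<ur:Resource ur:description="exit_status">0</ur:Resource>
<ur:Resource ur:description="infrastructure">unicore</ur:Resource>
<ur:Resource ur:description="vo">vo.plgrid.pl</ur:Resource>
<ur:Host ur:description="CPUS=1;SLOTS=0">wn1080</ur:Host>
</ur:Usage>"""


def read_extensions(usage):
    """Group ur:Resource values by their ur:description attribute."""
    extensions = {}
    for res in usage.findall("ur:Resource", NS):
        # The attribute carries the ur: prefix, so look it up by qualified name.
        key = res.get("{%s}description" % UR_NS)
        extensions.setdefault(key, []).append(res.text)
    return extensions


usage = ET.fromstring(RECORD)
extensions = read_extensions(usage)
status = usage.findtext("ur:Status", namespaces=NS)
local_job_id = usage.findtext("ur:JobIdentity/ur:LocalJobId", namespaces=NS)
```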
--
KrzysztofBenedyczak - 21-Feb-2011
Topic revision: r2 - 2011-03-15