Accounting record format used in UNICORE

The OGF UR (Usage Record) format is used as a base. The following list specifies which fields are used, with additional notes where needed. RMS stands for Resource Management System (Torque, SGE, ...). A field listed without a comment is used in the natural way defined by the spec.

Base properties
  1. RecordIdentity generated, unused
  2. GlobalJobId UNICORE ActionUUID
  3. LocalJobId RMS job id
  4. ProcessId unused
  5. LocalUserId unix login
  6. GlobalUserName user DN
  7. JobName UNICORE submitted job name
  8. Charge unused
  9. Status RMS job status. In case of Torque: D -> aborted; E -> completed; Q -> queued; S -> started. Other statuses are unused.
  10. WallDuration
  11. CpuDuration
  12. EndTime RMS job end
  13. StartTime RMS job start
  14. MachineName RMS exec hosts. WARNING: we use it in a schema-invalid way: it occurs more than once for non-serial jobs. Each occurrence denotes an execution node hostname and CPU: hostname/CPUn
  15. Host unused
  16. SubmitHost unused
  17. Queue
  18. ProjectName used to carry information about computing grants, if available.
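
The Torque status mapping in item 9 can be sketched as a simple lookup. This is a minimal illustration, not the UNICORE implementation; the names `TORQUE_STATUS_TO_UR` and `ur_status` are hypothetical.

```python
# Mapping of Torque status codes to UR Status values, as listed in item 9.
TORQUE_STATUS_TO_UR = {
    "D": "aborted",
    "E": "completed",
    "Q": "queued",
    "S": "started",
}


def ur_status(torque_status):
    """Return the UR Status for a Torque status code, or None if the code is unused."""
    return TORQUE_STATUS_TO_UR.get(torque_status)
```

Codes that are not in the table yield `None`, which matches the note that other statuses are unused.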

Differentiated Properties

  1. Network, Disk unused
  2. Memory used twice, to report vmem and mem from the RMS
  3. Swap unused
  4. NodeCount, Processors unused, but we can easily add them (this information can be extracted from the MachineName values)
  5. TimeDuration unused
  6. TimeInstant from RMS: qtime, ctime, etime (in case of Torque)
  7. ServiceLevel unused
  8. Extensions
    1. Resource (desc = batch_server) Carries the name of the batch system server host (or of the batch system submission host; this is controllable by the admin).
    2. Resource (desc = exit_status) Job exit status.
    3. Resource (desc = vo) Virtual organisation name.
    4. Resource (desc = attributes) user's SAML attributes
    5. Resource (desc = group) Local user unix group.
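
Item 4 notes that NodeCount and Processors could be derived from the MachineName values. A minimal sketch of that derivation follows, assuming every value has the hostname/CPUn form described for MachineName; the function name `slots_per_node` is hypothetical.

```python
from collections import Counter


def slots_per_node(machine_names):
    """Count execution slots per host from 'hostname/CPUn' MachineName values."""
    counts = Counter()
    for value in machine_names:
        # Split at the last '/', so the CPU index is dropped.
        host, _, _cpu = value.rpartition("/")
        # A value without '/' is treated as a bare hostname (assumption).
        counts[host or value] += 1
    return counts


# MachineName values as they appear in a record for a four-slot job.
names = [
    "local-plgrid-126/3",
    "local-plgrid-126/2",
    "local-plgrid-126/1",
    "local-plgrid-126/0",
]
per_node = slots_per_node(names)
node_count = len(per_node)            # candidate value for NodeCount
processors = sum(per_node.values())   # candidate value for Processors
```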

Comments

  1. According to the spec, MachineName should be used only once, so our usage is incorrect. Host should be used for our purpose of enumerating all execution slots. However, Host allows only a hostname value, so there is no way to mark the CPUs used on the host. In fact, the cumulative number of CPUs per execution node would be sufficient. This should be resolved, e.g. by extending the Host element with a CPUs attribute. If this is not done, we will modify our implementation to use Host and encode the number of CPUs in its description.
  2. We use batch_server as an extension. This is rather wrong: we should use MachineName for this purpose. But this requires a solution to the point above.
  3. Exit status, VO, group - should be added to the base spec.
  4. In the case of VO, it can easily happen in UNICORE that the system is only aware of the fact that the user is a member of some VOs. In such a case there are multiple values of the VO element.
  5. Attributes are rather out of scope: they are used mostly for internal purposes and are not presented anywhere.
  6. We don't use SubmitHost, although we would like to. However, it is not perfectly clear what to put there: in a grid environment this might be the client's submission machine, the grid job execution service, a grid intermediary service such as a workflow service or broker, or the machine from which the job was submitted to the RMS. Therefore we would like to have a clear description of this element (probably saying that this is the RMS submission machine, which is of little interest to us) and another element for specifying the grid service(s) that submitted the job.
  7. General note: GridInfrastructure (ARC, gLite, UNICORE, Globus) is another field that we think would be useful.
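
The fallback mentioned in comment 1 (use Host and encode the number of CPUs in its description) could look roughly like the sketch below. The `CPUs=<n>` encoding and the function name `hosts_from_machine_names` are assumptions made for illustration; this page does not define the actual encoding.

```python
import xml.etree.ElementTree as ET
from collections import Counter

UR_NS = "http://www.gridforum.org/2003/ur-wg"


def hosts_from_machine_names(machine_names):
    """Build one ur:Host element per execution node, with the CPU count in the description."""
    # Strip the '/CPUn' suffix and count the slots per hostname.
    counts = Counter(name.rpartition("/")[0] or name for name in machine_names)
    hosts = []
    for hostname, cpus in sorted(counts.items()):
        host = ET.Element("{%s}Host" % UR_NS)
        # Hypothetical encoding of the cumulative CPU count.
        host.set("{%s}description" % UR_NS, "CPUs=%d" % cpus)
        host.text = hostname
        hosts.append(host)
    return hosts
```

Reporting only the cumulative count per node keeps each hostname in a single element, which is the property comment 1 asks for.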

Example

<ur:Usage xmlns:ur="http://www.gridforum.org/2003/ur-wg"> 
  <ur:RecordIdentity ur:createTime="2011-02-21T10:09:17.027+01:00" ur:recordId="ef4ff33f-7053-484d-8fd3-0d8cc019a4ea"/> 
  <ur:JobIdentity> 
    <ur:GlobalJobId>073c26d8-2a30-4ef8-af1b-2a63c06cc344</ur:GlobalJobId> 
    <ur:LocalJobId>18780</ur:LocalJobId> 
  </ur:JobIdentity> 
  <ur:UserIdentity> 
    <ur:LocalUserId>monitor</ur:LocalUserId> 
    <ur:GlobalUserName>CN=Monitoring AGENT, OU=PL-Grid Internal, O=ICM, C=PL</ur:GlobalUserName> 
  </ur:UserIdentity> 
  <ur:JobName>UNICORE_Job</ur:JobName> 
  <ur:Status>completed</ur:Status> 
  <ur:TimeInstant ur:type="etime">2011-02-21T10:09:17+01:00</ur:TimeInstant> 
  <ur:TimeInstant ur:type="qtime">2011-02-21T10:09:17+01:00</ur:TimeInstant> 
  <ur:TimeInstant ur:type="ctime">2011-02-21T10:09:17+01:00</ur:TimeInstant> 
  <ur:Memory ur:storageUnit="KB" ur:type="shared">0</ur:Memory> 
  <ur:Memory ur:storageUnit="KB" ur:type="physical">0</ur:Memory> 
  <ur:Resource ur:description="attributes">d24:urn:SAML:voprofile:groupl25:/vo.plgrid.pl/ICM_testbed32:/vo.plgrid.pl/ICM_testbed/agentsee</ur:Resource> 
  <ur:Resource ur:description="group">users</ur:Resource> 
  <ur:Resource ur:description="exit_status">0</ur:Resource> 
  <ur:Queue>batch</ur:Queue> 
  <ur:StartTime>2011-02-21T10:09:17+01:00</ur:StartTime> 
  <ur:Resource ur:description="vo">vo.plgrid.pl</ur:Resource> 
  <ur:EndTime>2011-02-21T10:09:17+01:00</ur:EndTime> 
  <ur:WallDuration>PT0S</ur:WallDuration> 
  <ur:CpuDuration>PT0S</ur:CpuDuration> 
  <ur:Resource ur:description="batch_server">plgrid-113.icm.edu.pl</ur:Resource> 
  <ur:MachineName>local-plgrid-126/3</ur:MachineName> 
  <ur:MachineName>local-plgrid-126/2</ur:MachineName> 
  <ur:MachineName>local-plgrid-126/1</ur:MachineName> 
  <ur:MachineName>local-plgrid-126/0</ur:MachineName> 
</ur:Usage>
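
A consumer could read such a record along the following lines. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `summarize` and the shortened `SAMPLE` record are illustrative only. Resource values are collected in lists because some descriptions, such as the VO, may occur more than once.

```python
import xml.etree.ElementTree as ET

UR_NS = "http://www.gridforum.org/2003/ur-wg"
NS = {"ur": UR_NS}

# Shortened copy of the example record, for illustration.
SAMPLE = """<ur:Usage xmlns:ur="http://www.gridforum.org/2003/ur-wg">
  <ur:JobIdentity>
    <ur:GlobalJobId>073c26d8-2a30-4ef8-af1b-2a63c06cc344</ur:GlobalJobId>
    <ur:LocalJobId>18780</ur:LocalJobId>
  </ur:JobIdentity>
  <ur:Status>completed</ur:Status>
  <ur:Resource ur:description="group">users</ur:Resource>
  <ur:Resource ur:description="exit_status">0</ur:Resource>
  <ur:Queue>batch</ur:Queue>
  <ur:Resource ur:description="vo">vo.plgrid.pl</ur:Resource>
  <ur:MachineName>local-plgrid-126/1</ur:MachineName>
  <ur:MachineName>local-plgrid-126/0</ur:MachineName>
</ur:Usage>"""


def summarize(record_xml):
    """Extract a few fields from a UR record into a plain dictionary."""
    root = ET.fromstring(record_xml)

    def text(path):
        node = root.find(path, NS)
        return node.text if node is not None else None

    # Attributes are namespace-qualified, so the key is '{namespace}description'.
    resources = {}
    for res in root.findall("ur:Resource", NS):
        key = res.get("{%s}description" % UR_NS)
        resources.setdefault(key, []).append(res.text)

    return {
        "global_job_id": text("ur:JobIdentity/ur:GlobalJobId"),
        "local_job_id": text("ur:JobIdentity/ur:LocalJobId"),
        "status": text("ur:Status"),
        "queue": text("ur:Queue"),
        "machine_names": [m.text for m in root.findall("ur:MachineName", NS)],
        "resources": resources,
    }


info = summarize(SAMPLE)
```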

-- KrzysztofBenedyczak - 21-Feb-2011

Topic revision: r1 - 2011-02-21 - unknown
 