Environment variables for multi-core jobs - Job to Machine channel

DRAFT 0.9 (31/01/2014)

Abstract

This document provides a proposal for the definition of a communication channel from the Job to the Machine. The objective of this communication channel is to provide the machine owner enough information to pick "the best job" to vacate when needed.

The specification is optimized with the pilot use case in mind, but checkpointable user jobs may be a good fit as well.

Introduction

The proposed schema builds on the work done for the Machine to Job communication channel, but targets the opposite data flow.

The proposed attributes cover the information that multi-job pilots own as part of their job scheduling activity. The figure below gives a schematic view of this information; for more details, refer to the CHEP paper (published paper, talk).

[Figure: multi_job_info.png]

Most pilot jobs will typically start more jobs, if possible, thus extending any job termination estimates. There are, however, times when a pilot is ready to be retired and does not expect to fetch more user jobs. We provide an attribute to communicate this to the resource owner.

In addition, we want to allow a user to express a preference for one job versus any other job, in the form of a priority number.

Definitions

Environment variables

For each job, one environment variable has to be set, with the following name:

| Variable | Contents | Comments |
| JOBSTATUS | Path to a directory | Job specific information |

This environment variable is the base interface for the user payload. It must be set inside the job environment.

Directories

The directory to which the environment variable points contains job specific information. Each file name is a key, and the contents of that file are the corresponding value.
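The file-per-key layout described above can be sketched in a few lines of Python. This is only an illustration, not part of the proposal: the helper names `status_dir_from_env`, `write_value` and `read_value` are hypothetical, and the trailing newline written after each value is an assumption (the text does not say whether one is expected, so the reader strips whitespace).

```python
import os
from pathlib import Path


def status_dir_from_env() -> Path:
    """Return the directory that $JOBSTATUS points to."""
    return Path(os.environ["JOBSTATUS"])


def write_value(status_dir: Path, key: str, value: object) -> None:
    """Store one value: the file name is the key, the file content is the value."""
    # Assumption: values are written as text followed by a newline.
    (status_dir / key).write_text(f"{value}\n")


def read_value(status_dir: Path, key: str) -> str:
    """Return the raw string stored under `key`, without surrounding whitespace."""
    return (status_dir / key).read_text().strip()
```

A job would call, for example, `write_value(status_dir_from_env(), "used_CPU", 4)`. The sketch deliberately ignores locking, which the notes on the job specific files require.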

Use cases

The use cases to be covered are:

| Identifier | Actors | Pre-conditions | Scenario | Outcome | (Optional) What to avoid |
| 9. | site | site batch system | The site wants to know how much longer the job will be running | | |
| 10. | site | site batch system | The site wants to know the amount of draining waste, if the job was asked to drain | | |
| 11. | site | site batch system | The site wants to know the amount of waste, if the job was killed | | |
| 12. | site | site batch system | The site wants to pick the job that is the least critical for the user | | |

Requirements

  • The proposed schema must be unique and leave no room for interpretation of the values provided.
  • For this reason, it uses only basic information that is well defined across sites.
  • The information is expected to be dynamic.
  • Files will be owned by the user and reside in a /tmp-like area.

List of requirements

Job specific information is:
  • found in the directory pointed to by $JOBSTATUS
  • owned by the user who is executing the original job. In the case of pilots this would be the pilot user at the site.
  • created by the job, and updated several times during its lifetime

| Identifier | File Name (key) | Originating use cases | Value | (Optional) Comments |
| 3.1 | used_CPU | 10,11,12 | Number of cores used by the job. | Must be locked before any of the other files in this section are either read or written to. Must be less than or equal to allocated_CPU. |
| 3.2 | last_job_start | 11 | UNIX time (integer) | |
| 3.3 | first_exp_job_end | 10 | UNIX time (integer) | Good faith estimate |
| 3.4 | last_exp_job_end | 9,10,12 | UNIX time (integer) | Good faith estimate |
| 3.5 | last_max_job_end | 9,10,12 | UNIX time (integer) | Enforced limit |
| 3.6 | add_uncom_time | 9,11 | CPU seconds (integer) | |
| 3.7 | add_final_exp_waste | 10 | CPU seconds (integer) | Good faith estimate |
| 3.8 | can_postpone_last_job | 9,10 | string, either "True" or "False" | If the job decides to revert from "False" to "True", it should not update any of the other values for a significant amount of time. |
| 3.9 | priority_factor | 12 | Integer, higher is better | The semantics are user specific, and should not be used to compare jobs of different users. |
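As an illustration of the value formats listed in the table, a reader could convert the raw file contents into typed values as sketched below. The function name `parse_values` and the `allocated_cpu` parameter are hypothetical; in particular, how the reader obtains allocated_CPU is not defined in this document and is simply passed in here.

```python
# Keys whose value is an integer, according to the table above.
INTEGER_KEYS = (
    "used_CPU", "last_job_start", "first_exp_job_end", "last_exp_job_end",
    "last_max_job_end", "add_uncom_time", "add_final_exp_waste",
    "priority_factor",
)


def parse_values(raw: dict, allocated_cpu: int) -> dict:
    """Convert {file name: file content} strings into typed values."""
    parsed = {}
    for key, text in raw.items():
        text = text.strip()
        if key in INTEGER_KEYS:
            parsed[key] = int(text)
        elif key == "can_postpone_last_job":
            # Only the two literal strings from the table are accepted.
            if text not in ("True", "False"):
                raise ValueError(f"{key} must be True or False, got {text!r}")
            parsed[key] = text == "True"
        else:
            raise ValueError(f"unknown key: {key}")
    # used_CPU must be less than or equal to allocated_CPU.
    if "used_CPU" in parsed and parsed["used_CPU"] > allocated_cpu:
        raise ValueError("used_CPU must not exceed allocated_CPU")
    return parsed
```

Rejecting unknown keys is a design choice of this sketch; a more tolerant reader could skip them instead.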

Notes:

Most of the above values are meant to be used together, so both readers and writers are requested to lock the used_CPU file before either reading or writing any of the files.
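The locking convention could look like the following sketch. The proposal does not name a locking mechanism, so the use of advisory `flock` locks (Unix only) is an assumption, as are the helper names `locked`, `update` and `snapshot`.

```python
import fcntl
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def locked(status_dir: Path, exclusive: bool):
    """Hold a lock on used_CPU while the other files are read or written."""
    # Writers open with "a+" so the file is created if missing and not
    # truncated; readers only need read access to take a shared lock.
    mode = "a+" if exclusive else "r"
    with open(status_dir / "used_CPU", mode) as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        try:
            yield lock_file
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def update(status_dir: Path, values: dict) -> None:
    """Writer side: update several keys under one exclusive lock."""
    with locked(status_dir, exclusive=True) as lock_file:
        for key, value in values.items():
            if key == "used_CPU":
                # Rewrite the locked file through the handle that holds the lock.
                lock_file.truncate(0)
                lock_file.write(str(value))
                lock_file.flush()
            else:
                (status_dir / key).write_text(str(value))


def snapshot(status_dir: Path) -> dict:
    """Reader side: read all keys consistently under a shared lock."""
    with locked(status_dir, exclusive=False):
        return {p.name: p.read_text().strip()
                for p in status_dir.iterdir() if p.is_file()}
```

Because the lock is advisory, it only protects the values if every reader and writer follows the same convention, which is what the note above requests.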

Conversion formulas

While the above values may be useful on their own, they are usually combined to produce the numbers used to make decisions.
Below are the formulas used to satisfy the originating use cases.

| Originating use case | Symbolic name | Formula | (Optional) Comments |
| 9 | remaining_time | $last_exp_job_end-$now or $last_max_job_end-$now | There are two possible formulas, depending on the level of confidence the user has in the estimate |
| 10 | draining_waste | ($allocated_CPU-$used_CPU)*($first_exp_job_end-$now)+$add_final_exp_waste | |
| 11 | kill_waste | $add_uncom_time+$used_CPU*($now-$last_job_start) | |
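The three formulas translate directly into code. The sketch below assumes the values have already been read and converted to integers; the function names mirror the symbolic names in the table, while the `trust_estimate` flag and the `allocated_cpu` argument are hypothetical conveniences of this example.

```python
def remaining_time(v: dict, now: int, trust_estimate: bool = True) -> int:
    """Use case 9: seconds until the job is expected (or forced) to end."""
    end = v["last_exp_job_end"] if trust_estimate else v["last_max_job_end"]
    return end - now


def draining_waste(v: dict, allocated_cpu: int, now: int) -> int:
    """Use case 10: CPU seconds wasted if the job is asked to drain."""
    idle_cores = allocated_cpu - v["used_CPU"]
    return idle_cores * (v["first_exp_job_end"] - now) + v["add_final_exp_waste"]


def kill_waste(v: dict, now: int) -> int:
    """Use case 11: CPU seconds wasted if the job is killed right now."""
    return v["add_uncom_time"] + v["used_CPU"] * (now - v["last_job_start"])
```

For instance, with now = 1000, allocated_CPU = 8, used_CPU = 6, last_job_start = 400, first_exp_job_end = 1600, add_uncom_time = 300 and add_final_exp_waste = 1200, draining_waste is 2*600+1200 = 2400 CPU seconds and kill_waste is 300+6*600 = 3900 CPU seconds.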

External Resources

TBD

Impact

Pilot frameworks and site batch system configurations will have to be modified in order to profit from the above declarations.

Recommendations

Conclusions

The new mechanism allows basic information to be propagated from the user payload to the machine owner. The interface is independent of the pilot framework and/or batch system in use.

References

Igor's presentation at CHEP 2013

-- IgorSfiligoi - 31 Jan 2014

Topic revision: r6 - 2014-02-08 - IgorSfiligoi
 