Queue simulation in Condor

This recipe describes how to mimic PBS-style queues using Condor as the batch system. There are several approaches to simulating this behaviour:

A) Deploying multiple schedulers

A Condor CE is, in fact, a scheduler with a single queue. We can exploit this by setting up more than one scheduler, each on its own machine. The queue part of the CEID (<resource>:<port>/cream-<batchsystem>-<queue>) is then used to redirect jobs to the corresponding scheduler/queue. This is already implemented and works fine.

For instance, we can set up 3 schedulers (queue1, queue2, queue3), each on a machine whose hostname matches the queue name. We can then send jobs to any of them like this:

glite-ce-job-submit -r vce03.pic.es:8443/cream-condor-queue1 [...] or
glite-ce-job-submit -r vce03.pic.es:8443/cream-condor-queue2 [...] or
glite-ce-job-submit -r vce03.pic.es:8443/cream-condor-queue3
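Internally, the queue name has to be recovered from the CEID string. A minimal sketch of that extraction using plain shell parameter expansion (the variable names are hypothetical, not the actual CREAM code):

```shell
# Hypothetical sketch: pull the queue name out of a CREAM CEID
ceid="vce03.pic.es:8443/cream-condor-queue1"
batch_part=${ceid##*/}     # strip up to the last "/" -> "cream-condor-queue1"
queue=${batch_part##*-}    # strip up to the last "-" -> "queue1"
echo "$queue"
```

Note that this simple form assumes the queue name itself contains no dash.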

CREAM's condor_submit.sh script will call condor_submit, passing along the queue name as the -name argument. For instance:

condor_submit -name queue1 job.submit

That command will look for a scheduler running on host queue1 and send it the job.

What's more, we don't even have to use the machines' real hostnames. The SCHEDD_NAME configuration attribute defines the name of the scheduler. For example, we can implement queue1 on top of a machine called machine1 simply by defining

SCHEDD_NAME = queue1

and then send the jobs as

glite-ce-job-submit -r vce03.pic.es:8443/cream-condor-queue1@machine1
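The queue@machine form can be split with the same kind of parameter expansion. A small sketch (again illustrative, not the actual CREAM code) showing how the scheduler name and host could be separated:

```shell
# Hypothetical sketch: split a "queue@machine" specification
spec="queue1@machine1"
queue=${spec%@*}    # part before "@" -> "queue1"
host=${spec#*@}     # part after  "@" -> "machine1"
echo "$queue $host"
```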

This way, every scheduler/queue can implement different policies regarding its jobs, users, etc.

B) Implementing "virtual queues" using attributes (recommended)

A better way of doing this, and one more in line with the philosophy of Condor, is to use attributes. Condor attributes are the pillar of the entire batch system: they provide a powerful framework for implementing extensible, flexible and very fine-grained policies. For this method to work, we have to patch CREAM's condor_submit.sh script. In that script, as described in the previous approach, the queue section of the CEID is used as the scheduler name. Here we override that behaviour and instead attach the queue from the CEID to the job as a custom attribute. The job is then submitted to the default scheduler (usually localhost). For instance, take a job submitted with this command:

glite-ce-job-submit -r vce03.pic.es:8443/cream-condor-queue1 [...]

When condor_submit.sh is called, it generates a submit file that is forwarded to Condor. In order to let the scheduler know which queue the job came from, we insert a custom attribute into the submit file. This attribute specifies the virtual queue and allows the scheduler to apply the corresponding policies. Before we get to that, condor_submit.sh has to be modified. These are the modifications required:


# Hang around for 30 minutes (1800 seconds)
leave_in_queue = JobStatus == 4 && (CompletionDate =?= UNDEFINED || CompletionDate == 0 || ((CurrentTime - CompletionDate) < 1800))

# Add custom queue attribute
if [ ! -z "$queue" ]; then
    echo "+BatchQueue=\"$queue\"" >> $submit_file
fi

cat >> $submit_file << EOF
queue 1
EOF

# The original scheduler-selection code is disabled:
#echo $queue | grep "/" >&/dev/null
## If there is a "/" we need to split out the pool and queue
#if [ "$?" == "0" ]; then
#    pool=${queue#*/}
#    queue=${queue%/*}
#fi

#if [ ! -z "$queue" ]; then
#    if [ -z "$pool" ]; then
#        target="-name $queue"
#    else
#        target="-pool $pool -name $queue"
#    fi
#fi

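To see the effect of the patch, here is a self-contained sketch that builds a toy submit file the same way the patched script does (the job details and file name are invented for illustration):

```shell
# Sketch: generate a toy submit file carrying the custom BatchQueue attribute
queue="medium"                 # in CREAM this would come from the CEID
submit_file=$(mktemp)

cat > "$submit_file" << EOF
universe   = vanilla
executable = /bin/hostname
EOF

# Same logic as the patched condor_submit.sh
if [ ! -z "$queue" ]; then
    echo "+BatchQueue=\"$queue\"" >> "$submit_file"
fi

echo "queue 1" >> "$submit_file"
cat "$submit_file"
```

The resulting file contains the line +BatchQueue="medium", which the scheduler can later match against in its policy expressions.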

At this point, each job is labelled in its submit file with an attribute specifying its "virtual queue". Using this attribute (BatchQueue) we can, for example, enforce a walltime limit for each virtual queue: Short (2 hours), Medium (12 hours) and Long (48 hours). This is the configuration added to Condor to apply these policies:

# Queue simulation
IsLongJob   = ( TARGET.BatchQueue =?= "long"   )
IsMediumJob = ( TARGET.BatchQueue =?= "medium" )
IsShortJob  = ( ( $(IsLongJob) == FALSE ) && ( $(IsMediumJob) == FALSE ) )
LongJobWallTimeLimit   = ( 48 * 60 * 60 )
MediumJobWallTimeLimit = ( 12 * 60 * 60 )
ShortJobWallTimeLimit  = (  2 * 60 * 60 )

RemoveLongJob   = ( TARGET.RemoteWallClockTime > $(LongJobWallTimeLimit)   )
RemoveMediumJob = ( TARGET.RemoteWallClockTime > $(MediumJobWallTimeLimit) )
RemoveShortJob  = ( TARGET.RemoteWallClockTime > $(ShortJobWallTimeLimit)  )
SYSTEM_PERIODIC_REMOVE =  ( ( $(IsLongJob)   && $(RemoveLongJob)   ) \
                         || ( $(IsMediumJob) && $(RemoveMediumJob) ) \
                         || ( $(IsShortJob)  && $(RemoveShortJob)  ) )
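The removal expression boils down to one arithmetic comparison per queue. The same check can be mirrored in shell to sanity-test the limits (the job values below are invented):

```shell
# Sketch: the per-queue walltime check, mirrored in plain shell
long_limit=$((48 * 60 * 60))     # 172800 s
medium_limit=$((12 * 60 * 60))   #  43200 s
short_limit=$((2 * 60 * 60))     #   7200 s

wall=9000          # hypothetical RemoteWallClockTime of a job
queue="short"      # hypothetical BatchQueue value

case "$queue" in
    long)   limit=$long_limit ;;
    medium) limit=$medium_limit ;;
    *)      limit=$short_limit ;;   # no/unknown BatchQueue -> Short
esac

if [ "$wall" -gt "$limit" ]; then
    echo "remove"
else
    echo "keep"
fi
```

A 9000-second job in the Short queue exceeds its 7200-second limit and would therefore be removed by SYSTEM_PERIODIC_REMOVE.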

Note that jobs which do not specify a BatchQueue attribute are treated as belonging to the Short queue. Finally, one question remains unresolved in this recipe: publishing all this information to LDAP. That still has to be worked on. For any questions, send me an email.

-- PauTallada - 04-Nov-2009

Topic revision: r2 - 2009-11-06 - PauTallada