glexec/Argus Pilot Service: Home Page

  • Start Date: Tue 24 Nov 2009
  • End Date (tentative): 12 April 2010
  • Description: Pilot Service of glexec/Argus @ FZK, SWITCH, CESNET, SRCE, INFN
  • Coordinator: Antonio Retico
  • Contact e-mail:
  • Status : In Progress
  • Related meetings


Use cases

  • Experiment framework using glexec for production pilot jobs.
  • Test of the grid-wide banning feature by OSCT
  • Gathering of requirements and analysis for monitoring tools

Objective and metrics


  1. The glexec-Argus chain is demonstrated to interact correctly with the LHC Experiments' frameworks for pilot jobs
  2. Maintenance and operations of the Argus service declared supportable by the sites
  3. OSCT able to ban a user on the whole pilot infrastructure without specific intervention of the site administrators
  4. Collection of exhaustive requirements for the implementation of monitoring tools


Initial plan

| Task | Owner | Start Date | Due Date | Status |
| Set-up repositories and documentation | SA1, SA3, CNAF | 23-Nov-09 | 24-Nov-09 | Done |
| Preliminary installation (ARGUS, WN, CE) | SWITCH | 25-Nov-09 | 27-Nov-09 | Done |
| Core installations (ARGUS, WN, CE) | FZK, SRCE, CESNET, CNAF | 30-Nov-09 | 10-Dec-09 | In progress |

Constraints and milestones

  • kick-off with sites: 25-Nov (11 AM CET)
  • 1st site technically available for Experiments to test (SWITCH): 1-Dec
  • kick-off with experiments: 1-Dec (11 AM CET)
  • All sites technically available for Experiments to test: 15-Jan
  • Indicative start of Alice developments to integrate glexec: 18-Jan
  • Indicative start of CMS developments to integrate glexec: 15-Feb
  • END of activity (proposed): 31-Mar

Technical documentation

Installation Documentation

Yum repo:

Argus service

Worker Node

Computing Element

  • Repository URL : Production repository (I hope)
  • INFO :

Configuration instructions

YAIM modules are available for both the Argus service and glexec:

For further fine-tuning:

Post configuration tests

To check that Argus is deployed correctly after installation and configuration, run some basic tests with pap-admin to store, list, update and remove policies. Then use pepcli to test authorization requests and responses. Both pap-admin and pepcli are documented in the main Argus twiki.
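The basic cycle described above (store a policy, list it, query the PEP daemon, remove the policy) could be sketched roughly as follows. This is only an illustration under stated assumptions: the pap-admin subcommands (list-policies, ban, un-ban) and the pepcli options (--pepd, --certchain, --resourceid, --actionid) are assumed from the Argus command-line documentation and should be checked with --help on the installed version. RESOURCE_ID, ACTION_ID, DRY_RUN and the function names are hypothetical helpers introduced here, not part of Argus. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of a PAP/PEP smoke test. The pap-admin and pepcli subcommands and
# options below are assumptions to be checked against the installed version.

# Hypothetical placeholders: replace with identifiers used in the site policies.
RESOURCE_ID="${RESOURCE_ID:-example-resource}"
ACTION_ID="${ACTION_ID:-example-action}"

# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: argus_smoke_test <subject DN> <PEPd URL> <proxy file>
argus_smoke_test() {
    dn=$1
    pepd_url=$2
    proxy=$3
    run pap-admin list-policies              # list what is currently stored
    run pap-admin ban subject "$dn"          # store a ban policy for the DN
    run pepcli --pepd "$pepd_url" --certchain "$proxy" \
        --resourceid "$RESOURCE_ID" --actionid "$ACTION_ID"
    run pap-admin un-ban subject "$dn"       # remove the ban again
    run pap-admin list-policies
}
```

A dry run could look like `argus_smoke_test "/DC=org/DC=example/CN=Test User" https://argus.example.org:8154/authz "$X509_USER_PROXY"`, where the DN, host name, port and path are placeholders. Because of the policy refresh time discussed in the list of issues on this page, a freshly stored ban may not be reflected in the pepcli answer immediately.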

To test the glexec-Argus interaction, run something like the following from a whitelisted account on the Worker Node:

export X509_USER_PROXY=<target_proxy>
$GLITE_LOCATION/sbin/glexec /usr/bin/whoami
And verify that the returned user is the mapped one.
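The manual check above can be wrapped in a small script that compares the calling account with the account reported through glexec. This is a minimal sketch, not an official test: GLEXEC_BIN and check_glexec_mapping are hypothetical names, the /opt/glite fallback for GLITE_LOCATION is an assumption, and treating any non-zero exit status of glexec as a refusal is a simplification.

```shell
#!/bin/sh
# Sketch: confirm that glexec switches to a mapped account.
# GLEXEC_BIN is a hypothetical override; /opt/glite is an assumed fallback prefix.
GLEXEC_BIN="${GLEXEC_BIN:-${GLITE_LOCATION:-/opt/glite}/sbin/glexec}"

check_glexec_mapping() {
    # X509_USER_PROXY must already point at the target proxy.
    if [ -z "${X509_USER_PROXY:-}" ]; then
        echo "X509_USER_PROXY is not set" >&2
        return 2
    fi
    # Account name of the caller (falls back to the numeric uid).
    caller=$(id -un 2>/dev/null || id -u)
    # Any non-zero exit status from glexec is treated as a failure here.
    if ! mapped=$("$GLEXEC_BIN" /usr/bin/whoami); then
        echo "glexec failed; the request may have been denied" >&2
        return 1
    fi
    if [ -z "$mapped" ] || [ "$mapped" = "$caller" ]; then
        echo "no identity switch: still running as '$caller'" >&2
        return 1
    fi
    echo "OK: $caller -> $mapped"
}
```

After exporting X509_USER_PROXY as shown above, calling `check_glexec_mapping` prints the caller and the mapped account on success and returns a non-zero status otherwise. Whether the mapped account is the expected one for the given proxy still has to be judged against the site's mapping configuration.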

Configuration requirements for sites supporting Atlas

  • If a MyProxy server is used to pass the credentials, myproxy-logon has to be installed on the WN (it should be the default in production by now).
  • If a plain proxy is retrieved and VOMS attributes need to be added on the WN, the vomses file has to be reachable from the WN.
  • Both roles, atlas:/atlas/Role=production and atlas:/atlas/usatlas/Role=pilot, need to be enabled to submit to the queue.
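The first two requirements can be checked on a Worker Node with a few lines of shell. The sketch below is only an illustration: check_atlas_wn_prereqs and VOMSES_PATH are hypothetical names, and /etc/vomses is an assumed default location that may differ at a given site. The role configuration on the queue is not covered. A site that needs only one of the two mechanisms can ignore the other result.

```shell
#!/bin/sh
# Sketch: check WN prerequisites for the two credential-passing options.
# VOMSES_PATH is an assumed default location; adjust it to the site's layout.
VOMSES_PATH="${VOMSES_PATH:-/etc/vomses}"

check_atlas_wn_prereqs() {
    status=0
    # Needed when a MyProxy server is used to pass the credentials.
    if command -v myproxy-logon >/dev/null 2>&1; then
        echo "myproxy-logon: found"
    else
        echo "myproxy-logon: MISSING"
        status=1
    fi
    # Needed when VOMS attributes are added to a plain proxy on the WN.
    if [ -r "$VOMSES_PATH" ]; then
        echo "vomses: readable at $VOMSES_PATH"
    else
        echo "vomses: NOT readable at $VOMSES_PATH"
        status=1
    fi
    return $status
}
```

Calling `check_atlas_wn_prereqs` prints one line per item and returns a non-zero status if either item is missing.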

General documentation (user guides)

Test documentation

Summary of the load and aging tests done before the certification

  • Load tests:
    • Service host: 1x 2.33 GHz CPU, 1 GB RAM
    • client recreated - simulates what glexec would do
      • ~60 req/sec, ~160ms (limited by spawning processes)
    • client reused - simulates what CREAM/WMS would do
      • ~240 req/sec, ~120ms
    • client reused, repeat request - simulates pilot jobs
      • ~1000 req/sec, ~37.6ms
  • Aging tests:
    • Test operation over several days with several million requests
    • Memory usage: stable

Pilot Layout


Argus: one virtual machine with SL5 64 bit, installed from the following repository: $basearch/ (an alternative is now available here)

CE: one lcg-CE with two WNs (SL5 64 bit), all as VMware virtual machines. VOs enabled: dteam, dech, ops, atlas (no ATLAS software installed). The WNs were installed pointing to the following repository

CE endpoint


Argus: one virtual machine with SL5 64 bit

CE: All CEs at GridKa are usable but please refer to The queue must be "pps". All software installations are available on the PPS WNs. Enabled VOs: alice, atlas, cms, dteam, lhcb and ops.

WNs: The PPS cluster has been extended to 300 cores

For alice a separate VOBox is available:


Argus: one virtual machine with SL5 64 bit

CE: one CREAM CE with two virtual WNs (SL5 64 bit). VOs enabled: dteam, infngrid, ops.

CE endpoint


Feedback from the experiments

Comments and issues from operations


The instructions for manually installing the Argus-compatible WNs are wrong. It is recommended to use YAIM instead.


Reduction of the policy refresh time from 4 hours to 15 minutes requested: Angela opened the bug

List of issues

| Issue | Reported by | Bug(s) | Status | Open/Closed |
| (Affects glexec on the WMS-->GLEXEC-->CREAM chain): A wrongly configured GLITE_LOCATION sometimes makes it impossible to discover the glexec executable | CERN | BUG:62810, fixed with patch 3760 | (with provider) | open |
| The default policy refresh time of 4 hours seems too long | KIT | BUG:62281 | To be discussed with JSPG | open |
| PEPd should require client-cert authentication support for connecting pep clients | CNAF-T1 | BUG:60041, fixed with patch 3536 | In certification | open |

Recommendation for Deployment in production

Final assessment

Tasks and actions:

Actions for SA1 are tracked via the Sa1 Deployment task tracker


Tasks for other participants are tracked here


Assigned to Due date Description State Closed Notify  
GiuseppeMIsurelli 2010-02-19 Provide a report describing the issues being faced by CNAF for the installation of glexec on the WNs.

INFN-T1 is experiencing stability problems with GPFS when it interacts with the WN-on-demand system adopted locally at the resource center.
Since they decided to provide virtual WNs for the pilot, the issue consequently affects the deployment of the glexec WN component at the site.

2010-02-19 MaartenLitmaath

Assigned to Due date Description State Closed Notify  
GianniPucciani 2010-02-19 Provide functional specification of glexec tests being implemented at SRCE
ChadLaJoie 2010-02-03 Provide instructions on how to preserve local policies during the upgrade of the Argus server to a newer version both in an e-mail to the sites and in the PATCH:3536

this was done on the 2nd of February
This is done now at

2010-03-02 MaartenLitmaath


Assigned to Due date Description State Closed Notify  
AngelaPoschlad 2010-02-03 Open a bug to request the reduction of the policy refresh time from 4 hours to 15 minutes

3-2-10: Angela opened the bug

2010-02-05 MaartenLitmaath
AntonioRetico 2009-12-18 Provide the timeline for an installation of a reasonable scale (>100WNs) to be available to Atlas in order to test glexec in production

Update 18-Dec (Andrea Ceccanti):
Converging on CNAF offering the first large-scale installation. They are currently working on the installation at the T1 and hope to finish before Christmas, or alternatively by the 6th of January, in order to be ready by the 15th. If the preliminary tests now under way succeed, they are OK to use Argus 1.1 when it is in status "Ready for Certification".

2009-12-18 AntonioRetico
Main.SWITCH, Main.NIKHEF, Main.SA3 2009-12-01 Finalise the YAIM configuration for Argus-compatible GLEXEC_WN 2009-12-18 AntonioRetico
GianniPucciani 2009-12-04 enumerate available deployment scenarios and see whether new developments have to be requested (or re-negotiations are needed with the sites)

Update 26-Nov:
After discussion with JRA1 and SA3 it was proposed to extend support for the clients on SL4. A new patch has been requested from the developers.

Update 1-Dec:
During the last meeting Gianni was put in charge of opening the bug with the change request.

Update 18-Dec (Gianni) :
All new developments are now tracked by bugs

2009-12-08 MaartenLitmaath
GianniPucciani 2009-12-01 provide reference for basic testing for site administrators in the twiki

Update 1-Dec :
info now available in #Post_configuration_tests

2009-12-03 MaartenLitmaath
AngelaPoschlad 2009-12-01 reply to proposed timelines for FZK

Angela confirmed that starting on the 30th is fine for her

2009-11-26 MaartenLitmaath


30-Mar-2010 : Check point (PPIslandFollowUp2010x03x30):

  • Argus 1.1 server part in "Ready for Rollout"

17-Mar-2010 : Check point (PPIslandFollowUp2010x03x17):

  • Installation at CNAF T1 finished
  • Decision to use the production repository for future operations
  • Testing of the OSCT global banning list approved.
  • Pilot end date shifts to the 16th of April
  • Further developments and tests to be followed within the GDB

16-Feb-2010 : Check point (PPIslandFollowUp2010x02x16):

  • Installation at KIT/FZK scaled-up to 300 cores

2-Feb-2010 : Check point (PPIslandFollowUp2010x02x02):

  • All sites will soon be requested to upgrade to the new version of Argus (PATCH:3536). CNAF-T1 will be the first; the others will follow.
  • All sites are requested to apply the workaround in BUG:62206 so that the Argus servers start being published in the information system.
  • Integration work in progress for Alice
  • Integration work confirmed to start in mid-February for CMS

18-Dec-2009 : Check point (PPIslandFollowUp2009x12x18):

  • Testing of Argus version 1.1 in progress at CNAF
  • Installation in progress at all sites. Platform expected to be available by the 15th of Dec

1-Dec-2009 : First installation at SWITCH available for testing

1-Dec-2009 : kick-off with the experiments (PPIslandKickOff2009x12x01)

25-Nov-2009 : kick-off with sites (PPIslandKickOff2009x11x25)

24-Nov-2009 : Pilot Home page created

Topic revision: r35 - 2016-07-05 - MaartenLitmaath