PPS Pilot Follow-up Meeting Minutes Tue 09 Jun 2009

  • Date: Tue 09 Jun 2009
  • Agenda: 60068
  • Description: Pilot of glexec/SCAS: check-point
  • Chair: Antonio Retico
  • Home: PpsPilotSCAS

Attendance

  • PPS: Antonio Retico
  • SA3/Certification: Andy Elwell
  • JRA1/Development: Oscar Koeroo; Mischa Salle
  • Atlas: Jose Caballero, Maxim Potekhine
  • LHCb: Absent
  • IN2P3: Absent
  • Lancaster: Peter Love
  • FZK: Angela Poschlad

Review of action items (tasks)

SA1/SA3 tasks

Status of the subtasks of TASK:8986 (see them in the PPS tracker).

Covered site by site (tasks to be updated).

Other tasks


Status and results of the pilot service (by VOs and sites)

Atlas

Jose: at LANCASTER glexec is still rejecting the regenerated proxy when it contains a VOMS attribute that has expired in the meantime.

Oscar: We are debugging the site. We suspect that the problem is not in glexec but in the SCAS framework, the VOMS proxy or the VOMS API. The puzzling thing is that the installation is working fine at FZK and so far we have not been able to reproduce the issue. The expected behaviour is that glexec fails in a first run, when the expired VOMS attribute is detected, and then succeeds in a second run after the attribute is refreshed. Being able to look at Jose's script was very useful for us to understand what exactly was failing.
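The expected fail-then-succeed sequence described by Oscar can be illustrated with a minimal sketch. This is not the logic of glexec or of Jose's script: `run_payload` and `refresh_attribute` are hypothetical callables standing in for "run the command under glexec" and "renew the VOMS attribute", and the limit of two attempts is an assumption taken from the description above.

```python
def run_with_refresh(run_payload, refresh_attribute, max_attempts=2):
    """Run a payload, refreshing the credential after a failed attempt.

    run_payload: hypothetical callable, returns True on success.
    refresh_attribute: hypothetical callable that renews the credential.
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if run_payload():
            return attempt
        # Refresh only if another attempt will follow.
        if attempt < max_attempts:
            refresh_attribute()
    raise RuntimeError("payload still failing after %d attempts" % max_attempts)
```

With a credential that starts out expired, the first attempt fails, the refresh is triggered, and the second attempt succeeds. At LANCASTER the second attempt is what currently still fails.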

Peter increased the log level and is ready for new tests.

LHCb

Update received off-line: no progress, because STEP09 has priority. The testing at IN2P3 and LANCASTER, meant to start on the 2nd of June, was therefore not done. LANCASTER is ready to receive tests from LHCb.

Lancaster

See Atlas report

IN2P3

Absent

FZK

Installation unchanged. No user activity to report.

Status and results of the development (by developers)

Antonio: a script was developed last week that enables the end user to wrap/unwrap the environment variables. The plan is to deploy the current version over the existing installation and let the users try it out. This will not, however, hold up the release to production of the version of glexec and SCAS currently under certification (PATCH:2973). An initial version of the script was forwarded for information to LHCb (Roberto Santinelli) and Atlas (Jose). There are now indications, also from the WLCG management, that this script should be deployed as soon as possible and that the framework developers should be encouraged to test it.

Oscar: a new version of the scripts was produced. It is written in Perl for portability reasons. The script is now packaged and inserted into the ETICS build system. We are testing it against a large number of different shells and environments, and it seems to be working nicely. There are still some tests pending, but we expect to be able to produce a patch by tomorrow (10th of June).
The package will contain:

  • 1 script to wrap the environment (to be called by the framework)
  • 1 script to unwrap the environment (to be called by the payload job)
  • 1 script that wraps the two calls and allows a command to be run under glexec with the environment managed transparently
  • 1 shell script with usage examples.
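
The wrap/unwrap idea behind the first two scripts can be sketched as follows. This is only an illustration under stated assumptions: the minutes do not describe the encoding used by the actual Perl scripts, so the JSON-plus-base64 format and the function names `wrap_env` and `unwrap_env` are hypothetical. The point is that the whole environment is packed into one ASCII-safe string that can be carried across the glexec call and restored on the other side.

```python
import base64
import json
import os


def wrap_env(env=None):
    """Pack an environment mapping into a single ASCII-safe string.

    Hypothetical format (JSON, then base64); not the format of the
    real scripts. Defaults to the current process environment.
    """
    env = dict(os.environ if env is None else env)
    payload = json.dumps(env, sort_keys=True).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


def unwrap_env(blob):
    """Inverse of wrap_env: decode the string back into a dict."""
    payload = base64.b64decode(blob.encode("ascii"))
    return json.loads(payload.decode("utf-8"))
```

In this sketch the framework would call `wrap_env()` before invoking glexec and the payload job would call `unwrap_env()` on the received string, which mirrors the division of roles between the two scripts listed above.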

Antonio: as soon as the patch is available we will promptly update our repositories, ask the sites to upgrade and notify the users so that they can test.

Antonio asked for clarification about a number of bugs still pending against glexec, LCAS and LCMAPS, some of which are quite old and possibly already solved. The list is:

A bug, BUG:51480, was opened today to mirror a glexec vulnerability. The developers explained the vulnerability, and it was found not to affect the version of glexec under test.

Status of the certification (SA3, certification)

Andy: PATCH:2973 is close to certification. The last two comments are meant to clarify an issue seen with CREAM. The CREAM developers are informed and working on it.

Oscar: I am not sure that all the details are clear to the CREAM developers, because the history of the patch is missing information that was exchanged in a mail thread.

Andy will follow up.

Open Issues (by VOs, sites, deployment teams)

Three main open issues:

  • deployment at sites running WNs on a user file system (e.g. IN2P3)
  • loss of the environment (solution being proposed)
  • issue with expired VOMS attribute at LANCASTER still to be understood

Recommendations for release and deployment

The version of glexec and SCAS currently under certification (PATCH:2973) should be released to production as soon as it is certified, provided that the issue currently affecting LANCASTER is understood and shown to have limited impact.

Decision about termination/extension of the pilot

We will have another check-point on the 23rd of June at 16:00 CEST.

AOB


Topic revision: r3 - 2009-06-10 - AntonioRetico
 