PPS Pilot Follow-up Meeting Minutes Wed 01 Jul 2009

  • Date: Wed 01 Jul 2009
  • Agenda: 62783
  • Description: Pilot of glexec/SCAS: check-point
  • Chair: Antonio Retico
  • Home: PpsPilotSCAS

Attendance

  • PPS: Antonio Retico
  • SA3/Certification: Absent
  • JRA1/Development: Oscar Koeroo; Mischa Salle
  • Atlas: Jose Caballero
  • LHCb: Apologies
  • IN2P3: Absent
  • Lancaster: Apologies
  • FZK: Apologies

Review of action items (tasks)

SA1/SA3 tasks

Status of the subtasks of TASK:8986 (see them in the PPS tracker).

Not covered.

Other tasks


All open actions at PpsPilotSCAS#Tasks_and_actions were reviewed and closed (see the individual comments for the updates).

Status and results of the pilot service (by VOs and sites)

Atlas

Jose: I tried Lancaster. Surprisingly, the mapping (which was correct before) had disappeared. The site has been informed and is following up.

He would be happy to send more jobs to Nikhef. He asked whether the myproxy client is available on the WNs.

Antonio pointed out that the myproxy client was made available in production with one of the latest releases (gLite 3.1 Update47) and is now likely being rolled out gradually at the sites.

Oscar and Mischa will verify the installation and, if needed, gently ask for it to be installed.

LHCb

Report on the activity at Lancaster, sent to the mailing list by Roberto Santinelli:



-----Original Message-----
From: Roberto Santinelli
Sent: Wednesday, July 01, 2009 9:20 AM
To: egee-pps-pilot-scas (SCAS Pilot Service)
Cc: Ricardo Graciani Diaz; Philippe Charpentier; Andrei Tsaregorodtsev
Subject: First summary of LHCb tests on gExec

Dear Angela and Peter, thanks again for having managed to have this first round of "slightly more than" trivial tests from LHCb passing (both at GridKA and Lancaster).

My impressions.

I think that a first message that has to pass through is that it is not so immediate and obvious to configure gLExec/SCAS for a given VO on a site; this is true even if the site had already configured it well for another VO. I'm sure that this becomes even less immediate if special customizations are required too. We had to interact several times (at each site) before getting it working.

A second observation that I am tempted to say is that the new piece of m/w from Oscar works. Nonetheless I have not the full evidence of that. I noticed indeed that for both GridKA and Lancaster (while it was not the case at Lyon) there was not really the need to invoke it. Non built-in commands like voms-proxy-info were indeed available in the payload shell irrespectively of this script.

I would now pass the ball to Ricardo for a more exhaustive test through the DIRAC development system in order to check the integration and the effective use case for LHCb. He will require to modify slightly the pilot wrapper in order to incorporate this script as per instruction available at https://www.nikhef.nl/pub/projects/grid/gridwiki/index.php/GLExec_Environment_Wrap_and_Unwrap_scripts

Regards,

R.


Oscar commented that the developers fully intend to provide exhaustive documentation for the sites, including a recipe for a suitable configuration valid for all the VOs.

Antonio confirmed that this is one of the expected outputs of the pilot activity and reminded the sites that they had already been asked to log on the twiki page the detailed VO-specific configuration actions they performed. This would be good input for the editors of the operations manual.

Lancaster

Peter sent a report by e-mail confirming what Roberto and Jose had already reported, and informing the group that he had installed lcas-interface 1.3.11-1 at his site.

IN2P3

Not represented.

Oscar mentioned the e-mail sent to Pierre about BUG:50908, reproduced here for convenience:



-----Original Message-----
From: Oscar Koeroo [mailto:okoeroo@nikhef.nl]
Sent: Monday, June 29, 2009 8:52 PM
To: Pierre Girard
Cc: Jeff Templon; Mischa Salle; Maarten Litmaath; Antonio Retico
Subject: Progress on bug #50908

Bonjour Pierre,

I want to bring bug #50908 to your attention. We really need a discussion going to provide us some insight in the flexible setup at Lyon. We have insufficient (actually no significant) input to apply any adjustments to gLExec. We wish to help your deployment and adjust gLExec to meet your requirements.

We need your input to do this as you're our only contact point for this task.

As Lyon's deployment seems to be a blocking factor in deploying gLExec in production I hope to get a discussion going on a very short notice to work towards a solution.

cheers,

Oscar


Oscar asked for a more "interactive" discussion to be established with IN2P3 in order to better understand their requirements. The developers do not like the idea that their software cannot be deployed at certain sites, as was said at the GDB, and are eager to find a solution.

Antonio agreed: limiting the interactions to the phone conferences is not effective for "hard" debugging.

The idea is to organise a focused phone conference (~30 min.) on the topic and then, if needed, to consider a face-to-face meeting as well.

Nick will send an e-mail to Pierre to that effect.

FZK

Apologies

Status and results of the development (by developers)

Oscar: We have been debugging at LANCASTER the issue of glexec rejecting the regenerated proxy when it contains a VOMS attribute that has expired in the meantime. We suspect that the problem is not in glexec but in a bug found in the LCAS framework, which causes a segfault in the VOMS API. The VOMS API should fail gracefully, so Mischa has opened BUG:52054 against it. A workaround may be possible on the client side, but the investigation is still in progress.

Antonio asked that solutions be tested at LANCASTER before the patch is released, and asked to be kept informed of the progress. Until the issue (or its possible impact in production) is better understood, the newly certified patches 2973 and 2990 will be held in PPS.

Status of the certification (SA3, certification)

PATCH:2973 and PATCH:2990 have been certified and will soon be released to PPS.

Open Issues (by VOs, sites, deployment teams)

Two main open issues:

  • deployment at sites running WNs on a user file system (e.g. IN2P3)
  • the issue with the expired VOMS attribute at LANCASTER, which is still to be understood

Recommendations for release and deployment

The version of glexec and SCAS currently under certification (PATCH:2973) should be released to production as soon as it is certified, provided that the issue currently affecting LANCASTER is understood and shown to have limited impact.

Decision about termination/extension of the pilot

We will have another check-point on the 1st of July at 16:00 CEST

AOB


Topic revision: r3 - 2009-07-02 - AntonioRetico
 