(2013-01-22, ChristophWissing)
---+!! Worker Node testing for WLCG

   * Note: write access for external collaborators can be obtained [[WLCGExternalAccounts][here]].

%TOC%

---++ Introduction

As of mid September 2012 most of the WLCG sites in EGI are still running the old gLite 3.2 WN version on their worker nodes, despite various issues:
   * The old GFAL/lcg_util code has known bugs that are only fixed in EMI/UMD releases of the WN.
   * New products like GFAL2 and features like Xrootd support and federation are not getting real exposure in the production environment.
   * Developers who implemented new features (often on our request) may become unavailable when the EMI project has ended.
   * It becomes hard to maintain the old build infrastructure and expertise for security patches, should they be needed.
   * Even though the old code may be "good enough" for _current_ usage by ATLAS and CMS, it certainly is not for the many other VOs that most EGI sites need to support.

ALICE and LHCb are much less affected, at least for SL5, because their jobs bring essentially everything they need with them. For the SL6 port, LHCb will also benefit from corresponding test queues.

In the spring of 2012 an initiative was launched to get the EMI-1/UMD-1 WN validated by ATLAS and CMS on a set of sites that together cover all of the relevant SE types:
   * !BeStMan (as part of EOS)
   * CASTOR
   * dCache
   * DPM
   * EOS
   * !StoRM

Due to other activities with higher priorities at that time, the validation was only partially completed, allowing e.g. CNAF and a few CMS T2 sites to move their WN to the EMI-1/UMD-1 release.

We now need to restart this activity and keep testing further WN updates regularly, so that we discover early if a particular update breaks some experiment workflow. The testing would be done through !HammerCloud, and participating sites would set up small, essentially permanent test queues for the experiments they support and apply WN updates (automatically?) as they appear in the EMI-2 testing repository:
   * http://emisoft.web.cern.ch/emisoft/dist/EMI/testing/2/sl5/x86_64/

Meanwhile the EMI-2/UMD-2 WN has been released and it has a much longer lifetime than what was tested earlier, so we should concentrate on that now. The OS will be mainly SL5 for the time being.

Sites are welcome to join this effort!

---++ Participating sites and queues

| *SE type* | *VOs* | *Site* | *CE + queue name* | *WN version* | *ATLAS <br /> status* | *CMS <br /> status* | *LHCb <br /> status* |
| CASTOR | atlas, cms | RAL | lcgce03.gridpp.rl.ac.uk:8443/cream-pbs-gridTest <br /> lcgce05.gridpp.rl.ac.uk:8443/cream-pbs-gridTest <br /> lcgce07.gridpp.rl.ac.uk:8443/cream-pbs-gridTest <br /> lcgce08.gridpp.rl.ac.uk:8443/cream-pbs-gridTest <br /> lcgce09.gridpp.rl.ac.uk:8443/cream-pbs-gridTest | EMI-WN 2.0.0 | | | |
| dCache | atlas, cms | DESY | grid-cr2.desy.de:8443/cream-pbs-emi2-sl6 | EMI-WN 2.0.0 <br /> SL6 | | | |
| dCache | atlas | TRIUMF | ce1.triumf.ca:8443/cream-pbs-test | EMI-WN 2.0.0 | | | |
| DPM | atlas, cms, lhcb | Brunel | dc2-grid-65.brunel.ac.uk:8443/cream-pbs-atlas <br /> dc2-grid-65.brunel.ac.uk:8443/cream-pbs-cms <br /> dc2-grid-65.brunel.ac.uk:8443/cream-pbs-lhcb | EMI-WN 2.0.0 <br /> SL6 | | | |
| DPM | atlas, lhcb | Liverpool | hepgrid5.ph.liv.ac.uk:8443/cream-pbs-long | EMI-WN 2.0.0 | | | |
| DPM | atlas, lhcb | Manchester | vm3.tier2.hep.manchester.ac.uk:8443/cream-pbs-long | EMI-WN 2.2.0 | | | |
| DPM | atlas, cms | Oxford | t2ce02.physics.ox.ac.uk:8443/cream-pbs-shortfive <br /> t2ce02.physics.ox.ac.uk:8443/cream-pbs-mediumfive <br /> t2ce02.physics.ox.ac.uk:8443/cream-pbs-longfive | EMI-WN 2.0.0 | | | |
| !StoRM | atlas, cms | CNAF | ce03-lcg.cr.cnaf.infn.it:8443/cream-lsf-emitest | EMI-WN 2.0.0 | | | |

---++ ATLAS test details

   * [[https://twiki.cern.ch/twiki/bin/view/Atlas/MWIntegration][ATLAS test details]]

---++ CMS test details

   * [[https://twiki.cern.ch/twiki/bin/view/CMS/DFSIntegrationEMIMigration][CMS test details]]

---++ Summary of fixes to data management components

The [[http://www.eu-emi.eu/emi-2-matterhorn/updates/-/asset_publisher/9AgN/content/update-4-23-10-2012-v-2-4-0-1][latest EMI-2 update]] contains fixes for all known issues related to GFAL/lcg_util and DPM/LFC clients.

---++ Result tables (match your site here!)

| *EMI-2 SL5* | *ATLAS* | *CMS* | *LHCb* | *ALICE* |
| CASTOR | %GREEN% OK | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| dCache | %RED% *NOTE* | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| DPM | %GREEN% OK | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| EOS | %GREEN% OK | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| StoRM | %GREEN% OK | %GREEN% OK | %GREEN% OK | %GREEN% OK |

   * %RED% *NOTE:* %BLACK% ATLAS found =gsidcap= access failing for limited (WN) proxies and opened GGUS:87065 for the dCache developers.
      * *Fixed* in [[http://www.eu-emi.eu/emi-2-matterhorn/updates/-/asset_publisher/9AgN/content/update-6-26-11-2012-v-2-5-0-1#dCache_v_2_2_5_Task_37911][EMI-2 Update 6]] released *Nov 26*.
      * CMS have also seen this issue, but currently no CMS site is using that protocol.
      * For ATLAS sites where only plain =dcap= is used, the Oct release was already OK.
   * CMS workaround for DPM sites documented [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsT2DPMInstructions#DPM_CMSSW_compatibility_work_AN1][here]].

| *EMI-2 SL6* | *ATLAS* | *CMS* | *LHCb* | *ALICE* |
| CASTOR | | | %GREEN% OK | %GREEN% OK |
| dCache | | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| DPM | | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| EOS | | | %GREEN% OK | %GREEN% OK |
| StoRM | | %GREEN% OK | %GREEN% OK | %GREEN% OK |

   * See the CMS workaround for DPM sites mentioned above.

| *EMI-1 SL5* | *ATLAS* | *CMS* | *LHCb* | *ALICE* |
| CASTOR | | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| dCache | | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| DPM | %GREEN% OK | %GREEN% OK | %GREEN% OK | %GREEN% OK |
| StoRM | %GREEN% OK | %GREEN% OK | %GREEN% OK | %GREEN% OK |

   * Note: with EMI-1 an upgrade to =lcg_util 1.13.9= may still be needed.
   * See the CMS workaround for DPM sites mentioned above.
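
To match a site against the result tables on this page, the raw TWiki table markup can be read mechanically. The following is only a minimal sketch in Python and is not part of the original test setup: the function names =clean_cell= and =parse_twiki_table= are hypothetical, and the sample rows are copied from the EMI-2 SL6 table. An empty status means that no result has been recorded for that combination.

```python
import re

# Two sample rows copied from the EMI-2 SL6 result table on this page.
SAMPLE_TABLE = """\
| *EMI-2 SL6* | *ATLAS* | *CMS* | *LHCb* | *ALICE* |
| CASTOR | | | %GREEN% OK | %GREEN% OK |
| dCache | | %GREEN% OK | %GREEN% OK | %GREEN% OK |
"""


def clean_cell(cell):
    """Strip TWiki colour macros (e.g. %GREEN%) and bold markers from a cell."""
    return re.sub(r"%[A-Z]+%|\*", "", cell).strip()


def parse_twiki_table(text):
    """Return (release, {se_type: {vo: status}}) for one TWiki table.

    Hypothetical helper: assumes the first row is the header and that
    every row starts and ends with a pipe character.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("|") and line.endswith("|"):
            # Drop the outer pipes, then split on the inner ones.
            rows.append([clean_cell(c) for c in line[1:-1].split("|")])
    header, body = rows[0], rows[1:]
    release, vos = header[0], header[1:]
    results = {row[0]: dict(zip(vos, row[1:])) for row in body}
    return release, results


release, results = parse_twiki_table(SAMPLE_TABLE)
for se_type, statuses in results.items():
    for vo, status in statuses.items():
        print(release, se_type, vo, status or "no result recorded")
```

A site running, say, dCache for CMS would look up =results["dCache"]["CMS"]=. Keeping the empty string for missing entries, rather than treating it as a failure, reflects that blank cells in the tables mean "not yet tested".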
Topic revision: r34 - 2013-01-22 - ChristophWissing
Copyright © 2008-2021 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.