<font size="6"> %RED% *DRAFT* %BLACK% </font>

---+!! WLCG Operations Coordination Minutes, October 11th, 2018

%TOC{depth="4"}%

---++ Highlights

---++ Agenda

https://indico.cern.ch/event/757611/

---++ Attendance

*TO BE FIXED AFTER THE MEETING*

---++ Operations News

---++ Special topics

---++ Middleware News

   * Useful Links
      * WLCGBaselineTable
      * [[WLCGBaselineVersions#Issues_Affecting_the_WLCG_Infras][MW Issues]]
      * [[WLCGT0T1GridServices#Storage_deployment][Storage Deployment]]
   * Baselines/News

---+++ Important notice concerning the support of TLS v1.2 on WLCG

   * On Sep 21 a Globus update in the EPEL repositories made TLS =v1.2= the only version supported for security handshakes in GSI.
   * The package concerned is =globus-gssapi-gsi-13.10=.
   * Unfortunately, a significant number of grid services in WLCG were not ready for that change and started running into failures.
   * We therefore asked for the minimum supported version to be set back to TLS =v1.0= and arranged for services like the FTS either *not* to apply the Globus update yet, or to adjust =/etc/grid-security/gsi.conf=: <verbatim>
MIN_TLS_PROTOCOL=TLS1_VERSION_DEPRECATED
</verbatim>
   * Version =globus-gssapi-gsi-14.7-2= has that _temporary_ workaround and should soon become available in EPEL; it is currently present in the EPEL-testing repositories.
   * In the meantime we would like all potentially affected services to be checked and updated as needed.
   * Such services may depend on *Globus* directly, but could also be based on *Java* instead.
   * Of particular concern are *SRM*, *GridFTP*, *CE* and *Argus* services:
      * SRM services listen on port 8443 (dCache), 8444 (!StoRM) or 8446 (DPM).
      * The !CREAM CE service listens on port 8443.
      * !GridFTP services used by !CREAM, !ARC and SE head nodes listen on port 2811, while the port may be unpredictable on SE disk servers.
      * Argus listens on port 8154.
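   * As a sketch only (assuming a plain POSIX shell with =grep= and =openssl= available; this is not an official WLCG tool, and the example hostname is hypothetical), the =gsi.conf= setting above and the per-service =openssl= probe described in the next bullets could be wrapped in small helpers: <verbatim>
#!/bin/sh
# Sketch only -- assumptions: POSIX sh, grep and openssl available.

# Report whether a gsi.conf still carries the temporary TLS workaround.
# Optional argument: path to gsi.conf (defaults to the standard location).
has_tls_workaround() {
    grep -q '^MIN_TLS_PROTOCOL=TLS1_VERSION_DEPRECATED' \
        "${1:-/etc/grid-security/gsi.conf}" 2>/dev/null
}

# Classify the output of an "openssl s_client -tls1_2" probe read from
# stdin: a handshake that negotiated no cipher means the service does
# not support TLS v1.2.
classify_tls12_output() {
    if grep -q 'Cipher is (NONE)'; then
        echo FAIL
    else
        echo OK
    fi
}

# Probe one HOST:PORT pair for TLS v1.2 support (needs network access),
# e.g.: probe_tls12 srm.example.org:8444   (hypothetical host)
probe_tls12() {
    openssl s_client -tls1_2 -connect "$1" 2>&1 < /dev/null | classify_tls12_output
}
</verbatim>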
   * To test SRM, !CREAM, Argus or any other HTTPS service, please run a command like this: <verbatim>
openssl s_client -tls1_2 -connect HOST:PORT 2>&1 < /dev/null | egrep '^New|Protocol|known|Bad|refused|route'
</verbatim>
   * The following output is a sign of *failure*: <verbatim>
New, (NONE), Cipher is (NONE)
</verbatim>
   * To test a !GridFTP server, one needs a valid !VOMS or grid proxy: <verbatim>
env GLOBUS_GSSAPI_MIN_TLS_PROTOCOL=TLS1_2_VERSION uberftp HOST pwd
</verbatim>
   * If any of those commands fails due to the TLS =v1.2= requirement, please update Java/Globus on the affected service to a recent version, restart the service and try again.
   * We will need to set a deadline for TLS =v1.2= support in early 2019 and will let you know when the timeline has become clearer.
   * Please report any issues you encounter through the usual channels.

---++ Tier 0 News

   * CERN would like to ask the experiments what notice they would need before the majority of batch resources at CERN are changed to CC7, assuming any intervention would take a couple of weeks to roll out. An [[https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsMinutes180913#Specific_actions_for_experiments][action]] for the experiments has been created.

---++ Tier 1 Feedback

---++ Tier 2 Feedback

---++ Experiments Reports

---+++ ALICE

   * Normal activity levels on average
   * No major issues

---+++ ATLAS

   * Smooth Grid production over the last weeks with ~300k concurrently running grid job slots, plus additional HPC contributions with peaks of ~50k concurrently running job slots and ~10k jobs from BOINC.
   * Commissioning of the Harvester submission system via !PanDA is ongoing on the Grid. CERN and the TW, ES, IT and UK clouds have largely been migrated.
   * Heavy-ion throughput tests from CERN Point 1 to EOS, to tape and to 3 Tier-1s all worked fine.
   * The first part of the tape carousel R&D campaign at the Tier-1s, using 200-300 TB of AOD, is finished. Stage-in rates from 300 MB/s to 3 GB/s have been observed at the different sites.
---+++ CMS

   * LHC running well and CMS is collecting good data; two more weeks of p-p running.
   * Heavy-ion P5 --> EOS rate test successful on day two.
   * Finalizing software and operation model for the heavy-ion run in November.
   * Stability of the EOS FUSE mount improved, but still encountering read issues (e.g. on 2018-Oct-10), INC:1784940.
   * Two CMS EOS crashes in the last two weeks, both on Thursdays (?).
   * Fermilab FTS issue traced down to slow CERN --> Fermilab transfers, being investigated, GGUS:137632.
   * Switched from the 2017 Monte Carlo configuration to 2018 MC as the dominant workflow.
   * Compute systems busy at above 200k cores, with the usual mix of about 75% production and 25% analysis.

---+++ LHCb

   * Operations as usual, nothing specific to report

---++ Task Forces and Working Groups

---+++ GDPR and WLCG services

   * [[https://twiki.cern.ch/twiki/bin/view/LCG/GDPRandWLCG][Updated list of services]]

---+++ Accounting TF

   * The [[https://indico.cern.ch/event/758334/][Accounting Task Force meeting]] took place at the end of September. The CERN accounting problem was reported as understood and a fix is being applied. The October meeting will be dedicated to HTCondor accounting.
   * Tape accounting information for all WLCG Tier-1 sites apart from NDGF is integrated in the [[https://monit-grafana.cern.ch/d/000000188/_user-dichrist-wlcg-storage-space-accounting?orgId=6&var-vo=All&var-tier=All&var-country=All&var-federation=All&var-medium=Tape&var-site=All&var-service=All&var-area=All&var-groupby=site&var-binning=1h][WLCG Storage Space Accounting system]].

---+++ Archival Storage WG

---++++ Update on providing tape info

__PLEASE CHECK AND UPDATE THIS TABLE__

| *Site* | *Info enabled* | *Plans* | *Comments* |
| CERN | YES | | |
| BNL | YES | | |
| CNAF | YES | | Space accounting info is integrated in the portal. Other metrics are on the way |
| FNAL | YES | | |
| IN2P3 | YES | | Space accounting info is integrated in the portal. Other metrics are on the way |
| JINR | YES | | |
| KISTI | YES | | KISTI has been contacted and will work on it in the second half of September |
| KIT | YES | | |
| NDGF | NO | | NDGF has a distributed storage system, which complicates the task. Discuss with NDGF the possibility of doing the aggregation on the storage space accounting server side. Should be accomplished by the end of the year |
| NLT1 | YES | | Almost done; waiting for the opening of the firewall, a matter of a couple of days |
| NRC-KI | YES | | |
| PIC | YES | | Space accounting info is integrated in the portal. Other metrics are on the way |
| RAL | YES | | Space accounting info is integrated in the portal. Other metrics are on the way |
| TRIUMF | YES | | |

One can see all the sites integrated in storage space accounting for tape [[http://cern.ch/go/Wsv7][here]].

---+++ Information System Evolution TF

   * Ongoing discussion on the publishing of the CE configuration via a JSON file. [[https://docs.google.com/document/d/1pg_5Kibc_-Z4JF4_HJyW5xL6GVYKwXxOU7DXf2QP9Ag/edit][More details can be found here]]
   * The Storage Resource Reporting implementation by all WLCG storage middleware providers is progressing. [[https://twiki.cern.ch/twiki/bin/view/LCG/StorageSpaceAccounting#SRR_implementation_by_the_storag][More details here]]
   * The next WLCG IS Evolution Task Force meeting will take place on the 18th of October. It will continue the discussion of the JSON file structure for CE configuration publishing, and UK sites will present their first experience with publishing the CE description in JSON format.

---+++ IPv6 Validation and Deployment TF

Detailed status [[WlcgIpv6#IPv6Depl][here]].
---+++ Machine/Job Features TF

---+++ Monitoring

---+++ MW Readiness WG

---+++ Network Throughput WG

%INCLUDE{ "NetworkTransferMetrics" section="11102018" }%

---+++ Squid Monitoring and HTTP Proxy Discovery TFs

   * LHC@Home has now almost completely switched to using openhtc.io (Cloudflare) cached CVMFS and CMS Frontier services instead of the squids at CERN and Fermilab (except for a small trickle of jobs accessing only /cvmfs/grid.cern.ch). Web Proxy Auto Discovery (WPAD) is used to discover squids when LHC@Home jobs run at WLCG sites.
   * Plans are being made to integrate a shoal service (for dynamically registering squids) with the WLCG WPAD service. This is intended for squids running in clouds serving WLCG jobs. We will also exclude the dynamically registered squids from being treated as worker nodes in the failover monitor.

---+++ Traceability WG

---+++ Container WG

---++ Action list

| *Creation date* | *Description* | *Responsible* | *Status* | *Comments* |
| 03 Nov 2016 | Review VO ID Card documentation and make sure it is suitable for multicore | WLCG Operations | In progress | GGUS:133915 |
| 07 Jun 2018 | GDPR policy implementation across WLCG and experiment services | WLCG Operations + experiments | Ongoing | Details [[GDPRandWLCG][here]] |

---+++ Specific actions for experiments

| *Creation date* | *Description* | *Affected VO* | *Affected TF/WG* | *Deadline* | *Completion* | *Comments* |
| 13 Sep 2018 | Moving most of CERN batch to CC7 | all | - | 11 Oct | | How much advance warning is needed? |

---+++ Specific actions for sites

| *Creation date* | *Description* | *Affected VO* | *Affected TF/WG* | *Deadline* | *Completion* | *Comments* |

---++ AOB

-- Main.JuliaAndreeva - 2018-10-08
Topic revision: r11 - 2018-10-11 - ChristophWissing