WLCG Operations Coordination Minutes, September 17th 2015

Highlights

Agenda

Attendance

  • local: Maria Dimou (Minutes), Andrea Sciaba (chair), Maarten Litmaath, Andrea Manzi, Marian Babik, David Cameron, Giuseppe Lo Presti
  • remote: Alessandra Forti, Antonio Maria Perez Calero Yzquierdo, Christoph Wissing, Frederique Chollet, Felix Lee, Maite Barroso, Michael Ernst, Peter Gronbech, Renaud Vernet, Ult Tigerstedt, Rob Quick, Thomas Hartmann, Alessandro Cavalli, Vincenzo Spinoso, Pepe Flix, Alessandra Doria
  • apologies:

Operations News

Middleware News

  • Baselines:

  • T0 and T1 services

Tier 0 News

Tier 1 Feedback

Tier 2 Feedback

Experiments Reports

ALICE

  • high activity
  • CERN
    • team ticket GGUS:116095 about expired CRLs on myproxy.cern.ch
      • IPv6 connectivity issue in the Wigner data center was fixed
    • Accessing CASTOR for reading or writing raw data files:
      • Various constructive meetings between ALICE experts and the CASTOR team.
      • Short- and longer-term ideas were discussed.
      • Reco jobs now download the raw data files instead of streaming them.
        • The effect should become visible when more data is ready for processing.
      • Further ideas involving EOS are being investigated.
      • DAQ and CASTOR experts also retraced how a particular file ended up lost.
      • Thanks for the good support!

ATLAS

CMS

LHCb

Ongoing Task Forces and Working Groups

gLExec Deployment TF

  • NTR

HTTP Deployment TF

Information System Evolution


  • WLCG Information System Use Cases document presented at the MB
  • MB gave feedback to work on several areas that need further discussion and agreement within the TF:
    • Future Use Cases: the use cases document describes the current interactions with the IS. The TF should now investigate what is actually needed so that we can better understand how the IS could evolve.
    • Static vs Dynamic: the MB would like to see a summary of the types of information actually needed by the experiments, probably a more elaborate version of what is already summarised in this twiki under Types of Information, focusing only on the future use cases.
    • "Indicative pledges" per site in REBUS: the TF requested the MB to include "indicative pledges" per site in REBUS. The MB would like to understand why this information is needed and have a concrete proposal on how it will be collected.
    • Installed capacity: a better definition, and maybe also a better name, is needed for what is today called "installed capacity". The MB would also like to understand why this information is needed and how it will be collected.
    • T3s and opportunistic resources: it would be good to understand how information is going to be collected from T3s and opportunistic resources.
  • OSG, NDGF and EGI will present their plans to provide information about their resources in the future at the next TF meeting. GOCDB will also present the latest features.

IPv6 Validation and Deployment TF


Update on the status of IPv6 deployment in WLCG (from Bruno Hoeft)

Tier-1
| Site | LHCOPN IPv6 peering | LHCONE IPv6 peering | perfSONAR via IPv6 |
| ASGC | - | - | - |
| BNL | not on their priority list |||
| CH-CERN | yes | yes | LHC[OPN/ONE] |
| DE-KIT | yes | yes | LHC[OPN/ONE] |
| FNAL | yes | yes | LHC[OPN/ONE] but not yet visible in Dashboard |
| FR-CCIN2P3 | yes | yes | LHC[OPN/ONE] but not yet visible in Dashboard |
| IT-INFN-CNAF | - | yes | LHCONE |
| NDGF | yes | yes | LHC[OPN/ONE] |
| ES-PIC | yes | yes | LHCOPN |
| KISTI | started but no peering implemented |||
| NL-T1 | no peering implemented |||
| TRIUMF | IPv6 peering planned at end of 2015 |||
| RRC-KI-T1 | - | - | - |

Tier-2
| Site | LHCONE IPv6 peering | perfSONAR |
| DESY | yes | LHCONE |
| CEA SACLAY | yes | - |
| ARNES | yes | - |
| WISC-MADISON | yes | - |
| UK sites | QMUL peers with LHCONE but not for IPv6 ||
| Prague FZU | IPv6 still working but the previous contact person left ||
There are additional IPv6 perfSONAR servers at Tier-2 centres, but not via LHCONE.

Machine/Job Features

Middleware Readiness WG


Multicore Deployment

Network and Transfer Metrics WG


  • The OSG perfSONAR datastore entered production on 14th of Sept, providing storage and an interface for all perfSONAR results.
  • Publishing of the perfSONAR results using pre-production (ITB) services was successfully established. Work is ongoing to resolve an issue with some event types not being published; production is still pending an SLA.
  • The WLCG-wide meshes campaign with latency testing, ramped up to 81 sonars, caused some instabilities on the sonars with 4 GB RAM. We have therefore decreased the number of tests performed, which has improved the situation.
  • The final version of perfSONAR 3.5 is planned to be released on 28th of September and will be auto-deployed to all WLCG instances. No issues were found in the testbed, but we plan to update a couple of production instances in advance to check that everything is fine.
  • ESNet and OSG have started development of the perfSONAR configuration interface, an open source project motivated by the existing version developed for WLCG. There has also been interest from GEANT and ESNet in collaborating on an open source project based on the existing proximity service.
  • A follow-up meeting was held to discuss the findings of the FTS performance study led by Saul Youssef (Boston University). A new optimization algorithm was proposed and discussed.
  • Next WG meeting will be on 30th of Sept (https://indico.cern.ch/event/400643/)

RFC proxies

  • NTR

Squid Monitoring and HTTP Proxy Discovery TFs

  • Alastair is making progress on the next deliverable (a flexible squid registration exception list), but is not quite ready to put it into production
  • We agreed to change the documentation for squid registration to make it clear that T3s that are not already registered in GOCDB do not have to register their squids to have them monitored; they can send an email and we'll add an exception

Action list

| Creation date | Description | Responsible | Status | Comments |
| 2015-09-03 | Status of multi-core accounting | John Gordon | ONGOING | A presentation about the plans to provide multicore accounting data in the Accounting portal should be given at the next Ops Coord meeting on October 1st https://indico.cern.ch/event/393617/ since this is a long-standing issue |
| 2015-06-04 | Status of fix for Globus library (globus-gssapi-gsi-11.16-1) released in EPEL testing | Andrea Manzi | ONGOING | GGUS:114076 is now closed. However, host certificates need to be fixed for any service in WLCG that does not yet work OK with the new algorithm. Otherwise we will get hit early next year when the change finally comes in Globus 6.1. A broadcast message has been sent by EGI. |

Specific actions for experiments

| Creation date | Description | Affected VO | Affected TF | Comments | Deadline | Completion |

Specific actions for sites

| Creation date | Description | Affected VO | Affected TF | Comments | Deadline | Completion |
| 2015-09-03 | T2s are requested to change analysis share from 50% to 25% since ATLAS runs centralised derivation production for analysis | ATLAS | - | - | Unknown | ONGOING |
| 2015-06-18 | CMS asks all T1 and T2 sites to provide Space Monitoring information to complete the picture on space usage at the sites. Please have a look at the general description and also some instructions for site admins. | CMS | - | | None yet | ~10 T2 sites missing, Ticket open |

AOB

-- MariaDimou - 2015-09-14

Topic revision: r22 - 2015-09-17 - unknown