Summary of GDB meeting, July 8, 2015 (CERN)

Agenda

https://indico.cern.ch/event/319749/

Introduction

Next pre-GDB/GDBs

  • September meeting: many topics almost confirmed, agenda may be late due to holidays
  • October meeting jointly with HEPiX at BNL
  • No pre-GDB foreseen in September and October
  • Probably pre-GDBs in November and December: several possible topics including HTCondor workshop, DPM workshop, Cloud and Storage
    • More in September

GDB evolution: still being discussed, more in September

ARGUS: EL7/Java8 support expected in September

  • First tests already done and successful so far: need to build/test the compatibility matrix between server and client versions
  • Packages available for all components

WLCG workshop: will take place in Lisbon first week of February

  • 2.5 days, exact dates still being discussed
  • More focused on in-depth discussion about WLCG future

Actions in progress

  • Multicore accounting: almost done! ~10 sites in WLCG still having problems
    • Demonstrated once again the effectiveness of GGUS tickets
  • Experiments asked to fill the pages built on "class 2 services" and storage protocols: see slides for URLs

Forthcoming meetings

Security

Update on activities about federated identities - R. Wartel

AARC (H2020 project) started

  • Workplan finalized

Sirtfi: WG on incident response for federations

  • Significant US involvement
  • Links with FIM4R and REFEDS
  • Continue work on trust framework, agree how to express compliance in metadata

WLCG Pilot: currently only ATLAS participating, open to other VOs interested

User mapping difficulties

  • Rely on email and ePPN (eduPersonPrincipalName) received from eduGAIN but ePPN is not guaranteed to be unique for a person over time and is recyclable
    • eduPersonUniqueId: non-recyclable but nobody implements it
  • Require also to register a nickname (CERN username) in the VOMS server of the VO: registering/checking takes ~9min...
    • Requires a unique STS service shared by all the portals for a VO
    • CERN username as the unique ID to push IOTA DN to VOMS
  • Persistent ID format differs from one SP to another: makes parsing difficult...
  • AARC is supposed to tackle this issue
  • Influence over IDPs is currently limited, many do not implement or release certain identifiers

Next steps

  • More apps
  • Use of ePPN or better when available
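The "ePPN or better" rule above can be illustrated with a minimal sketch. It assumes the IdP attributes arrive as a plain dictionary, and that the non-recyclable identifier is released under the name `eduPersonUniqueId`. The function names and the preference order are hypothetical, not taken from the WLCG pilot.

```python
from typing import Mapping, Optional, Tuple

# Assumed preference order: the non-recyclable identifier first, ePPN as fallback.
PREFERRED_ATTRIBUTES = ("eduPersonUniqueId", "eduPersonPrincipalName")


def pick_user_key(attributes: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (attribute_name, value) for the best identifier released, or None."""
    for name in PREFERRED_ATTRIBUTES:
        value = attributes.get(name, "").strip()
        if value:
            return name, value
    return None


def is_reassignable(attribute_name: str) -> bool:
    """ePPN may be recycled, so a mapping keyed on it needs periodic re-checking."""
    return attribute_name == "eduPersonPrincipalName"
```

A portal using such a helper would still need the additional check against the VO's VOMS registration described above, since a recycled ePPN cannot be detected from the attribute value alone.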

New Threats - R. Wartel

Landscape has changed

  • Datacenter security is as important as laptop security
    • Datacenter compromises mainly by administrator credential thefts
  • Linux = Windows as far as security threats are concerned: main attacks target both platforms

Attacks more and more sophisticated

  • Very customized to match information that may be expected by the users
  • Includes fake conferences with sites very similar to an official conference
    • Example: ICFNP in Istanbul, RD89 meeting (RD89 doesn't exist)...
    • RD89 example (see slides) very sophisticated with cascaded malicious payloads advertised by an email without any sign of being a malicious email

Angler Exploit Kit: the most advanced/impressive EK available today

  • Payload encryption, AV and VM detection, fileless, daily URL changes...

90%+ of breaches caused by spear phishing

Antivirus now highly ineffective

  • Attackers prepare an undetected variant of the malware and send a short, high-intensity burst of spam: AV cannot cope with such short attacks

Objective: raise the bar as much as we can afford, no perfect security

  • For the most sophisticated attacks and attacks by government security agencies, little chance to win... focus on protecting your people

Security Policies Update - D. Kelsey

Most work common between WLCG and EGI

  • Partly funded by EGI

AUP revision to include all EGI service offerings

  • Infrastructure agnostic
  • Includes requirement to acknowledge support in publications
  • More work needed on data protection issues

VM endorsement and operation: adapt to reflect current use cases

  • Important for EGI FedCloud
  • 2 roles defined: VM Operator (privileged) and VM Consumer (unprivileged)
  • Soon ready for distribution and public comments

Personal Data Protection

  • Originally issues related to X509 DN (accounting) but need to generalise to cover all forms of logging
    • WLCG's global scope makes this harder
  • Users need to be informed of the policy whenever they register for or use a service
  • GEANT Data Protection Code of Conduct: create a trust framework between SPs and IdPs for IdP to release attributes
  • Transferring data outside the EU is even more complex: requires many bilateral contracts between SPs and IdPs
  • Now evaluating the use of a single policy "Processing Personal Data"
    • All EGI/WLCG participants are bound to this
    • "Binding Corporate Rules" for international data transfers
  • EU Data Protection: new regulation supposed to be agreed by the end of the year but currently 3 different drafts from Council of Ministers, Commission and Parliament...
    • At some point, may need to adapt our policy again...

Federated identity relies on IGTF IOTA profile

  • No F2F identity vetting done
  • Robust identification done by the (LHC) VOs
    • But trust in a CA is per site and not per VO: need mechanisms to restrict certificates to VO members

Discussion

  • What's the OSG view on this?
    • Dave: working on this, should involve them as early as possible
  • What about Japan/BelleII, since they share a lot of sites and services and are willing to collaborate with WLCG?
    • Dave: let's involve them too - aiming for a single doc with different details per use case (monitoring, accounting...)
  • Remember that access to monitoring info is an operational issue - now sites struggle to see VO monitoring data
    • Dave: policy would enable but not mandate this, should be reviewed on a case by case basis
    • Romain: anonymising data would allow freer publication, currently too much data are either public or private

HEPiX Report - H. Meinhard

Last meeting in March, Oxford: record participation (134 registered), many first-timers again

  • Agenda: https://indico.cern.ch/event/346961
  • IPv6 tutorial and Ceph BOF in addition to regular presentations
    • IPv6 tutorial: many participant desktops configured to IPv6 successfully during the tutorial

Storage/FS: confirmed hype around Ceph/CephFS, storage remains a hot topic!

  • Also BeeGFS at DESY
  • AFS seems to have no long term future in HEP, in particular because of the absence of planned IPv6 support

Clouds: increasingly used for HEP workloads

  • Private and public clouds
  • Containers emerging as an interesting technology

Computing/batch

  • Benchmarking: discussion on a fast benchmark to estimate the performance of a given (virtual) machine
    • Use case different from capacity planning
  • HTCondor gaining momentum
  • cgroups support maturing, use increases

SL vs. CentOS: diversity not seen as an issue

Next meeting at BNL: grid/cloud session coorganized with GDB (Wednesday)

  • Large participation from GDB attendees encouraged: attendance for the full week preferred

EGI

ARGO: new monitoring infrastructure - C. Kanellopoulos

Flexible and scalable framework for status, availability and reliability

  • Multi-reports
  • Multi-tenants
  • Modular architecture
  • Integration with external tools
  • Relying on standard components/technologies
  • Developed by GRNET, SRCE and CNRS

Service monitoring still relying on Nagios

  • Using same probe conventions
  • Some add-ons added

Availability and Reliability: several profiles possible

Support several deployment models

  • Distributed monitoring with centralized reports
  • Centralized model
  • Distributed monitoring with local and centralized reporting

Status

  • Running for 1 year in parallel with the SAM infrastructure: http://argo.egi.eu
    • Comparison showed no major problems
    • Using the Message Broker Network to report results
  • Currently using distributed monitoring with centralized reporting: investigating migration to a fully centralized model

Discussion

  • Topology: from external source, dynamically (daily) updated
    • No built in topology
    • No notion of service, site... built into ARGO: everything defined per customer
    • Several infrastructures can use different topologies: important to support infrastructures as different as EGI, EUDAT, PRACE...
    • The same tests can be used differently in different topology contexts (e.g. global EGI vs. NGI)
    • Topology changes are taken into account transparently for tests run after the change
  • A/R calculation: flexible aggregation of results
    • Tests grouped into groups and aggregation rules defined on groups (AND, OR, percentage of available resources...)
  • Contact between ARGO and SAM3 developers: both projects started at the same time because of the same limitations seen in SAM, and both were designed along similar principles despite the absence of coordination/discussion. Discussing experience and implementations would be useful.
    • One difference is the deployment model: SAM3 is built on a single, centralized deployment model, which led to some simplifications. Christos: not much effort is being put into keeping the current distributed deployment model; most effort goes into offering a framework that is infrastructure-agnostic and customizable to each customer use case.
    • Christos, Julia and Pablo will follow-up offline
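The flexible A/R aggregation mentioned in this discussion (tests grouped into groups, with AND, OR or percentage rules applied to the groups) can be sketched as follows. This is only an illustration under stated assumptions: the profile layout, the group and test names, and the `PERCENT` keyword are hypothetical and do not reflect ARGO's actual configuration format.

```python
from typing import Dict, List


def aggregate(op: str, results: List[bool], threshold: float = 1.0) -> bool:
    """Combine boolean results with AND, OR, or a minimum fraction of successes."""
    if not results:
        return False  # assumption: a group with no results counts as unavailable
    if op == "AND":
        return all(results)
    if op == "OR":
        return any(results)
    if op == "PERCENT":
        return sum(results) / len(results) >= threshold
    raise ValueError(f"unknown operation: {op}")


# Hypothetical profile: one compute endpoint is enough, half of the storage is required.
PROFILE = {
    "compute": {"op": "OR", "tests": ["ce-1", "ce-2"]},
    "storage": {"op": "PERCENT", "threshold": 0.5, "tests": ["se-1", "se-2", "se-3"]},
}


def site_is_available(profile: Dict[str, dict], test_results: Dict[str, bool]) -> bool:
    """Apply each group's rule, then require every group to be available."""
    group_status = []
    for group in profile.values():
        # A test with no reported result is treated as failed (assumption).
        results = [test_results.get(name, False) for name in group["tests"]]
        group_status.append(aggregate(group["op"], results, group.get("threshold", 1.0)))
    return aggregate("AND", group_status)
```

Because the profile is plain data, the same test results can be evaluated against several profiles, which is in the spirit of the "several profiles possible" point made in the presentation.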

FedCloud and Community Clouds - T. Ferrari

FedCloud: open hybrid cloud federation

  • Different levels of federation possible offering various degrees of interoperability
  • Multi-tenant model: common set of procedures, choose the interoperability level that you need
  • Low barrier for joining

FedCloud common services

  • SSO for authn and authz
  • Federated accounting
  • Service registry (GOCDB)
  • Federated information discovery: compute and storage endpoints, list locally available VM images
    • Users query AppDB, Platforms can use LDAP queries
  • Federated monitoring: availability, reliability
  • VM image catalog: EGI endorsed images, VM image management through a central registration point

EGI FedCloud made of Cloud Realms that are subsets of cloud providers exposing homogeneous resources

  • E.g. Open Standard (OCCI) Cloud Realm
  • Mandatory services: AAI, service registry, accounting, monitoring
  • Also support Peer Realms: mandatory services reduced to AAI and policy compliance
    • Base for federating worldwide, e.g. NeCTAR (Australia)

Cloud platforms: community-specific tools/data/apps built on one or several Cloud Realms

  • E.g. VOSpace (defined by the International Virtual Observatory Alliance), joint EGI/CANFAR (Canada) effort to build a distributed international cloud for astronomers

Discussion

  • Tiziana: how could WLCG be part of this FedCloud landscape?
    • Maarten: don't see it happening soon for CERN as we are lacking a use case. But WLCG VOs started to make use of FedCloud resources.
    • Michel: positive step forward that the FedCloud model now includes this Cloud Realm concept, which does not make it mandatory to embrace all the initial technical choices made by FedCloud. Should make joining much easier.
      • Tim: still have an issue with OpenStack with the AAI requirement (based on VOMS) as it relies on a component which is not mainstream and is in fact not working with OpenStack versions released in the last year.
    • Ian: need to clarify that there is no concept of a WLCG cloud. There are cloud resources used by WLCG and clouds operated by WLCG sites. Each site may have its own reasons for joining or not joining FedCloud: WLCG has no specific role in this.

IPv6 Update - D. Kelsey

Significant IPv6 growth according to Google's global client statistics: 35% in Belgium, 20% in the USA and Switzerland

  • IPv4 address pool exhaustion progressing everywhere: Europe being the least affected

Recent news from the group

  • Still IPv6 routing problems at CNAF: being worked on
  • F. Prelz started work on XRootD for testbed
  • ATLAS: 2 sites receiving test transfers on IPv6 space tokens
  • LHCb: new DIRAC version released and about to be tested
  • DESY: IPv6 enabled on EDUROAM
  • NDGF about to move to dual-stack

FTS3 testbed

  • gfal-copy fails to use IPv6: seemed coincident with a gfal2 upgrade but turned out to be related to a Globus upgrade
    • Underlines that IPv6 support is not yet robust

Dual-stack services in production at several sites: need more testing before recommending widely

  • 2% of endpoints in central BDII dual-stacked: slight increase in the last year
  • Also an issue with SAM3: largely linked to IPv6 support not ready in SAM-Nagios
    • The OS is still SL5, which does not fully support IPv6 by default and has indeed failed for us there
    • The SAM port to SL6 is being worked on
    • Not confident yet that there are no other problems: more tests planned next Fall
    • Plan is to have a separate IPv4 and IPv6 infrastructure to help disentangle problems
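As a rough illustration of what "dual-stacked" means for an endpoint, the sketch below classifies a host by the address families it resolves to. It uses only the Python standard library. The function names are hypothetical, and this is not how the BDII figure above was obtained.

```python
import ipaddress
import socket
from typing import Iterable, List


def classify_stack(addresses: Iterable[str]) -> str:
    """Classify a set of IP address strings by the IP versions present."""
    # Drop any IPv6 zone suffix ("%eth0") before parsing.
    versions = {ipaddress.ip_address(a.split("%")[0]).version for a in addresses}
    if versions == {4, 6}:
        return "dual-stack"
    if versions == {6}:
        return "ipv6-only"
    if versions == {4}:
        return "ipv4-only"
    return "unresolved"


def resolve(host: str, port: int = 443) -> List[str]:
    """Return the distinct addresses the local resolver reports for host."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

For example, `classify_stack(resolve("some-endpoint.example.org"))` would report how a (hypothetical) endpoint name resolves from the machine running the check. A published AAAA record only shows that the name is dual-stacked in DNS; it does not show that the service behind it works over IPv6, which is why the more thorough testing mentioned above is still needed.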

Experiment news

  • ATLAS: all T1s with a dual-stack perfSONAR by last April, all T2Ds by August
    • Still far from it
  • CMS request to have substantial fraction of CMS data available through IPv6 in AAA by the end of the year to enable IPv6-only WNs

LHCOPN: many sites with plans by the end of the year...

Next steps

  • Add XRootD to the testbed
  • More robust production dual-stacked testing
  • Dual-stack SAM testing: probably a showstopper for wider dual-stack production services
  • Another workshop/training event early 2016?
    • pre-GDB?

Discussion

  • Maarten: IPv6 support not yet the highest priority topic but, from what was reported/seen this year, 2016 may be the IPv6 year in WLCG!
  • Michel: progress may not seem very spectacular but we are now in a position where we should be able to do it if there was a need to rush. Most critical issues are now either solved or about to be solved (by the end of the year).
    • When Run 2 reaches a steady state, sites may be more available to tackle the IPv6 issue: main work to be done is sometimes with the basic network infrastructure readiness rather than with grid services.

OSG Update - B. Bockelman

Mistake with time zone: Brian didn't join.

  • Presentation postponed to September

-- MichelJouvin - 2015-07-08

Topic revision: r2 - 2015-07-09 - MaartenLitmaath