PreProductionServiceDescriptionDraft
(2013-09-19,
TWikiGuest
)
[[LCGGridDeployment][LCG Grid Deployment]] - [[GLitePreProductionServices][gLite Pre Production Services]] - [[EGEE_PPS_Coordination][Pre Production Coordination]]

---+!! WLCG EGEE Pre Production: Service Description

%TOC%

---# Scope of this document

This page describes the internal organisation of the EGEE/WLCG Pre-Production Service. In particular, it details the actors, roles, systems, interfaces, workflows and tasks needed to implement the use cases described in PreProductionUseCases. It also develops the second major use case of the pre-production service, the distributed gLite deployment testing.

The document is addressed to ROC managers, site managers in the pre-production orbit, and members of the middleware certification and release teams (SA1/SA3).

It provides guidelines for communication among the partners involved in various capacities in the pre-production activity. Its final version, as well as each major revision, must therefore be approved by representatives of:
   * Operation Coordination Centre (SA1)
   * Certification and Release teams (SA3)
   * EGEE ROC Managers (SA1)

Minor changes, namely those dealing with purely technical details, may be decided by the Pre-Production Coordination and notified to the partners concerned.

---# General description of the EGEE Pre-Production Service

%INCLUDE{ "PreProductionUseCases" section="Mandate" }%

The service is organised in two *functional areas*, _Middleware Quality Services_ (MQS) and _Middleware Preview Services_ (MPS), which pursue the main objectives stated above, and one *support area*, which includes all the services and activities needed to support them (e.g. coordination, release management).

The three service areas are staffed by the EGEE regions and report to the PPS Coordination.
<dot hideattachments="on" map="on" antialias="on" vectorformat= "jpg ps">
digraph G {
  node [fontsize="18"]
  subgraph cluster_prod {
    label="production grid"
    labelloc=b
    mps [label="Middleware Preview Services", shape=box, fillcolor=plum, style=filled, URL="#MiddlewarePreviewServices"];
  }
  subgraph cluster_pps {
    label="pre-production grid"
    labelloc=b
    edge [style=dashed]
    mqs [label="Middleware Quality Services", shape=box, fillcolor=skyblue, style=filled, URL="#MiddlewareQualityServices"];
  }
  coord [label="PPS Coordination", shape=box, fillcolor=yellow, style="filled", URL="#PpsCoordination"];
  support [label="PPS Support", shape=box, fillcolor=palegreen, style=filled, URL="#PpsSupport"];
  coord -> mps [dir=none color=blue];
  coord -> mqs [dir=none color=blue];
  coord -> support [dir=none color=blue];
}
</dot>

---# Actors and roles

Resources to implement the workflows described later in this document come from:
   * EGEE/SA1: Operations Teams (OCC + EGEE Regions)
   * EGEE/SA3: Integration, Testing and Release Teams (CERN + partners in the EGEE Regions)
   * EGEE/JRA1: gLite Middleware Development
   * VOs: represented by the EIS team for HEP VOs and EGEE/NA4 for non-HEP ones
   * TMB: EGEE Technical Management Board

The roles are:
   * PPS Coordinator
   * Regional Manager
   * ITR contact: a contact person in the Integration, Testing and Release Team
   * Developer: a contact person from the developers' teams
   * Release Manager: a member of SA3 responsible for the content and distribution of the gLite release
   * PPS Repository Manager: the maintainer of the special software repositories used in PPS
   * PPS _member_ site: a grid site supporting the Middleware Quality Services (deployment testing, release testing, PPS monitoring infrastructure). Sites in this category normally advertise their grid services in the pre-production Information System
   * PPS _partner_ site: a grid site *in production* supporting the Middleware Preview Services (support to pilots, hosting new client versions). Sites in this category advertise their grid services in the production Information System. They are further categorised into:
      * Silver Partners (or Silver Sites): Sites supporting the installation of non-backward-compatible client updates
      * Gold Partners (or Gold Sites): Sites providing support for pilot services in case of backward-compatible server updates
      * Platinum Partners (or Platinum Sites): Sites providing support for pilot services in case of non-backward-compatible server updates

---# Functional tasks and workflows

This section describes the functional tasks for sites, regions and coordination bodies, as well as the workflows in which they are run. Each task description comes with a basic estimate of the effort needed. The estimates are based on the past two years of experience from both PPS and the _experimental production services_ activities.

Apart from planning purposes, the basic "value" of a task will also be used in PPS to measure the work performed by the various contributors upon completion of the task, as described in the paragraph [[#PpsAccounting][Activity Management]].

The units used are (FTE = Full Time Equivalent):
   * *PH* = 1 FTE x hour = 1 person hour
   * *PD* = 1 FTE x day (8 hours)
   * *PW* = 1 FTE x 5 working days
   * *PM* = 1 FTE x 20 working days

The conversion table is:
   * PH = 1 person hour
   * PD = 8PH
   * PW = 5PD = 40PH
   * PM = 4PW = 20PD = 160PH

#MiddlewarePreviewServices
---## Middleware Preview Services

The mission of the Middleware Preview Services is, in general, _to offer previews of new middleware functionalities to interested users_.

More specifically, the middleware previews fall into two major classes:
   * Previews of client tools (changes/additions affecting Worker Nodes, User Interfaces, VOBOXes)
   * Previews of grid services (a.k.a. _pilot_ or _experimental_ services)

Updates and new releases affecting the two classes are treated in two completely different ways, as detailed in the next sections.

New versions of both client tools and middleware services are made available and selectable to users from the VOs in the production environment (using the production information system and accessing production resources). While for the clients this happens in a semi-automated way, instances of grid services are created only on demand, meaning that the use cases, scope and goals of each pilot have to be agreed in advance between the interested VOs and the PPS. In other words, there is no longer a permanent instance of every service made available to users by default, as was the case in the previous implementation of the PPS.

A lightweight involvement of the regions is required to support the distribution of new clients, basically limited to making a certain number of sites accessible to the automated distribution tools.

A more significant involvement may be required from the regions in support of pilot services, namely in order to:
   * set up the pilot service(s)
   * manage the pilot (interface with VOs during the exploitation, wrap up results)
   * wrap up feedback for the middleware release team

Note, however, that the effort needed to run pilot services is not allocated by the regions on a permanent basis but rather shifted, upon demand, from the production tasks. To simplify the process of finding, upon demand, a site available to run a pilot, pre-registration for a certain set of tasks is required.
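The pre-registration idea can be pictured as a lookup from the kind of update to the partner category (Silver, Gold, Platinum) that supports it. The sketch below is only illustrative: the site names, the `UpdateKind` labels and the `candidates` helper are hypothetical, and it assumes that each site is registered in exactly one category, which this document does not state.

```python
from enum import Enum


class UpdateKind(Enum):
    NBC_CLIENT = "non-backward-compatible client update"
    BC_SERVER = "backward-compatible server update"
    NBC_SERVER = "non-backward-compatible server update"


# Partner category supporting each kind of update, as listed in this document.
CATEGORY_FOR = {
    UpdateKind.NBC_CLIENT: "Silver",
    UpdateKind.BC_SERVER: "Gold",
    UpdateKind.NBC_SERVER: "Platinum",
}

# Hypothetical pre-registrations (site name -> category); not real sites.
REGISTERED = {
    "site-a.example.org": "Gold",
    "site-b.example.org": "Silver",
    "site-c.example.org": "Platinum",
    "site-d.example.org": "Gold",
}


def candidates(kind, registered=REGISTERED):
    """Return the pre-registered sites whose category matches `kind`, sorted by name."""
    wanted = CATEGORY_FOR[kind]
    return sorted(site for site, cat in registered.items() if cat == wanted)


print(candidates(UpdateKind.BC_SERVER))
```

With such a registry, the coordinator's first look-up for a backward-compatible server pilot reduces to listing the sites registered as Gold; the actual selection still depends on the agreement with the VOs and the sites.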
<br>
The partner sites are requested to express their potential interest in the pilot activities in advance and are consequently classified as _Silver_, _Gold_ and _Platinum_ partners, according to the level of support they are ready to provide should the need arise:
   * Silver Partners (or Silver Sites): Sites supporting the installation of non-backward-compatible client updates
   * Gold Partners (or Gold Sites): Sites providing support for pilot services in case of backward-compatible server updates or new services
   * Platinum Partners (or Platinum Sites): Sites providing support for pilot services in case of non-backward-compatible server updates

When a new pilot is needed, the PPS Coordinator will start looking for resources in those "bins".

---### Fully backward compatible client update

The detailed use case can be read in the [[PreProductionUseCases][Pre-Production Use Cases]] document and is included below.

%TWISTY{mode="div" showlink="Show Use Case" hidelink="Hide Use Case" remember="off" firststart="hide" showimgleft="%ICONURLPATH{toggleopen-small}%" hideimgleft="%ICONURLPATH{toggleclose-small}%"}% %GRAY% %INCLUDE{ "PreProductionUseCases" section="BcClient" }% %ENDCOLOR% %ENDTWISTY%

Workflows and tasks to be detailed

---### Non-backward compatible client update

The detailed use case can be read in the [[PreProductionUseCases][Pre-Production Use Cases]] document and is included below.

%TWISTY{mode="div" showlink="Show Use Case" hidelink="Hide Use Case" remember="off" firststart="hide" showimgleft="%ICONURLPATH{toggleopen-small}%" hideimgleft="%ICONURLPATH{toggleclose-small}%"}% %GRAY% %INCLUDE{ "PreProductionUseCases" section="NbcClient" }% %ENDCOLOR% %ENDTWISTY%

Workflows and tasks to be detailed

---### Backward compatible server update

The detailed use case can be read in the [[PreProductionUseCases][Pre-Production Use Cases]] document and is included below.
%TWISTY{mode="div" showlink="Show Use Case" hidelink="Hide Use Case" remember="off" firststart="hide" showimgleft="%ICONURLPATH{toggleopen-small}%" hideimgleft="%ICONURLPATH{toggleclose-small}%"}% %GRAY% %INCLUDE{ "PreProductionUseCases" section="BcServer" }% %ENDCOLOR% %ENDTWISTY%

The basic workflow described in the use case above is developed here with explicit mention of the connected tasks. A summary table with the relevant subtasks, estimated effort and task rating is given at the end.

Steps and numbers written in %GREEN%green%ENDCOLOR% are relevant only for _minor_ updates.<br>
Steps and numbers written in %RED%red%ENDCOLOR% are relevant only for _major_ updates.<br>
Steps written in black are common to the two cases.<br>
The initiator or owner of the action is indicated in square brackets at the beginning.

   1. [ITR and/or Developer (via EMT), %RED%VO%ENDCOLOR%]: Forward the request for a new pilot to the PPS Coordinator
      * EMT: this is normally the case for minor updates
      * %RED% VO: if a pilot service is requested by a VO, it normally concerns a major update of the functionality (or one perceived as such by the user). The request is forwarded either by contacting pps-support@cern.ch or during one of the regular operations meetings (WLCG/EGEE Operations meeting, WLCG Service Coordination Meeting) %ENDCOLOR%
   1. [PPS Coordinator]: Pre-screen available resources
      * identify suitable candidate sites among the Gold Sites (or volunteers)
      * provide information about the new features to the VOs and let them express their interest in participating in the pilot activity. This is done through several channels, e.g. announcement during the WLCG/EGEE Operations meeting; broadcast to VO Managers; direct communication to the Experiment Integration and Support team (EIS).
      * narrow down the range of candidates/options and select the sites to run the pilot
      * verify that adequate documentation is available for installation and configuration
   1. %GREEN%[PPS Coordinator]: Contact the Gold Site and give instructions to start the pilot
      * give pointers to documentation
      * agree on timelines %ENDCOLOR%
   1. %RED%[PPS Coordinator]: Organise a pilot kick-off meeting with the Gold Sites, SA3, the developers (if needed) and the VOs%ENDCOLOR%
   1. %RED%[Gold Site(s), ITR, PPS Coordinator, VO, Developers]: Participate in the pilot kick-off meeting
      * The goal of the meeting is to reach an agreement about the timeline for the site to set up the service (e.g. 1 week) and for the VO to give feedback (e.g. 2 weeks). Particular requirements from the VOs are also expressed in this meeting. %ENDCOLOR%
   1. [PPS Repository Manager]: Set up a mirror repository. This step may not be needed in some cases, but it is likely that, if the service is going to be published in the production information system, the repository used for the installation will be mirrored in order to decouple the production environment from changes possibly happening in the original one provided by the developers.
   1. [Gold Site]: Set up the pilot service. It is likely that, if the service is going to be published in the production information system, the repository used for the installation will be mirrored from the original one provided by the developers. In that case the set-up of the mirror repository also has to be considered
   1. [Gold Site, ITR and/or Developers]: Run the pilot service
      * The Gold Site manager is generally meant to act as a production *service manager*; the only special commitment to the PPS activity is a prompt reaction in case of problems or in the event of a roll-back
      * ITR and Developers may be called on to help the Gold Site as *service experts*, especially in support of new and undocumented features. Sometimes, especially if the Gold Site is "close" to the developers, this role may end up being covered by the site administrator.
      * Changes to the system, however, should only be applied by the Gold Site manager, who is ultimately responsible for the feedback given to the release from the operational point of view
      * Any subsequent change or issue in the system following the first set-up should also be notified in copy to the PPS Coordinator (through the pps-support mailing list)
      * The default duration of the pilot (after completed installation) is fixed at %GREEN%1 week for minor updates%ENDCOLOR% and %RED%2 weeks for major updates%ENDCOLOR%. Exceptions or different requirements can be discussed individually during the preliminary phases. The PPS Coordinator is in charge of checking periodically and, if necessary, sending reminders.
   1. [Gold Site]: Produce post-mortem pilot report (set-up description, non-functional issues e.g. resource consumption, stability): 4PH
   1. [Gold Site, ITR, Release Manager, PPS Coordinator, %RED%VO%ENDCOLOR%, Developers]: A "post-mortem" or "wrap-up" meeting is held upon success or expiration of the previously agreed timeline. In this meeting an assessment is made and a decision is taken about the follow-up (including a decision to extend the testing time). If appropriate, general guidelines for the deployment can be drafted. %GREEN%In case of minor updates this discussion is brought to the EMT%ENDCOLOR%

%TABLE{ sort="on" tableborder="0" cellpadding="4" cellspacing="3" cellborder="0" headerbg="#D5CCB1" headercolor="#666666" databg="#FAF0D4, #F3DFA8" headerrows="2" footerrows="1" caption="<i>Tasks and Effort</i>"}%
| _Who_ | _What_ | _effort (PD)_ | _Credits_ | _Notes_ |
|PPS Coordinator|Pre-screen of resources for pilot start-up|0.75|6| - |
|PPS Coordinator|Instruct the gold site to start the pilot|0.25|2| - |
|PPS Coordinator|Organise a pilot kick-off meeting|0.25|2| - |
|PPS Coordinator, Gold Site, ITR, VO|Participate in kick-off meeting (preparation, attendance, follow-up)|0.25|2| - |
|PPS Repository Manager|Set-up a mirror repository|0.5|4| - |
|Gold Site|Set-up the pilot service|1.5|12| - |
|Gold Site|Provide 1 week of support as service manager|1|8|averaged data from CREAM and WMS experimental services|
|Gold Site, ITR or Developers|Provide 1 week of support as service expert|2|16|averaged data from CREAM and WMS experimental services|
|Gold Site|Produce post-mortem pilot report (set-up description, non-functional issues e.g. resource consumption, stability)|0.5|4| - |
|PPS Coordinator, Gold Site, ITR, VO|Participation in wrap-up meeting, including preparation and follow-up|0.25|2| - |

Based on the table above, an estimate of the integrated effort to be spent by SA1 to run a 3-week pilot in PPS is as follows.

Total SA1 effort (for a 3-week pilot):
   * PPS Coordination: 1.25PD
   * Gold Site: 5.5PD
   * _Service expert support: 6PD (to be counted if provided by the Gold Site or SA1 personnel)_

---### Non-backward compatible server update

The detailed use case can be read in the [[PreProductionUseCases][Pre-Production Use Cases]] document and is included below.
%TWISTY{mode="div" showlink="Show Use Case" hidelink="Hide Use Case" remember="off" firststart="hide" showimgleft="%ICONURLPATH{toggleopen-small}%" hideimgleft="%ICONURLPATH{toggleclose-small}%"}% %GRAY% %INCLUDE{ "PreProductionUseCases" section="BcServer" }% %ENDCOLOR% %ENDTWISTY%

#MiddlewareQualityServices
---## Middleware Quality Services

---### Overview

Mission: “To test the middleware deployment tools (packaging, documentation) against scenarios relevant for production”
   * Workload distributed among 'PPS' sites (not registered as production)
   * Services published in pre-production IS
   * Testing interaction with different platforms, batch and storage systems
   * Contributing to interoperability testing
   * Providing additional info and advice for deployment in production
   * Dedicated monitoring infrastructure for validation
   * "service-oriented" testing: several deployment test managers sharing the tools
   * NOT production-like service --> pre-deployment runs on-demand with releases

Tasks for regions:
   * set up and run test services in different deployment scenarios
   * set requirements for deployment scenarios coming from the sites and local VOs
   * evolutionary maintenance of monitoring infrastructure for validation
   * evolutionary maintenance of automated distribution tools
   * deployment testing management (run tests and wrap up feedback, per service)

Coordination tasks:
   * Deployment testing coordination (tools and procedures) ? delegated
   * Set requirements for deployment scenarios coming from middleware development/certification and "global" VOs
   * Release management (tools and procedures) ? delegated
   * Interface to EMT
   * MPS interface with VOs, regions, sites and middleware providers for the definition and kick-off of new pilots

---### Pre-deployment test

to be written

%TABLE{ sort="on" tableborder="0" cellpadding="4" cellspacing="3" cellborder="0" headerbg="#D5CCB1" headercolor="#666666" databg="#FAF0D4, #F3DFA8" headerrows="2" footerrows="1" caption="<i>Tasks and Effort</i>"}%
| _Who_ | _What_ | _effort (PD)_ | _Credits_ | _Notes_ |
|SAM Client Administrator|Clone a SAM Sensor|1|8|| 
|SAM Portal Administrator|Create a display for a new sensor|0.5|4| - |
|SAM Client Administrator|Provide 1 week of support as SAM-client service manager|0.5|4|Possibly including minor re-configurations|
|SAM Client Administrator|Provide 1 week of support as SAM-server service manager|0.5|4|Possibly including minor re-configurations|

---### Release Testing

The _Release Testing_ is a complementary step of the [[PPSReleaseProcedures#PPS_Production][process of releasing middleware updates to the production system]]. Before the middleware updates are released to the public, they are delivered to a number of selected production sites. Sites involved in release testing may be part of pools performing similar tasks at regional level. The sites apply the update to one or more of their grid services and provide feedback to the release managers.

Details on the integration of the release testing with the release procedure are available in PPSReleaseProcedures#PPS_Production.
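In the _Tasks and Effort_ tables of this document, the _Credits_ value of each row appears to equal the effort expressed in person hours (for example, 0.5 PD is rated 4 credits). The sketch below illustrates the unit conversion under that reading; the `to_person_hours` helper is hypothetical and not part of any PPS tool.

```python
from fractions import Fraction

# Person hours per unit, taken from the conversion table in this document.
HOURS_PER_UNIT = {"PH": 1, "PD": 8, "PW": 40, "PM": 160}


def to_person_hours(amount, unit):
    """Convert an effort expressed in PH, PD, PW or PM into person hours."""
    return Fraction(str(amount)) * HOURS_PER_UNIT[unit]


# (task, effort in PD, credits) as listed in the tables of this document.
rows = [
    ("Set-up a mirror repository", 0.5, 4),
    ("Set-up the pilot service", 1.5, 12),
    ("Apply a middleware update for release testing", 0.125, 1),
]

for task, effort_pd, credits in rows:
    hours = to_person_hours(effort_pd, "PD")
    # Assumption being checked: credits == effort in person hours.
    assert hours == credits
    print(f"{task}: {effort_pd} PD = {hours} PH")
```

`Fraction` is used instead of floating-point arithmetic only to keep the comparison with the integer credit values exact.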
%TABLE{ sort="on" tableborder="0" cellpadding="4" cellspacing="3" cellborder="0" headerbg="#D5CCB1" headercolor="#666666" databg="#FAF0D4, #F3DFA8" headerrows="2" footerrows="1" caption="<i>Tasks and Effort</i>"}%
| _Who_ | _What_ | _effort (PD)_ | _Credits_ | _Notes_ |
|Production Site |Apply a middleware update for release testing |0.125|1| - |

---# Non-functional tasks and workflows

#PpsCoordination
---## Coordination

#PpsSupport
---## Support Area

---### Release Management

#PpsAccounting
---### Activity Management

---### Metrics and Quality

---### Communication

#ActivityList
---# Appendix 1 - List of PPS pre-defined activities
Topic revision: r20 - 2013-09-19
-
TWikiGuest