Middleware validation
This TWiki collects information on validating new middleware versions before they are deployed into operation, to verify that they cause no problems for the experiment workflows.
Testbed
The CMS testbed is composed of the sites listed below. Each site has chosen a service and committed to installing new releases on it, keeping it running, and interacting with CMS and the developers to troubleshoot any problems.
| Site | Product | Endpoint | Tested using | Monitoring links | Contact |
| T1_ES_PIC | dCache |  | PhEDEx, HC, SAM | transfers to PIC, transfers from PIC, jobs | Antonio Perez-Calero |
| T2_FR_GRIF_LLR | DPM | llrpp01.in2p3.fr | PhEDEx, HC, SAM | transfers, jobs | Andrea Sartirana |
| T2_FR_GRIF_LLR | CREAM CE |  | HC, glidein factory?, SAM |  | Andrea Sartirana |
| T2_FR_GRIF_LLR | gfal2 |  | PhEDEx | PhEDEx logs | Andrea Sartirana |
| T2_FR_GRIF_LLR | WN |  | HC, SAM |  | Andrea Sartirana |
| T2_CH_CERN | EOS |  | PhEDEx, HC, SAM | transfers, jobs | Luca Mascetti |
| T2_CH_CERN_TEST | ARC CE + HTCondor | ce501.cern.ch:2811/nordugrid-Condor-share ce502.cern.ch:2811/nordugrid-Condor-share | HC, glidein factory | factory jobs factory | Iain Bradford Steers |
| T2_IT_LegnaroTest | CREAM CE | t2-cepp-01.lnl.infn.it:8443/cream-lsf-cms | HC, glidein factory | factory jobs | Massimo Sgaravatto |
| T2_UK_LondonBrunelTest | ARC CE | dc2-grid-25.brunel.ac.uk:2811/nordugrid-Condor-default | HC, glidein factory | factory | Raul Lopes |
Instructions to add test services (INCOMPLETE)
Storage Element
- Install a SE with at least 1 TB of disk with the same authentication/authorisation settings as the production SE
- Set up the PhEDEx agents for the dev instance
- Register that SE for your site in the dev instance of PhEDEx
- Ask the CMS transfer team to create a link from your site to at least one other site (it should be a reliable site, to minimise transfer issues caused by the other end)
- Create in dev a subscription for the LoadTest
- Transfer to the SE the dataset /GenericTTbar/HCtest-CMSSW_7_0_4_START70_V7-v1/GEN-SIM-RECO
- Enable the LoadTest in dev between your SE
- Create a rule in the TFC that directs jobs to the test SE for that dataset only. This takes care of the HammerCloud tests
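The dataset-specific TFC rule in the last step relies on first-match ordering: a narrow rule for the test dataset is placed before the general rule, so only that dataset resolves to the test SE. The following is a minimal Python sketch of that idea, not the actual TFC syntax. The host names, the `lfn_to_pfn` helper and the LFN prefix assumed for the HammerCloud dataset are all hypothetical and must be replaced with the values used at your site.

```python
import re

# Hypothetical endpoints, for illustration only.
TEST_SE = "srm://test-se.example.org/cms"
PROD_SE = "srm://prod-se.example.org/cms"

# Rules are tried in order and the first match wins, so the narrow rule
# for the test dataset must come before the general /store rule.
# The LFN prefix of the HammerCloud dataset is an assumption.
RULES = [
    (re.compile(r"^/+store/mc/HCtest/GenericTTbar/(.*)$"),
     TEST_SE + "/store/mc/HCtest/GenericTTbar/{0}"),
    (re.compile(r"^/+store/(.*)$"),
     PROD_SE + "/store/{0}"),
]


def lfn_to_pfn(lfn):
    """Return the PFN for an LFN, or None if no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(lfn)
        if match:
            return template.format(match.group(1))
    return None
```

With these rules, files of the test dataset resolve to the test SE while every other `/store` path keeps resolving to the production SE, which is the behaviour the step above asks for.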
Functional tests with SAM (INCOMPLETE)
Requirements:
- the instance must be testable by the preproduction Nagios and invisible to the production Nagios
- the instance should be registered in GOCDB
- the instance must be in the BDII
- the preproduction Nagios should use a different VO feed containing only the services to be tested for middleware validation
- the standard VO feed must not contain the services to be tested for middleware validation
- CEs must not have CEStatus 'Production'
- To test an SE with the org.cms.WN-analysis and org.cms.WN-mc tests, a dedicated CE would be needed. This is probably not acceptable for sites.
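The feed-separation and CEStatus requirements above can be checked mechanically once the contents of both VO feeds are known. The sketch below assumes the feeds have already been reduced to plain sets of host names and that CE statuses are available as a dictionary; the function name, the data layout and the example hosts are hypothetical, and no real VO feed or BDII format is parsed here.

```python
def check_validation_setup(production_feed, preproduction_feed, ce_status):
    """Return a list of violations of the SAM validation requirements.

    production_feed / preproduction_feed: sets of service host names.
    ce_status: dict mapping CE host name -> CEStatus string
               (storage elements are simply absent from it).
    """
    problems = []
    # A service under validation must not appear in the standard VO feed.
    for host in sorted(production_feed & preproduction_feed):
        problems.append(f"{host}: listed in both VO feeds")
    # Test CEs must not advertise CEStatus 'Production'.
    for host in sorted(preproduction_feed):
        if ce_status.get(host) == "Production":
            problems.append(f"{host}: CEStatus is 'Production'")
    return problems


# Example with made-up hosts: the test CE wrongly appears in both feeds.
issues = check_validation_setup(
    production_feed={"ce-prod.example.org", "ce-test.example.org"},
    preproduction_feed={"ce-test.example.org"},
    ce_status={"ce-test.example.org": "Production"},
)
for issue in issues:
    print(issue)
```

An empty result means the two feeds are disjoint and no test CE is published as 'Production'; the GOCDB and BDII registration requirements are not covered by this check.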
To do:
- check what is needed in PhEDEx to test SEs with the CMS SRM SAM tests
Data transfer tests with PhEDEx
- Follow the steps outlined above for the Storage Element
Tests with analysis jobs using HammerCloud (INCOMPLETE)
Requirements:
- If testing an SE, follow the steps outlined above for the Storage Element
- If testing a CE, the HC tests must run only on the test CE
- The test CE must not be tested by "standard" HC tests used for site readiness
- The test CE must not be reachable by normal analysis and production jobs
To do:
- put the test CE in the glidein factory TestBed
- add the following line to crab.cfg:
additional_jdl_parameters = +JOB_Is_ITB=True;
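For sites that generate crab.cfg files automatically, the line above can be added programmatically. The sketch below uses Python's standard `configparser`; the helper name `add_itb_flag` is hypothetical, and placing the option in a `[GRID]` section is an assumption, since this page does not say which section the line belongs to.

```python
import configparser
import io


def add_itb_flag(cfg_text, section="GRID"):
    """Return cfg_text with the ITB flag set in the given section.

    The default section name is an assumption; adjust it to wherever
    your crab.cfg expects additional_jdl_parameters.
    """
    # interpolation=None keeps any '%' in existing values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "additional_jdl_parameters", "+JOB_Is_ITB=True;")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Example with a made-up minimal configuration.
print(add_itb_flag("[CRAB]\njobtype = cmssw\n"))
```

Note that `configparser` drops comments and normalises spacing when it rewrites a file, so for hand-maintained configurations it may be preferable to add the line manually.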
Useful links
WLCG Middleware readiness working group
-- JoseHernandez - 31 Jan 2014