dCache - Software Verification and Validation Plan
Service/Component Description
Service Reference Card - dCache Server
Deployment scenarios
The following deployment scenario depicts the recommended way of deploying dCache.
Configuration dCache SE DESY
However, if the instance has a large number of pools, it is advisable to start the gridFTP and dCap doors on the pools. For transfers through doors, the pool-to-door ratio should be 1:1 for each door type (dcap, gridFTP).
Functionality tests
Pre-Release Testing:
- When code is committed, commit hooks automatically build dCache, dcap and the srm client. On error, an email is sent to the developer who made the last commit.
- The yum repositories are updated with the last successful build, and the RPMs are installed automatically on some machines.
- Before the dCache server RPMs are published on our web page, we do an early rollout and test the release in the NDGF production system.
Hudson runs the tests for newly committed code on machines that have the latest dCache test release installed. The test suite contains:
- Integration tests: G1 tests, G2 functional tests (lcg-tools; srmcp, lcg-cp, gsiftp, dcap, gsidcap), availability of service ports, space manager
- S2 tests
- Parallel S2 tests: the parallel S2 tests (basic and usecase suites) run on 7 clients and one server every hour, for two days for a patch release and for 14 days for a minor release.
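As a rough illustration of the schedule above, the number of hourly parallel S2 runs per release type can be worked out as follows. This is only a sketch: the names `TEST_PERIOD_DAYS` and `hourly_runs` are hypothetical and not part of the dCache test suite.

```python
# Hypothetical helper, not part of the dCache test suite.
# The durations (2 days for a patch release, 14 days for a minor release)
# are taken from the text; one run per hour is assumed throughout.
TEST_PERIOD_DAYS = {"patch": 2, "minor": 14}


def hourly_runs(release_type: str) -> int:
    """Return how many hourly runs fit into the test period of a release type."""
    return TEST_PERIOD_DAYS[release_type] * 24
```

Under these assumptions a patch release accumulates 48 runs and a minor release 336 runs before the parallel S2 phase ends.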
Features/Scenarios to be tested
EMI-1 dCache features can be found here:
JRA1.1 - EMI-1 Development and Test Plans
dCache Features:
Supported protocols:
| Protocol | Functional Test |
| NFSv4.1 (pNFS) | CTHON 04, PY-Test |
| SRM 2.2 | G1, G2, S2 |
| WebDav (http, https) | not tested |
| dcap, gsidcap | G1, G2 |
| gsiFTP | G1, G2 |
| kerberosFTP | not tested |
| weakFTP | not tested |
| xrootd | not tested |
Other features, some of which are tested implicitly:
- Chimera, PNFS
- Pin manager
- Space manager
- srm-get-space-tokens
- srmcp, srmrm, srmcp space tokens?
- Pool to pool migration
- GLUE info provider
- ACLs
- Webadmin interface
- Command line admin interface
- Resilience with the Replica Manager
- gPlazma authentication
- Stage protection
- Tape protection
'Feature Summary' (implemented/not implemented)
| Status | Feature | What it does | Links |
| implemented | NFSv4.1 (pNFS) server | Unified namespace allowing striping and separating data and metadata. | |
| implemented | SRM server | Server provides access to SRM clients. | |
| implemented | WebDav (http, https) | WebDAV | |
| implemented | dcap, gsidcap | File operation via dcap door | |
| implemented | gsiFTP | File operation via gridFTP door | |
| implemented | kerberosFTP | File operation via kerberosFTP door | |
| implemented | weakFTP | File operation via FTP door | |
| implemented | xrootd | File operation via xrootd door | |
| implemented | Chimera, PNFS | Chimera is the recommended database driven namespace provider for providing a single rooted file system view. Operation ls, mkdir, mv, cp, cat are possible. PNFS is the hash table based approach, but does not permit I/O operations | |
| implicit in G2 | Pin manager | Preventing files from being deleted from pools when they run out of space | |
| implicit in G2 | Space manager | srm-get-space-tokens, srmcp, srmrm, srmcp space tokens? | |
| implicit in G2 | Pool to pool migration | Migration of classic SE ( nfs, disk ) to dCache | |
| implicit in G2 | GLUE info provider | Publishing resource information for site level and top level BDIIs. | GLUE info provider |
| not implemented | ACLs | Permission handler, NFSv4 ACL setting user or group permissions for file-operations. | dCache book - ACLs in dCache |
| not implemented | Webadmin interface | Replacement for the httpd service providing site information on a webpage plus some admin functionality | dCache Webadmin-Interface |
| not implemented | Command line admin interface | Access dCache site functionality from command line. | dCache book - The Admin Interface |
| implicit in G2 | Resilience with the Replica Manager | | |
| implicit in G2 | gPlazma authentication | Authentication, Authorization, Mapping (DN <--> UID/GID), Blacklisting | |
| not implemented | Stage protection | Protective mechanism only allowing production users (user groups) to trigger a tape operation, which prevents very expensive tape reads. | |
Normal workflow - correct input
Hudson tests:
G1: Test_Suite_Functional_CTB_cork
dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_srm
- srmcp [client to SE (http)]
- srm-get-metadata (from SE)
- srmcp [SE to client], comparing MD5 sums of the original and the retrieved file
- srm-advisory-delete
dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_GlobusUrlCopyGridftp
- edg-gridftp-exists (on file that was definitely deleted)
- globus-url-copy [client to SE, gsiFTP]
- edg-gridftp-exists
- globus-url-copy [SE to client, gsiFTP]
- edg-gridftp-rm
- edg-gridftp-exists
- checking MD5 sums of the original and the file copied back
dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_GlobusUrlCopyStreams (parallel streams = 1, 5, 9, 13, 17, 21)
- edg-gridftp-exists (test_GlobusUrlCopy.lucan.502.12326.20110211044446.0 does not exist)
- globus-url-copy -p 1 [client to SE, gsiFTP]
- edg-gridftp-exists
- globus-url-copy [SE to client, gsiFTP]
- edg-gridftp-rm
- edg-gridftp-exists
- checking MD5 sums of the original and the file copied back
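The stream counts used by test_GlobusUrlCopyStreams (1, 5, 9, 13, 17, 21) form a regular sequence, so the upload command lines could be generated rather than written out. The sketch below only illustrates that idea; the helper name `upload_commands` is hypothetical, the URLs are supplied by the caller, and only the `-p` option mentioned in the list above is used.

```python
# Hypothetical sketch, not the actual dCacheTestSuite.py code.
# Stream counts as listed in the text: 1, 5, 9, 13, 17, 21.
STREAM_COUNTS = list(range(1, 22, 4))


def upload_commands(local_url, remote_url):
    """Build one globus-url-copy argument list per parallel stream count."""
    return [
        ["globus-url-copy", "-p", str(streams), local_url, remote_url]
        for streams in STREAM_COUNTS
    ]
```

Each argument list could then be handed to a process runner, one transfer per stream count.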
dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_edgGridftpLsRoot
- globus-url-copy [client to SE, gsiFTP]
- globus-url-copy -nodcau (-no-data-channel-authentication) [SE to client, gsiFTP]
- edg-gridftp-rm
- checking MD5 sums of the original and the file copied back
dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_srmMkdir
- srmcp -webservice_protocol=http [client to SE, gsiFTP]
- srmcp -webservice_protocol=http [SE to client, gsiFTP]
- checking MD5 sums of the original and the file copied back
- srmrm
- srmrmdir
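Most G1 tests above end by comparing the MD5 sum of the original file with that of the file copied back from the SE. A minimal sketch of such a comparison, assuming both files are available locally, might look like this; `md5sum` and `round_trip_ok` are hypothetical names, not functions of the test suite.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the empty bytes object returned at EOF
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(original_path, copied_back_path):
    """True when the file copied back from the SE matches the original."""
    return md5sum(original_path) == md5sum(copied_back_path)
```

Reading in chunks keeps memory use constant even for large test files.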
G2:
- Executing DM-lcg-alias.sh
- Executing DM-lcg-cp-gsiftp.sh
- Executing DM-lcg-cp.sh
- Executing DM-lcg-cr-gsiftp.sh
- Executing DM-lcg-cr.sh
- Executing DM-lcg-list.sh
- Executing DM-lcg-ls.sh
- Executing DM-lcg-rf.sh
- Executing DM-lcg-rep.sh
- Executing DM-lcg-rep.sh
- Executing DM-lcg-get-checksum.sh
S2:
- Executing SRMv2-get-SURLs
- Executing SRMv2-ls-dir
- Executing SRMv2-put
- Executing SRMv2-ls
- Executing SRMv2-gt
- Executing SRMv2-get
- Executing SRMv2-del
If the executed commands return successfully, the tests pass.
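The rule above can be sketched as a small runner. The sketch assumes that "return successfully" means an exit status of zero; the function names and the shape of the `commands` mapping are hypothetical and do not describe how Hudson actually invokes the scripts.

```python
import subprocess
import sys


def run_suite(commands):
    """Run each command and map its label to True (exit status 0) or False."""
    results = {}
    for label, argv in commands.items():
        completed = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        results[label] = completed.returncode == 0
    return results


def suite_passed(results):
    """The suite passes only when every command returned successfully."""
    return all(results.values())


# Stand-in commands for illustration; a real run would list the test scripts.
DEMO_COMMANDS = {
    "succeeds": [sys.executable, "-c", "raise SystemExit(0)"],
    "fails": [sys.executable, "-c", "raise SystemExit(3)"],
}
```

With `DEMO_COMMANDS`, `run_suite` reports one success and one failure, so `suite_passed` returns False.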
Further tests, some of them redundant, are run by the release manager:
SRMcp
Testcase testV2Copy:
- srmcp -2 [client to SE]
- srmcp -2 [SE to client]
- compare checksum original vs. copied back
- srmrm
Testcase testV2CopyMD5:
- srmcp -2 -cksm_type=MD5 [client to SE] with MD5 checksum
- srmcp -2 [SE to client]
- compare checksum original vs. copied back
- srmrm
Testcase testV1Copy:
- srmcp -1 [client to SE]
- srmcp -1 [SE to client]
- compare checksum original vs. copied back
- srmrm
Testcase testv2CopyBadChecksum:
- srmcp -1 -retry_num=1 [test for failure client to SE]
- somehow a false checksum is transmitted by using /proc/uptime
- srmrm
Testcase testv2CopyBadChecksumMD5
- srmcp -1 -retry_num=1 [test for failure client to SE]
- false checksum is transmitted by using /proc/uptime
- srmrm
Testcase testV2CopyDirNotExist:
- srmcp -2 [client to SE]
- srmcp -2 [SE to client]
- compare checksum original vs. copied back
Testcase testSrmLsValidPAth:
Testcase testSrmLsInValidPAth:
- srmls [test for failure with invalid path]
Testcase testSrmChangePerm
- srmmkdir [remote url]
- srm-set-permissions
Testcase testSrmmvIntoSame:
- srmcp [localUrl to remoteSourceUrl]
- srmmv [remoteSourceUrl to remoteTargetUrl]
LCGcp
Testcase testLcgCp:
- lcg-cp [srmv2, client to SE]
- lcg-cp [srmv2, SE to client]
- compare checksum original vs. copied back
- srmrm
Testcase testLcgCpIntoNonExistDir
- lcg-cp [srmv2, client to SE]
Testcase testLcgGtGsiFtp
- lcg-cp [srmv2, client to SE]
- lcg-gt [get transfer URL of remoteURL]
- srmrm
Testcase testLcgLsFile
- lcg-cp [srmv2, client to SE]
- lcg-ls [remoteURL of file]
- srmrm
Testcase testLcgLsDir
- lcg-ls [remoteURL of directory]
gsiFTP
Testcase testNoDCAU
- globus-url-copy -nodcau [client to SE; -nodcau turns off data channel authentication for FTP transfers]
Testcase testGsiftpSingleStream
- globus-url-copy -p 1 [client to SE]
- globus-url-copy -p 1 [SE to client]
- compare checksum original vs. copied back
Testcase testGsiftpMultipleStreams
- globus-url-copy -p 10 [client to SE]
- globus-url-copy -p 10 [SE to client]
- compare checksum original vs. copied back
Testcase testLsOfNonPnfsPath
- edg-gridftp-ls [test for failure on a non-pNFS directory: 'gsiftp://.../root']
Testcase testLsOfTestBase
- edg-gridftp-ls [normal pNFS directory: 'gsiftp://%s/%s']
Testcase testLsNonExistingPath
- edg-gridftp-ls [test for failure on a non-existing path]
DCAP
Testcase testGsiDccp
- dccp [client to SE]
- dccp [SE to client]
Testcase testPrestageOnDir
- dccp -P [prestaging of a directory]
serviceports.py
Tests whether the relevant ports are reachable.
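A reachability check of this kind can be sketched with a plain TCP connect. This is not the actual serviceports.py; the function names are hypothetical, and the hosts and ports to probe are left to the caller because the text does not list them.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def unreachable_ports(endpoints, timeout=3.0):
    """Return the (host, port) pairs from endpoints that cannot be reached."""
    return [
        (host, port)
        for host, port in endpoints
        if not port_reachable(host, port, timeout)
    ]
```

An empty list from `unreachable_ports` would correspond to a passed service port test. A successful connect only shows that something is listening, not that the door behind the port works.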
Spacemanager
Testcase testGetSpaceTokens
- srm-get-space-tokens [surl]
Testcase testPutRemoved - This test checks whether the space manager has deleted the file from its database.
- srmcp -2 [client to SE (remoteURL)]
- srmrm [delete remoteURL]
- srmcp -2 -retry_num=1 [client to SE (remoteURL)]
Authentication/Authorization
- voms-proxy-init --voms desy:/desy/Role=production
- srm-get-space-tokens self.surlBase [token1]
- voms-proxy-init
- srm-get-space-tokens self.surlBase [token2]
- compare token1 and token2 [failure if equal]
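The comparison in the last step can be sketched as follows. The output format of srm-get-space-tokens is not described here, so the sketch treats each non-empty output line as one opaque entry; this is an assumption, and both function names are hypothetical.

```python
def token_set(command_output):
    """Treat every non-empty line of the command output as one opaque entry.

    Assumption: the real output format is not specified in this plan.
    """
    return frozenset(
        line.strip() for line in command_output.splitlines() if line.strip()
    )


def authorization_test_passes(production_output, plain_output):
    """The test fails when both proxies see exactly the same tokens."""
    return token_set(production_output) != token_set(plain_output)
```

Comparing sets rather than raw strings keeps the check independent of line order and surrounding whitespace.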
Pass/Fail Criteria
A release fails testing when any of the G1, G2 or S2 tests fails. The release manager tests pass if every test returns its expected result (pass or fail), since some tests check for incorrect behavior.
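These two criteria can be written down as simple predicates. The sketch below is illustrative only; the function names and the data shapes (a mapping of suite name to result, and pairs of expected and actual outcome) are assumptions.

```python
REQUIRED_SUITES = ("G1", "G2", "S2")


def release_passes(suite_results):
    """suite_results maps a suite name to True (passed) or False (failed).

    A missing suite counts as failed in this sketch.
    """
    return all(suite_results.get(name, False) for name in REQUIRED_SUITES)


def manager_tests_pass(outcomes):
    """outcomes is a list of (expected_success, actual_success) pairs.

    A 'test for failure' has expected_success False, so it counts as
    correct only when the command actually failed.
    """
    return all(expected == actual for expected, actual in outcomes)
```

For example, `manager_tests_pass([(True, True), (False, False)])` is True, while an error test whose command unexpectedly succeeds, `(False, True)`, makes the result False.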
Error workflow - erroneous input
Some tests check for undesired behavior. All tests are listed above; the error tests are marked 'test for failure'.
Pass/Fail Criteria
These tests fail when the erroneous input does not cause the operation under test to fail.
'Feature Summary'
Description and explanation for not being included in the current test plan
kerberosFTP not tested
weakFTP not tested
xrootd not tested
Performance tests
Describe the strategy to implement performance tests, the main system variables to control and the values to use.
Scalability tests
Describe the strategy to implement scalability tests, what are the variables that should be studied and the values to use.
--
ChristianBernardt - 10-Feb-2011