dCache - Software Verification and Validation Plan

Service/Component Description

Service Reference Card - dCache Server

Deployment scenarios

The following deployment scenario depicts the recommended way of deploying dCache.

[Figure: deploymentscenario dCache.png. Configuration dCache SE DESY]

However, if the instance has a large number of pools, it is advisable to start the gridFTP and dCap doors on the pools themselves. For transfers through doors, the pool-to-door ratio should be 1:1 for each door type (dcap, gridFTP).

Functionality tests

Pre-Release Testing:

  • When code is committed, commit hooks automatically build dCache, dcap and the srm client. On error, an email is sent to the developer who made the last commit.
  • After the last successful build, the yum repositories are updated and the rpms are automatically installed on some machines.
  • Before the dCache server rpms are put on our webpage, we do an early rollout and test the release in the NDGF production system.

To test the newly committed code, Hudson runs the tests on machines that have the latest dCache test release installed. The test suite contains:

  • Integration Tests: G1 tests, G2 functional tests (lcg-tools) (srmcp, lcg-cp, gsiftp, dcap, gsidcap), availability of service ports, space manager
  • S2 tests
  • Parallel S2 tests: The parallel S2 (basic and usecase suites) tests run on 7 clients and one server every hour for two days if it is a patch release and for 14 days if it is a minor release.
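The schedule above implies a fixed number of runs per release type. The following is a minimal sketch of that arithmetic; the names `DURATION_DAYS` and `s2_runs` are hypothetical, and it assumes each of the 7 clients runs the suite exactly once per hour for the whole period.

```python
# Hypothetical helper: counts hourly parallel S2 runs for a release type.
# Durations come from the text: 2 days for a patch release, 14 for a minor one.
HOURS_PER_DAY = 24
CLIENTS = 7
DURATION_DAYS = {"patch": 2, "minor": 14}


def s2_runs(release_type):
    """Return (runs per client, total runs across all clients)."""
    days = DURATION_DAYS[release_type]
    per_client = days * HOURS_PER_DAY
    return per_client, per_client * CLIENTS


for kind in DURATION_DAYS:
    per_client, total = s2_runs(kind)
    print(f"{kind}: {per_client} runs per client, {total} in total")
```

Under these assumptions a patch release gets 48 runs per client and a minor release 336.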

Features/Scenarios to be tested

EMI-1 dCache features: these are listed in JRA1.1 - EMI-1 Development and Test Plans.

dCache Features:

supported protocols:

| Protocol | Functional Test |
| NFSv4.1 (pNFS) | CTHON 04, PY-Test |
| SRM 2.2 | G1, G2, S2 |
| WebDav (http, https) | not tested |
| dcap, gsidcap | G1, G2 |
| gsiFTP | G1, G2 |
| kerberosFTP | not tested |
| weakFTP | not tested |
| xrootd | not tested |

other features, which are partially implicitly tested:

  • Chimera, PNFS
  • Pin manager
  • Space manager
    • srm-get-space-tokens
    • srmcp, srmrm, srmcp space tokens?
  • Pool to pool migration
  • GLUE info provider
  • ACLs
  • Webadmin interface
  • Command line admin interface
  • Resilience with the Replica Manager
  • gPlazma authentication
  • Stage protection
  • Tape protection

'Feature Summary' (implemented/not implemented)

| Status | Feature | What it does | Links |
| implemented | NFSv4.1 (pNFS) server | Unified namespace allowing striping and separating data and metadata. | |
| implemented | SRM server | Server provides access to SRM clients. | |
| implemented | WebDav (http, https) | WebDAV | |
| implemented | dcap, gsidcap | File operation via dcap door | |
| implemented | gsiFTP | File operation via gridFTP door | |
| implemented | kerberosFTP | File operation via kerberosFTP door | |
| implemented | weakFTP | File operation via FTP door | |
| implemented | xrootd | File operation via xrootd door | |
| implemented | Chimera, PNFS | Chimera is the recommended database driven namespace provider for providing a single rooted file system view. Operation ls, mkdir, mv, cp, cat are possible. PNFS is the hash table based approach, but does not permit I/O operations | |
| implicit in G2 | Pin manager | Preventing files from being deleted from pools when they run out of space | |
| implicit in G2 | Space manager | srm-get-space-tokens, srmcp, srmrm, srmcp space tokens? | |
| implicit in G2 | Pool to pool migration | Migration of classic SE ( nfs, disk ) to dCache | |
| implicit in G2 | GLUE info provider | Publishing resource information for site level and top level BDIIs. | GLUE info provider |
| not implemented | ACLs | Permission handler, NFSv4 ACL setting user or group permissions for file-operations. | dCache book - ACLs in dCache |
| not implemented | Webadmin interface | Replacement for the httpd service providing site information on a webpage plus some admin functionality | dCache Webadmin-Interface |
| not implemented | Command line admin interface | Access dCache site functionality from command line. | dCache book - The Admin Interface |
| implicit in G2 | Resilience with the Replica Manager | | |
| implicit in G2 | gPlazma authentication | Authentication, Authorization, Mapping (DN <--> UID/GID), Blacklisting | |
| not implemented | Stage protection | Protective mechanism only allowing production users (user groups) to trigger a tape operation, which prevents very expensive tape reads. | |

Normal workflow - correct input

Hudson tests:

G1: Test_Suite_Functional_CTB_cork

dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_srm

  • srmcp [client to SE (http)]
  • srm-get-metadata (from SE)
  • srmcp [SE to client], comparing the MD5 checksums of the original and the retrieved file
  • srm-advisory-delete
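The checksum comparison used in this and the following round-trip tests can be sketched as below. This is an illustration only, not the code in dCacheTestSuite.py; the function names `md5sum` and `round_trip_ok` are hypothetical.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(original_path, copied_back_path):
    """True if the file copied back from the SE matches the original."""
    return md5sum(original_path) == md5sum(copied_back_path)
```

Reading in chunks keeps memory use constant even for large test files.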

dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_GlobusUrlCopyGridftp

  • edg-gridftp-exists (on file that was definitely deleted)
  • globus-url-copy [client to SE, gsiFTP]
  • edg-gridftp-exists
  • globus-url-copy [SE to client, gsiFTP]
  • edg-gridftp-rm
  • edg-gridftp-exists
  • checking the MD5 checksums of the original and the file copied back

dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_GlobusUrlCopyStreams (parallel streams = 1, 5, 9, 13, 17, 21)

  • edg-gridftp-exists (test_GlobusUrlCopy.lucan.502.12326.20110211044446.0 does not exist)
  • globus-url-copy -p 1 [client to SE, gsiFTP]
  • edg-gridftp-exists
  • globus-url-copy [SE to client, gsiFTP]
  • edg-gridftp-rm
  • edg-gridftp-exists
  • checking the MD5 checksums of the original and the file copied back
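The stream counts listed for test_GlobusUrlCopyStreams form a simple arithmetic sequence. A minimal sketch of how the upload command lines could be generated is shown here; the function name `copy_commands` and the URLs in the usage note are hypothetical, and the commands are only built, not executed.

```python
# Stream counts named in the text: 1, 5, 9, 13, 17, 21.
STREAM_COUNTS = list(range(1, 22, 4))


def copy_commands(source_url, dest_url):
    """Build one globus-url-copy command line per stream count."""
    return [
        ["globus-url-copy", "-p", str(streams), source_url, dest_url]
        for streams in STREAM_COUNTS
    ]
```

For example, `copy_commands("file:///tmp/testfile", "gsiftp://se.example.org/path/testfile")` yields six argument lists that differ only in the value passed to `-p`.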

dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_edgGridftpLsRoot

  • globus-url-copy [client to SE, gsiFTP]
  • globus-url-copy -nodcau (-no-data-channel-authentication) [SE to client, gsiFTP]
  • edg-gridftp-rm
  • checking the MD5 checksums of the original and the file copied back

dCacheTestSuite.py -T cork.desy.de -d desy.de -v dteam -q 0 -r test_srmMkdir

  • srmcp -webservice_protocol=http [client to SE, gsiFTP]
  • srmcp -webservice_protocol=http [SE to client, gsiFTP]
  • checking the MD5 checksums of the original and the file copied back
  • srmrm
  • srmrmdir
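The five G1 invocations above differ only in the test name passed to `-r`. A minimal sketch of building them is given here. The option values are copied from the text, but their meaning is not stated there, so the parameter names `target`, `domain` and `vo` are guesses, and `g1_command` is a hypothetical helper.

```python
# Test names copied from the invocations listed above.
G1_TESTS = [
    "test_srm",
    "test_GlobusUrlCopyGridftp",
    "test_GlobusUrlCopyStreams",
    "test_edgGridftpLsRoot",
    "test_srmMkdir",
]


def g1_command(test_name, target="cork.desy.de", domain="desy.de", vo="dteam"):
    """Build the argument list for one G1 test run (parameter names are guesses)."""
    return [
        "dCacheTestSuite.py",
        "-T", target,
        "-d", domain,
        "-v", vo,
        "-q", "0",
        "-r", test_name,
    ]


for name in G1_TESTS:
    print(" ".join(g1_command(name)))
```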

G2:

  • Executing DM-lcg-alias.sh
  • Executing DM-lcg-cp-gsiftp.sh
  • Executing DM-lcg-cp.sh
  • Executing DM-lcg-cr-gsiftp.sh
  • Executing DM-lcg-cr.sh
  • Executing DM-lcg-list.sh
  • Executing DM-lcg-ls.sh
  • Executing DM-lcg-rf.sh
  • Executing DM-lcg-rep.sh
  • Executing DM-lcg-get-checksum.sh

S2:

  • Executing SRMv2-get-SURLs
  • Executing SRMv2-ls-dir
  • Executing SRMv2-put
  • Executing SRMv2-ls
  • Executing SRMv2-gt
  • Executing SRMv2-get
  • Executing SRMv2-del

If the executed commands return successfully, the tests pass.
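The pass criterion for these suites is the exit status of each command. A minimal sketch of such a check follows, assuming the usual convention that exit status 0 means success; `command_passes` and `suite_passes` are hypothetical names, and this is not the Hudson job configuration itself.

```python
import subprocess
import sys


def command_passes(argv):
    """Run a command and report whether it exited with status 0."""
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def suite_passes(commands):
    """A suite passes only if every command returns successfully.

    all() stops at the first failing command, so later ones are not run.
    """
    return all(command_passes(argv) for argv in commands)


# Stand-in commands so the sketch runs without grid tools installed.
OK = [sys.executable, "-c", "pass"]
FAIL = [sys.executable, "-c", "raise SystemExit(1)"]
print(suite_passes([OK, OK]), suite_passes([OK, FAIL]))
```

In a real run the stand-ins would be replaced by the G2 or S2 scripts listed above.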

More tests, partially redundant, are done by the release manager:

SRMcp


Testcase testV2Copy:
  • srmcp -2 [client to SE]
  • srmcp -2 [SE to client]
  • compare checksum original vs. copied back
  • srmrm
Testcase testV2CopyMD5:
  • srmcp -2 -cksm_type=MD5[client to SE] with MD5 checksum
  • srmcp -2 [SE to client]
  • compare checksum original vs. copied back
  • srmrm
Testcase testV1Copy:
  • srmcp -1 [client to SE]
  • srmcp -1 [SE to client]
  • compare checksum original vs. copied back
  • srmrm
Testcase testv2CopyBadChecksum:
  • srmcp -1 -retry_num=1 [test for failure client to SE]
  • somehow a false checksum is transmitted by using /proc/uptime
  • srmrm
Testcase testv2CopyBadChecksumMD5
  • srmcp -1 -retry_num=1 [test for failure client to SE]
  • false checksum is transmitted by using /proc/uptime
  • srmrm
Testcase testV2CopyDirNotExist:
  • srmcp -2 [client to SE]
  • srmcp -2 [SE to client]
  • compare checksum original vs. copied back
Testcase testSrmLsValidPAth:
  • srmls [with valid path]
Testcase testSrmLsInValidPAth:
  • srmls [test for failure with invalid path]
Testcase testSrmChangePerm
  • srmmkdir [remote url]
  • srm-set-permissions
Testcase testSrmmvIntoSame:
  • srmcp [localUrl to remoteSourceUrl]
  • srmmv [remoteSourceUrl to remoteTargetUrl]

LCGcp


Testcase testLcgCp:
  • lcg-cp [srmv2, client to SE]
  • lcg-cp [srmv2, SE to client]
  • compare checksum original vs. copied back
  • srmrm
Testcase testLcgCpIntoNonExistDir
  • lcg-cp [srmv2, client to SE]
Testcase testLcgGtGsiFtp
  • lcg-cp [srmv2, client to SE]
  • lcg-gt [get transfer URL of remoteURL]
  • srmrm
Testcase testLcgLsFile
  • lcg-cp [srmv2, client to SE]
  • lcg-ls [remoteURL of file]
  • srmrm
Testcase testLcgLsDir
  • lcg-ls [remoteURL of directory]

gsiFTP


Testcase testNoDCAU
  • globus-url-copy -nodcau [client to SE; -nodcau turns off data channel authentication for FTP transfers]
Testcase testGsiftpSingleStream
  • globus-url-copy -p 1 [client to SE]
  • globus-url-copy -p 1 [SE to client]
  • compare checksum original vs. copied back
Testcase testGsiftpMultipleStreams
  • globus-url-copy -p 10 [client to SE]
  • globus-url-copy -p 10 [SE to client]
  • compare checksum original vs. copied back
Testcase testLsOfNonPnfsPath
  • edg-gridftp-ls [test for failure on a non-pNFS directory: 'gsiftp://.../root']
Testcase testLsOfTestBase
  • edg-gridftp-ls [normal pNFS directory: 'gsiftp://%s/%s']
Testcase testLsNonExistingPath
  • edg-gridftp-ls [test for failure on a non-existing path]

DCAP


Testcase testGsiDccp
  • dccp [client to SE]
  • dccp [SE to client]
Testcase testPrestageOnDir
  • dccp -P [prestaging of a directory]

serviceports.py


Testing if the relevant ports are reachable.
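A reachability check of this kind can be sketched with a plain TCP connection attempt. This is not the content of serviceports.py; the helper names are hypothetical, and the port numbers in `ASSUMED_PORTS` are commonly used dCache defaults that do not come from this document.

```python
import socket

# Assumption: typical default ports, not taken from this test plan.
ASSUMED_PORTS = {"srm": 8443, "gsiftp": 2811, "dcap": 22125, "gsidcap": 22128}


def port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def check_ports(host, ports=ASSUMED_PORTS, timeout=5.0):
    """Map each service name to whether its port is reachable on host."""
    return {
        name: port_reachable(host, port, timeout)
        for name, port in ports.items()
    }
```

A successful TCP connect only shows that something is listening; it does not verify that the service behind the port works.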

Spacemanager


Testcase testGetSpaceTokens
  • srm-get-space-tokens [surl]

Testcase testPutRemoved - This test is done to see whether the space manager has deleted the file from the database.

  • srmcp -2 [client to SE (remoteURL)]
  • srmrm [delete remoteURL]
  • srmcp -2 -retry_num=1 [client to SE (remoteURL)]

Authentication/Authorization


  • voms-proxy-init --voms desy:/desy/Role=production
  • srm-get-space-tokens self.surlBase [token1]
  • voms-proxy-init
  • srm-get-space-tokens self.surlBase [token2]
  • compare token1 and token2 [failure if equal]
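The final comparison step above can be sketched as follows. The function name `tokens_differ` is hypothetical, and treating the token lists as unordered sets is an assumption; the source only says the two results are compared.

```python
def tokens_differ(production_tokens, plain_tokens):
    """The check passes only when the two proxies see different space tokens.

    Equal results would mean the production role made no difference,
    which the text above counts as a failure.
    """
    return set(production_tokens) != set(plain_tokens)
```

Here `production_tokens` stands for the result obtained with the `/desy/Role=production` proxy and `plain_tokens` for the result obtained with the plain proxy.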

Pass/Fail Criteria
A release fails testing when any of the tests (G1, G2, S2) fails. The release manager tests pass if every test returns its expected result (pass or fail), since some tests check for incorrect behavior.
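Because some release manager tests are expected to fail, the verdict depends on comparing each outcome with its expected outcome. A minimal sketch of that rule follows; the function names and the tuple layout are hypothetical.

```python
def outcome_matches(expect_failure, command_succeeded):
    """A test is fine when the command failed exactly if failure was expected."""
    return command_succeeded != expect_failure


def release_manager_verdict(results):
    """Evaluate (name, expect_failure, command_succeeded) tuples.

    Returns (passed, names of tests whose outcome was not the expected one).
    """
    unexpected = [
        name
        for name, expect_failure, succeeded in results
        if not outcome_matches(expect_failure, succeeded)
    ]
    return len(unexpected) == 0, unexpected
```

For instance, testSrmLsInValidPAth would be recorded with `expect_failure=True`, so a successful srmls on the invalid path counts against the release.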

Error workflow - erroneous input
Some tests check for undesired behavior. All tests are listed above, and the error tests are marked as 'test for failure'.

Pass/Fail Criteria
These tests fail when the erroneous input does not cause the command under test to fail.

'Feature Summary'

Description and explanation for not being included in the current test plan:

  • kerberosFTP: not tested
  • weakFTP: not tested
  • xrootd

Performance tests

Describe the strategy to implement performance tests, the main system variables to control and the values to use.

Scalability tests

Describe the strategy to implement scalability tests, what are the variables that should be studied and the values to use.

-- ChristianBernardt - 10-Feb-2011

Topic revision: r10 - 2012-02-05 - DoinaCristinaAiftimiei
 