Opening and reading test

The purpose

The aim of these tests is to evaluate how well a site opens and reads files through the regional xrootd redirector before it is moved to the production storage federation (AAA), and to check the site readiness weekly. The target of the opening test is reached with a minimum opening rate of 10 Hz and more than 90 simultaneous clients. The target of the reading test is reached with a reading rate of 150 MB/s and 600 simultaneous clients.
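
As a rough illustration, the two targets can be written as simple threshold checks. This is only a sketch: the function names are hypothetical, and the exact comparisons (for example whether 600 clients means "at least 600") are assumptions, since the checks really applied by the plotting scripts are not shown on this page.

```python
# Thresholds quoted in the text above. The comparison operators are assumptions.
OPEN_MIN_RATE_HZ = 10.0     # minimum opening rate
OPEN_MIN_CLIENTS = 90       # "more than 90" simultaneous clients
READ_MIN_RATE_MBS = 150.0   # minimum total reading rate
READ_MIN_CLIENTS = 600      # simultaneous clients (assumed: at least 600)


def opening_target_reached(max_rate_hz, max_clients):
    """Return True if the opening test meets both quoted thresholds."""
    return max_rate_hz >= OPEN_MIN_RATE_HZ and max_clients > OPEN_MIN_CLIENTS


def reading_target_reached(max_rate_mbs, max_clients):
    """Return True if the reading test meets both quoted thresholds."""
    return max_rate_mbs >= READ_MIN_RATE_MBS and max_clients >= READ_MIN_CLIENTS
```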

Weekly tests run via cron on a dedicated condor pool located in Wisconsin. The code for the opening and reading tests was implemented by C. Vuosalo (https://github.com/cvuosalo/xrootd_scaletest). Bash wrappers were written around it to form a test suite that simplifies job creation, submission and the production of the final plots. There is now also a dedicated pool at CERN, although it does not have enough nodes to run 600 jobs at the same time. The test suite is available there and is under test.

Prerequisites

A site is ready to be tested only if it has defined the regular expression for the "xrootd test path" "/store/test/xrootd/site_name/store/" in the site’s trivial file catalog.

vocms0101.cern.ch is the machine where the test suite is installed, under the directory /scratch/FOR_AAA_TEST_AT_CERN. It is also the submitter node for the condor pool, so the operator needs a valid proxy there. For each site the operator has to know the site_name, the phedex_site_name and an estimate of the round-trip time (RTT), which can be determined with the ping command. The list of input files stored at the site is obtained via the Phedex API.

How to Start

On vocms0101.cern.ch, the directory /scratch/FOR_AAA_TEST_AT_CERN/ contains all the scripts needed to run the tests:

automatic_open_run_manual.sh
automatic_read_run_manual.sh

RTT evaluation

The IP address of the xrootd storage server of a site can be discovered using the command

> xrd <redirector> locateall <xrootd test path>
[vocms0101] xrd cms-xrd-transit.cern.ch locateall /store/test/xrootd/T2_RU_ITEP

------------- Location #1
InfoType: kXrdcLocDataServer
CanWrite: false
Location: '194.85.69.169:1095'

Then ping the remote server

[vocms0101] ping 194.85.69.169
PING 194.85.69.169 (194.85.69.169) 56(84) bytes of data.
64 bytes from 194.85.69.169: icmp_seq=1 ttl=58 time=76.6 ms
64 bytes from 194.85.69.169: icmp_seq=2 ttl=58 time=76.7 ms
64 bytes from 194.85.69.169: icmp_seq=3 ttl=58 time=76.6 ms
64 bytes from 194.85.69.169: icmp_seq=4 ttl=58 time=77.2 ms
64 bytes from 194.85.69.169: icmp_seq=5 ttl=58 time=76.8 ms
64 bytes from 194.85.69.169: icmp_seq=6 ttl=58 time=78.0 ms
^C
--- 194.85.69.169 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5089ms
rtt min/avg/max/mdev = 76.688/77.039/78.053/0.542 ms

Take the average time, converted to seconds (here 77.039 ms, i.e. about 0.077 s).
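
The conversion can also be scripted. The following is a minimal sketch, not part of the test suite: it assumes the Linux ping summary format shown above, and the function name is hypothetical.

```python
import re


def average_rtt_seconds(ping_output):
    """Extract the average RTT from a Linux ping summary and return it in seconds."""
    match = re.search(
        r"rtt min/avg/max/mdev = [\d.]+/([\d.]+)/[\d.]+/[\d.]+ ms", ping_output
    )
    if match is None:
        raise ValueError("no 'rtt min/avg/max/mdev' summary line found")
    # The summary reports milliseconds; the test input files use seconds.
    return float(match.group(1)) / 1000.0


summary = "rtt min/avg/max/mdev = 76.688/77.039/78.053/0.542 ms"
print(round(average_rtt_seconds(summary), 3))
```

For the ping output above this gives roughly 0.077, which is the value to put in the RTT column of the input file.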

Input list of files

The list of input files for a site is created using the script getsitefilelist_new_fede.py

> python getsitefilelist_new_fede.py <phedex_site_name> > <site_name>.files

or using the scripts get_sort.sh and get_sort_case.sh, which create the input lists for all the sites provided in a list. These scripts are on vocms0101 under the directories /scratch/FOR_AAA_TEST_AT_CERN and /scratch/FOR_AAA_TEST_AT_CERN/INPUT_FILES_FOR_TEST/

Details about scripts

Opening tests

How to run

/scratch/FOR_AAA_TEST_AT_CERN/automatic_open_run_manual.sh  input_open.txt

The script takes as input the file containing the list of sites to test (input_open.txt).

Each line of the input file contains: cms_site_name phedex_site_name redirector RTT, e.g.

T1_UK_RAL  T1_UK_RAL_Disk xrootd-cms.infn.it 0.12
T2_ES_IFCA T2_ES_IFCA xrootd-cms.infn.it 0.15

Then this script

  • sets the environment for condor
  • loops over the input list and, for each line:
    • obtains the cms_site_name, phedex_site_name, redirector_name and RTT
    • if the RTT isn't defined, uses a default value (0.15 for EU sites, 0.05 for US sites)
    • creates the subdirectory where the results are stored (${cms_site_name}-${redirector}_${date})
    • calls the script /scratch/FOR_AAA_TEST_AT_CERN/script_run_open_test.sh $cms_site_name $phedex_site_name $redirector
    • when script_run_open_test.sh ends (the opening-test jobs have been created and submitted to the condor queue), sleeps for 120 sec and then checks the status of condor every 300 sec
    • finds the schedd IP
    • when the total number of condor jobs has reached the maximum, creates the input_read file (the input for the reading tests).
      It contains the following info: the directory where the results are found, cms_site_name, the number of the ntuple to read (000 by default), the redirector name and the ping value, e.g.
      T2_US_Caltech-xrootd-cms.infn.it_21_06_16 T2_US_Caltech 000 xrootd-cms.infn.it 0.05
      T2_US_MIT-xrootd-cms.infn.it_21_06_16 T2_US_MIT 000 xrootd-cms.infn.it 0.05
      T2_US_Nebraska-xrootd-cms.infn.it_21_06_16 T2_US_Nebraska 000 xrootd-cms.infn.it 0.05
    • calls /scratch/FOR_AAA_TEST_AT_CERN/script_make_open_plots.sh $dir_name $cms_site_name $redirector, which creates the related plots
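
The per-line handling described above can be sketched as follows. This is an illustration only, not the code of automatic_open_run_manual.sh: the function names are hypothetical, the country is assumed to be the second field of the CMS site name, every non-US site is assumed to get the EU default, and the DD_MM_YY date format is inferred from the example directory names.

```python
from datetime import date

DEFAULT_RTT_US = 0.05  # default quoted above for US sites
DEFAULT_RTT_EU = 0.15  # default quoted above for EU sites


def parse_open_line(line):
    """Split one input_open.txt line, filling in a default RTT if it is missing."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected: cms_site_name phedex_site_name redirector [RTT]")
    cms_site, phedex_site, redirector = fields[:3]
    if len(fields) >= 4:
        rtt = float(fields[3])
    else:
        # Assumption: the country code is the second field, e.g. T2_US_MIT -> US.
        parts = cms_site.split("_")
        country = parts[1] if len(parts) > 1 else ""
        rtt = DEFAULT_RTT_US if country == "US" else DEFAULT_RTT_EU
    return cms_site, phedex_site, redirector, rtt


def result_dir(cms_site, redirector, day):
    """Build the results directory name ${cms_site_name}-${redirector}_${date}."""
    # Assumption: the date is written as DD_MM_YY, as in ..._21_06_16.
    return f"{cms_site}-{redirector}_{day.strftime('%d_%m_%y')}"


print(parse_open_line("T1_UK_RAL  T1_UK_RAL_Disk xrootd-cms.infn.it 0.12"))
print(result_dir("T2_US_Caltech", "xrootd-cms.infn.it", date(2016, 6, 21)))
```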

The script /scratch/FOR_AAA_TEST_AT_CERN/script_run_open_test.sh takes as inputs $cms_site_name $phedex_site_name $redirector

  • copies from the directory /scratch/FOR_AAA_TEST_AT_CERN/INPUT_FILES_FOR_TESTS the list of files stored at the site (the list created with the getsitefilelist_new_fede.py script)
  • calls /afs/hep.wisc.edu/cms/sw/AAA/scaletest/make_jobsXrdClwrapper to create 200 opening jobs
    ${make_open_jobs} 200 ${file_list_name} root://${redirector_name}//store/test/xrootd/${cms_site_name}
  • submits the jobs using the ramp-up script
    /scratch/FOR_AAA_TEST_AT_CERN/ramp_up_jobs 10 120 100 >& ramp.log < /dev/null & (up to 100 jobs, ten every 120 sec)
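
To get a feeling for how long the ramp-up takes, the three arguments of ramp_up_jobs (jobs per step, seconds between steps, maximum number of jobs) can be turned into a submission timetable. This is a sketch of the arithmetic only, not the ramp_up_jobs script itself; it assumes the first batch is submitted at time zero, and the function name is hypothetical.

```python
def ramp_schedule(batch_size, interval_sec, max_jobs):
    """Return (elapsed_seconds, total_jobs_submitted) for each ramp-up step."""
    if batch_size <= 0 or interval_sec < 0:
        raise ValueError("batch_size must be positive and interval_sec non-negative")
    schedule = []
    submitted = 0
    elapsed = 0
    while submitted < max_jobs:
        # Never go beyond the requested maximum on the last step.
        submitted = min(submitted + batch_size, max_jobs)
        schedule.append((elapsed, submitted))
        elapsed += interval_sec
    return schedule


# Opening test: ten jobs every 120 sec, up to 100 jobs.
for elapsed, total in ramp_schedule(10, 120, 100):
    print(f"t={elapsed:5d} s  jobs submitted: {total}")
```

Under these assumptions the opening ramp reaches 100 jobs after 10 steps, i.e. 1080 sec after the first submission.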

The script /scratch/FOR_AAA_TEST_AT_CERN/script_make_open_plots.sh takes as inputs $dir_name $cms_site_name $redirector

  • gets some "numbers" for statistics using the /scratch/FOR_AAA_TEST_AT_CERN/numbers.sh script
  • calls make_stat_plots_testfede.py to create the plots
    python /scratch/FOR_AAA_TEST_AT_CERN/make_stat_plots_testfede.py test_files_${dir_name}.txt $length_test 120 $start_test $cms_site_name
    This script provides some info about the test results.
    The line "OPENING TEST RESULT ..." is used to create the summary file for the html page.

Reading tests

How to run

/scratch/FOR_AAA_TEST_AT_CERN/automatic_read_run_manual.sh input_read
The script takes as input the file containing, for each site, the directory where the opening test results are found, the cms_site_name, the number of the ntuple to read (000 by default), the redirector name and the RTT value, e.g.
T2_US_Caltech-xrootd-cms.infn.it_21_06_16 T2_US_Caltech 000 xrootd-cms.infn.it 0.05
T2_US_MIT-xrootd-cms.infn.it_21_06_16 T2_US_MIT 000 xrootd-cms.infn.it 0.05
T2_US_Nebraska-xrootd-cms.infn.it_21_06_16 T2_US_Nebraska 000 xrootd-cms.infn.it 0.05

Then this script

  • sets the environment for condor
  • loops over the input file and, for each line:
    • takes the dir_name, cms_site_name, root_number, redirector_name and ping_value
    • calls the script /scratch/FOR_AAA_TEST_AT_CERN/script_run_read_test.sh $dir_name $cms_site_name $root_number $redirector to create and submit the reading-test jobs
    • when script_run_read_test.sh ends, sleeps for 240 sec and then checks the status of condor every 600 sec
    • finds the schedd IP
    • calls /scratch/FOR_AAA_TEST_AT_CERN/script_make_read_plots.sh $dir_name $cms_site_name $ping_value $root_number $redirector, which creates the related plots

The script /scratch/FOR_AAA_TEST_AT_CERN/script_run_read_test.sh takes as inputs $dir_name $cms_site_name $root_number $redirector

  • splits the full list of files stored at the site into short lists of 1000 files each, and uses the "000" list as input for the reading test
  • calls /afs/hep.wisc.edu/cms/sw/AAA/scaletest/mkrdjobXrdClwrapper.sh to create 999 jobs
    ${make_run_jobs} 999 ${dir_name}_readfiles${file_number} root://$redirector_name//store/test/xrootd/${cms_site_name}
  • submits the jobs using the ramp-up script
    /scratch/FOR_AAA_TEST_AT_CERN/ramp_up_jobs 50 180 800 >& ramp.log < /dev/null & (up to 800 jobs, 50 every 180 sec)
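
The splitting step can be pictured with a short sketch. It is not the code used by script_run_read_test.sh: the function name is hypothetical, and the three-digit zero-padded numbering of the lists after "000" is an assumption based on the default value quoted above.

```python
def split_file_list(paths, chunk_size=1000):
    """Split a site's file list into numbered short lists ("000", "001", ...)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = {}
    for start in range(0, len(paths), chunk_size):
        # Assumption: short lists are labelled with a zero-padded 3-digit index.
        label = f"{start // chunk_size:03d}"
        chunks[label] = paths[start:start + chunk_size]
    return chunks


# Made-up file names, only to show the shape of the result.
all_files = [f"/store/example/file_{i}.root" for i in range(2500)]
short_lists = split_file_list(all_files)
print(list(short_lists))           # labels of the short lists
print(len(short_lists["000"]))     # size of the list used by default
```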

The script /scratch/FOR_AAA_TEST_AT_CERN/script_make_read_plots.sh takes as inputs $dir_name $cms_site_name $ping_value $root_number $redirector

  • gets the length of the test using the script /scratch/FOR_AAA_TEST_AT_CERN/get_times
  • calls readjobsplot_testfede.py to create the plots
    python /scratch/FOR_AAA_TEST_AT_CERN/readjobsplot_testfede.py ${dir_name}_readfiles${root_number}_stdoutlist $length_test 180 $start_test $ping_value $cms_site_name 4
    This script provides some info about the test results.
    The line "READING TEST RESULT ..." is used to create the summary file for the html page.

Web Page

On vocms0101.cern.ch

/scratch/fanzago/for_weekly_report.sh <string_date_range> GREP
takes as input the range of dates on which the tests ran and the string "GREP", e.g.
 
17_01_10-17_01_14 GREP

  • calls the script /scratch/FOR_AAA_TEST_AT_CERN/parse.sh, which
    • parses all the output files of the opening and reading tests, grepping for the "RESULT" string, and creates a file containing only these lines, e.g.
             OPENING TEST RESULT PROBLEM max obtained rate lower than 10 Hz  4.04020943279 T1_UK_RAL-ba_21_06_15
             OPENING TEST RESULT OK  T1_UK_RAL-xrootd-cms-ext.gridpp.rl.ac.uk_21_06_15
             OPENING TEST RESULT OK  T1_ES_PIC-ba_21_06_15
             READING TEST RESULT PROBLEM max obtained totalreadrate lower than 150 MB and max_clients < 600  9.25 37.0 T1_UK_RAL-ba_21_06_15
             READING TEST RESULT OK  T1_UK_RAL-xrootd-cms-ext.gridpp.rl.ac.uk_21_06_15
             READING TEST RESULT OK  T1_ES_PIC-ba_21_06_15
             READING TEST RESULT OK  T1_DE_KIT-ba_21_06_15
             
    • creates other "filtered" files such as after_sort.txt, e.g.
              T1_DE_KIT-xrootd-cms.infn.it_14_06_16 OPENING OK
              T1_DE_KIT-xrootd-cms.infn.it_14_06_16 READING PROBLEM
              T1_ES_PIC-xrootd-cms.infn.it_14_06_16 OPENING WARNING
              T1_ES_PIC-xrootd-cms.infn.it_14_06_16 READING OK
              
    • and result_compact.txt
              T1_DE_KIT-ba_02_08_15 OPENING WARNING READING OK
              T1_ES_PIC-ba_02_08_15 OPENING WARNING READING OK
              

  • calls the script /scratch/FOR_AAA_TEST_AT_CERN/parse_html.sh, which creates the html file for the week, <string_date_range>-index-summarytest.html. It takes as input the date range and uses the result_compact file produced by parse.sh
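
The filtering performed on the "RESULT" lines can be illustrated with a small sketch. It is written in Python for readability and is not the code of parse.sh. It assumes, based on the examples above, that the test type is the first word, the status is the fourth word and the results directory is the last word of each line; the function names are hypothetical.

```python
def parse_result_line(line):
    """Return (run_dir, test_type, status) for a "... TEST RESULT ..." line, else None."""
    tokens = line.split()
    if len(tokens) < 5 or tokens[1:3] != ["TEST", "RESULT"]:
        return None
    return tokens[-1], tokens[0], tokens[3]


def compact_results(lines):
    """Merge opening and reading statuses into one line per results directory."""
    by_run = {}
    for line in lines:
        parsed = parse_result_line(line)
        if parsed is None:
            continue
        run_dir, test_type, status = parsed
        by_run.setdefault(run_dir, {})[test_type] = status
    merged = []
    for run_dir in sorted(by_run):
        parts = [run_dir]
        for test_type in ("OPENING", "READING"):
            if test_type in by_run[run_dir]:
                parts += [test_type, by_run[run_dir][test_type]]
        merged.append(" ".join(parts))
    return merged


lines = [
    "OPENING TEST RESULT PROBLEM max obtained rate lower than 10 Hz  4.04020943279 T1_UK_RAL-ba_21_06_15",
    "OPENING TEST RESULT OK  T1_ES_PIC-ba_21_06_15",
    "READING TEST RESULT PROBLEM max obtained totalreadrate lower than 150 MB and max_clients < 600  9.25 37.0 T1_UK_RAL-ba_21_06_15",
    "READING TEST RESULT OK  T1_ES_PIC-ba_21_06_15",
]
for row in compact_results(lines):
    print(row)
```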

On vocms037.cern.ch, which is the web server

/var/www/html/AAA/for_weekly.sh <days> <month> <year> <string_date_range>
takes as input the days on which the tests ran, the month, the year and the complete date-range string, e.g.
"14 15 16 17" 06 16 16_06_14-16_06_17 2>&1

Then this script

  • copies from vocms0101 the produced plots
    /var/www/html/AAA/copy_script.sh "$1" $2 $3
  • copies from vocms0101 the summary for html
    scp -i /root/.ssh/id_rsa_ff vocms0101.cern.ch://scratch/FOR_AAA_TEST_AT_CERN/SUMMARY/${4}-index-summarytest.html ${summary_html_dir}
  • copies index for view in dashboard
    scp -i /root/.ssh/id_rsa_ff vocms0101.cern.ch://scratch/FOR_AAA_TEST_AT_CERN/SUMMARY/${4}-totalout.txt ${summary_dir}
  • creates the html page
    /var/www/html/AAA/create_html_summary.sh

The copy_script.sh

  • copies from vocms0101 all the plots related to the days provided as input
  • creates directories named sitename_redirector_date
  • calls create_site_data_html $day $site (site also contains the redirector name)
  • calls create_site_html ${sites_name}

The create_html_summary.sh

  • creates index_summary starting from files in the /scratch/FOR_AAA_TEST_AT_CERN/SUMMARY/ dir

P.S. the way to create the html page is under evaluation.

Web page visualization

From terminal:

$ ssh -D 9999 -C <user_login>@lxplus.cern.ch

In Firefox:

Firefox> Edit> Preferences> Advanced tab> Network tab> Settings button.
Select Manual proxy configuration
SOCKS Host: localhost Port: 9999
SOCKS v5
No Proxy for: localhost, 127.0.0.1
Then go to http://vocms037.cern.ch/AAA/

Git code

Testsuite on git
Opening and reading test code

-- FedericaFanzago - 2017-01-31

Topic revision: r7 - 2018-03-02 - PradeepJasal
 