Miscellaneous Tasks

Closing Blocks

To close blocks we can use phedex::Web::API::Inject. We are not injecting blocks or files; we only use the API to set the desired state of the block, which in this case is closed.

Step 1 Generate the .xml file with the blocks to close. The file should contain the XML structure representing the data to be closed.

  • See requirements of the XML file at phedex::Web::API::Inject
  • The XML should look like:
    • <data version="2.0">
          <dbs dls="dbs" name="https://cmsweb.cern.ch/dbs/prod/global/DBSReader">
              <dataset is-open="y" name="dataset_name_1">
                  <block is-open="n" name="block_name_1"></block>
                  <block is-open="n" name="block_name_2"></block>
              </dataset>
              <dataset is-open="y" name="dataset_name_2">
                  <block is-open="n" name="block_name_3"></block>
                  <block is-open="n" name="block_name_4"></block>
              </dataset>
              ...
              <dataset is-open="y" name="dataset_name_N">
                  <block is-open="n" name="block_name_(n-1)"></block>
                  <block is-open="n" name="block_name_n"></block>
              </dataset>
          </dbs>
      </data>
  • Note: Please set the parameter is-open to "n" in all the blocks
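
As a minimal sketch of Step 1, the file could be generated with Python's standard xml.etree.ElementTree module. The function name build_close_xml and the input mapping are hypothetical; the element and attribute names are copied from the structure above.

```python
import xml.etree.ElementTree as ET

DBS_URL = "https://cmsweb.cern.ch/dbs/prod/global/DBSReader"


def build_close_xml(blocks_by_dataset, dbs_url=DBS_URL):
    """Return the injection XML (as a string) that marks every listed block closed.

    blocks_by_dataset is a hypothetical mapping {dataset_name: [block_name, ...]}.
    """
    data = ET.Element("data", {"version": "2.0"})
    dbs = ET.SubElement(data, "dbs", {"dls": "dbs", "name": dbs_url})
    for dataset_name, block_names in blocks_by_dataset.items():
        # The dataset itself stays open; only its blocks are closed.
        dataset = ET.SubElement(dbs, "dataset", {"is-open": "y", "name": dataset_name})
        for block_name in block_names:
            ET.SubElement(dataset, "block", {"is-open": "n", "name": block_name})
    # short_empty_elements=False writes <block ...></block> as in the example above.
    return ET.tostring(data, encoding="unicode", short_empty_elements=False)
```

The returned string can be written to blocks_to_close.xml for use in Step 2.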

Step 2 Use PhEDEx datasvc inject API to close the blocks

  • Command:
    • source PHEDEX/etc/profile.d/env.sh
      PHEDEX/Utilities/phedex --instance prod inject --datafile blocks_to_close.xml --node TX_XX_XXXXX_XXX --strict
    • For the --node argument, select any node that holds a replica of the block. The choice makes no difference, since we are not injecting new blocks or files.

List of LFNs by Site

Sometimes, for accounting or other purposes, we want a list of all files at a given site. We can get it via the PhEDEx APIs or by querying TMDB directly. Both methods are explained here.

  • Via PhEDEx APIs
    • Get all blocks at the site
      • python ~/TransferTeam/commons/datasvc.py --service blockreplicas --path /phedex/block:name --options node=T1_US_FNAL_Disk > FNAL_blocks.list
    • Get all files and their replicas
      • awk '{system("python ~/TransferTeam/commons/datasvc.py --service filereplicas --path /phedex/block/file:name/replica:node --options block="$1)}' FNAL_blocks.list > files_replicas.list

  • Via TMDB
    • We are going to use PHEDEX::Core::SQL::getSiteReplicasByName
      • https://github.com/dmwm/PHEDEX/blob/master/perl_lib/PHEDEX/Core/SQL.pm#L572-L589
      • If you look at the query, you will see that it draws on both the t_xfer_replica and t_dps_block_replica tables. Files in blocks that are active for transfer are listed in t_xfer_replica, whereas files in inactive blocks (no transfers) are not; for those, only the block appears in t_dps_block_replica. For blocks not in transfer, PhEDEx does not track the location of each individual file, only the location of the block, so both tables must be queried and the results combined.
    • Run the script getSiteReplicasByName
      • It is located at /afs/cern.ch/user/j/jodiazcr/public/getSiteReplicasByName
      • ./getSiteReplicasByName > outputfile.list
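
The reasoning behind the TMDB query can be illustrated with a toy example. The sketch below is not the getSiteReplicasByName query itself: it uses an in-memory SQLite database with a heavily simplified schema. The table names t_xfer_replica, t_xfer_file and t_dps_block_replica come from the text above; the table t_dps_file, the column names logical_name and node, the use of text node names, and the sample rows are all assumptions made for illustration.

```python
import sqlite3


def build_toy_tmdb():
    """Create an in-memory toy database with an assumed, simplified subset of TMDB."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t_dps_file (id INTEGER, logical_name TEXT, inblock INTEGER);
        CREATE TABLE t_dps_block_replica (block INTEGER, node TEXT, is_active TEXT);
        CREATE TABLE t_xfer_file (id INTEGER, logical_name TEXT, inblock INTEGER);
        CREATE TABLE t_xfer_replica (fileid INTEGER, node TEXT);

        -- Block 1 is active, so each file replica is tracked individually.
        INSERT INTO t_dps_file VALUES (1, '/store/a.root', 1), (2, '/store/b.root', 1);
        INSERT INTO t_xfer_file VALUES (1, '/store/a.root', 1), (2, '/store/b.root', 1);
        INSERT INTO t_dps_block_replica VALUES (1, 'T1_US_FNAL_Disk', 'y');
        INSERT INTO t_xfer_replica VALUES (1, 'T1_US_FNAL_Disk');

        -- Block 2 is inactive, so only the block replica is recorded.
        INSERT INTO t_dps_file VALUES (3, '/store/c.root', 2);
        INSERT INTO t_dps_block_replica VALUES (2, 'T1_US_FNAL_Disk', 'n');
    """)
    return conn


def site_lfns(conn, node):
    """Return the sorted LFNs at a node, combining file-level and block-level records."""
    query = """
        -- Active blocks, taken from the file-level replica table.
        SELECT f.logical_name
          FROM t_xfer_replica xr
          JOIN t_xfer_file f ON f.id = xr.fileid
         WHERE xr.node = :node
        UNION
        -- Inactive blocks, where every file of the block is assumed present.
        SELECT f.logical_name
          FROM t_dps_block_replica br
          JOIN t_dps_file f ON f.inblock = br.block
         WHERE br.node = :node AND br.is_active = 'n'
    """
    return sorted(row[0] for row in conn.execute(query, {"node": node}))
```

In this toy data, the active block contributes only the file that has a row in t_xfer_replica, while the inactive block contributes all of its files.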

List Missing Files Without Replica

The script below helps identify files that are missing and have no replicas in any storage element. Currently the script reads an input file that must be located in the same folder as the script, must be named 'StuckDatasets.txt', and must follow this format:

Input File Format

# -- 2015-10-08 06:30
#
#- DDM Partition: AnalysisOps -
#
#------------------------------------
# Rank TrueSize DiskSize nsites DatasetName
# [days] [GB] [GB]
x x x x /<DATASET>
x x x x /<DATASET>

Location

/afs/cern.ch/user/j/jpulidom/public/missingFiles.py

Parameters

  • -s Site/Node
  • -d Data Tier
Example
  • python ~/TransferTeam/scripts/missingFiles.py -s TX_XX_XXXX -d USER
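
The script's source is not reproduced here, so the following is only a sketch of how an input file in the format above could be read. The function name read_stuck_datasets is hypothetical, and the assumption that the data tier is the last component of the dataset name is ours, not taken from missingFiles.py.

```python
def read_stuck_datasets(lines, data_tier=None):
    """Return dataset names from lines laid out like StuckDatasets.txt.

    Lines starting with '#' are treated as comments. The dataset name is taken
    to be the last whitespace-separated field. If data_tier is given (for
    example "USER"), only datasets whose final path component matches are kept.
    """
    datasets = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name = line.split()[-1]
        if data_tier and name.rsplit("/", 1)[-1] != data_tier:
            continue
        datasets.append(name)
    return datasets
```

The function accepts any iterable of lines, so an open file handle for StuckDatasets.txt can be passed in directly.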

Manually Deactivate a Block

Block activation in PhEDEx refers to a combination of multiple operations on TMDB. When a block is activated:

  • One entry per logical file is made in table t_xfer_file
  • One entry per file replica is made in table t_xfer_replica
  • Column is_active is set to 'y' for replicas of the block in t_dps_block_replica
These three operations are performed in succession by the BlockActivate central agent. There is also a BlockDeactivate agent that finds blocks that can be deactivated (active for more than 3 days and not in the deletion queue or activation queue) and performs the reverse of the above operations on them.

For some unknown reason, we sometimes find stray entries in t_xfer_file for blocks where is_active is 'n' for all replicas. The BlockDeactivate agent does not detect such stray entries, but the BlockActivate agent fails its sanity check, in which the following equation must hold:
(number of entries made in t_xfer_file) * (number of block replicas) = (number of entries made in t_xfer_replica)
When this happens, the block will neither be activated nor deactivated.
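
The sanity check amounts to a single equality between three row counts. The sketch below restates it in Python; the function name is hypothetical and the numbers are made up for illustration.

```python
def activation_counts_consistent(n_xfer_files, n_block_replicas, n_xfer_replicas):
    """Check (entries in t_xfer_file) * (block replicas) == (entries in t_xfer_replica)."""
    return n_xfer_files * n_block_replicas == n_xfer_replicas


# A consistent activation: 10 files, 2 block replicas, 20 file-replica rows.
print(activation_counts_consistent(10, 2, 20))  # True
# Stray t_xfer_file rows with no matching t_xfer_replica rows break the equality.
print(activation_counts_consistent(10, 2, 0))   # False
```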

It is easy to spot the blocks with this problem by looking at the warning messages in the BlockActivate agent log (/data/ProdNodes/Prod_Mgmt/logs/mgmt-blockactiv). There are however other symptoms that point to the block activation inconsistency, such as

  • blockreplicas and filereplicas APIs disagree: blockreplicas reports the replica as incomplete, while filereplicas shows all files as present at the site.
  • Cannot invalidate files: Either the block activation step never completes, or FileDeleteTMDB says the file does not exist, even though the filereplicas call says it does.
  • Block deletions stay pending: Blocks stay in the deletions table (https://cmsweb.cern.ch/phedex/prod/Activity::Deletions) forever.

To free the blocks, we need to manually deactivate them. This is an operation that writes into TMDB directly and therefore must be executed extremely carefully.

Step 1 Identify the blocks to deactivate.

Log in to the central agent machine and copy the names of the blocks with inconsistencies from the latest cycle of the BlockActivate agent:

  ssh vocms0214.cern.ch
  sed -n '/2018-08-07 14:58:40: BlockActivate\[17445\]: Creating/,/debug/p' /data/ProdNodes/Prod_Mgmt/logs/mgmt-blockactiv > blockactiv.log
  # Need to replace the timestamp
  # Need to edit blockactiv.log to be a simple list of block names (one per line)
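
The manual edit into a list of block names could be partly automated. The sketch below makes no assumption about the log line layout; it only assumes that block names have the shape /primary/processed/tier#identifier and pulls out every token of that shape. The function name and the regular expression are hypothetical, and the result should still be checked by eye before it is used in Step 3.

```python
import re

# Assumed shape of a block name: /primary/processed/tier#identifier
BLOCK_NAME = re.compile(r"/[^/\s#]+/[^/\s#]+/[^/\s#]+#[\w-]+")


def extract_block_names(text):
    """Return the unique block-name-like tokens found in text, in first-seen order."""
    seen = {}
    for match in BLOCK_NAME.finditer(text):
        seen.setdefault(match.group(0), None)
    return list(seen)
```

Applied to the contents of blockactiv.log, the function returns one entry per distinct block, which can then be written out one per line.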

Step 2 Stop the blockactiv and blockdeact agents

  ssh phedex@vocms0214.cern.ch
  cd /data/ProdNodes
  PHEDEX/Utilities/Master -config SITECONF/CH_CERN/PhEDEx/Config.Mgmt.Prod stop mgmt-blockactiv mgmt-blockdeact

Step 3 Deactivate the blocks

  sqlplus $(OracleConnectId -db ~/TransferTeam/phedex/DBParam:Prod/OPSIIYAMA) @deactivate_block.sql block_name
where deactivate_block.sql contains the following (SQL*Plus substitutes the block name given on the command line for &1):
set role phedex_ops-your-account_prod identified by -password-written-in-dbparam-;
delete from t_xfer_replica where fileid in (select id from t_xfer_file where inblock = (select id from t_dps_block where name = '&1'));
delete from t_xfer_file where inblock = (select id from t_dps_block where name = '&1');
update t_dps_block_replica set is_active = 'n', time_update = ((sysdate - to_date('01-JAN-1970','DD-MON-YYYY')) * (86400)) where block = (select id from t_dps_block where name = '&1');
quit

Step 4 Start the blockactiv and blockdeact agents

  ssh phedex@vocms0214.cern.ch
  cd /data/ProdNodes
  PHEDEX/Utilities/Master -config SITECONF/CH_CERN/PhEDEx/Config.Mgmt.Prod start mgmt-blockactiv mgmt-blockdeact

Check the blockactiv logs to make sure everything worked.

-- JuanPulidoMojica - 2016-05-30

Topic revision: r7 - 2018-08-07 - YutaroIiyama
 