cmssrv219

Jobs in Condor

[cmsdataops@cmssrv219 current]$ condor_q
[cmsdataops@cmssrv219 current]$ 

Jobs grouped by state

SQL> select wmbs_job_state.name, count(*)
from wmbs_job
join wmbs_job_state on (wmbs_job.state = wmbs_job_state.id)
group by wmbs_job.state, wmbs_job_state.name;
+----------+----------+
| name     | count(*) |
+----------+----------+
| cleanout |    45030 |
+----------+----------+
1 row in set (0.04 sec)

Workflows in the System

MariaDB [wmagent]> SELECT DISTINCT name from wmbs_workflow;
+-------------------------------------------------------------------------+
| name                                                                    |
+-------------------------------------------------------------------------+
| fabozzi_HIRun2015-HIMinimumBias5-02May2016_758p4_160502_172625_4322     |
| fabozzi_Run2015D-DoubleEG-08Jun2016_765p1_160608_195144_7236            |
| pdmvserv_BPH-RunIISummer15GS-00069_00377_v0__160616_201424_8871         |
| pdmvserv_BPH-Summer12-00200_00262_v0__160617_004008_1356                |
| pdmvserv_BPH-Summer12-00203_00262_v0__160617_004010_4907                |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00189_00075_v0__160621_163932_3627    |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00200_00075_v0__160621_164221_3500    |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00211_00075_v0__160621_164435_3761    |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00294_00077_v0__160621_165607_8671    |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00387_00072_v0__160621_151058_9114    |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00401_00072_v0__160621_151547_4748    |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00416_00073_v0__160621_152052_5283    |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00471_00079_v0__160621_171728_7928    |
| pdmvserv_EXO-RunIISummer15wmLHEGS-00488_00080_v0__160621_172304_9481    |
| pdmvserv_SMP-HINppWinter16DR-00010_00033_v0__160615_152226_6509         |
| pdmvserv_task_B2G-RunIISpring16DR80-01031__v1_T_160615_122818_2847      |
| pdmvserv_task_EXO-RunIIFall15DR76-02715__v1_T_160615_151053_6520        |
| pdmvserv_task_HIG-RunIISpring16DR80-01235__v1_T_160615_142943_4709      |
| prozober_ACDC_task_HIG-RunIISpring16DR80-01050__v1_T_160609_111033_1808 |
| prozober_ACDC_task_HIG-RunIISpring16DR80-01050__v1_T_160609_111047_928  |
| prozober_ACDC_task_HIG-RunIISpring16DR80-01050__v1_T_160609_111054_6629 |
+-------------------------------------------------------------------------+
21 rows in set (0.00 sec)

All of these workflows are at least in completed status.

Global Queue status for the requests that are still running

[cmsdataops@cmssrv219 current]$ for req in `awk '{print $1}' alan`; do python getGQByWorkflow.py $req; echo ""; done
Summary for request pdmvserv_BPH-RunIISummer15GS-00069_00377_v0__160616_201424_8871 in the 'workqueue' database
{'Done': {'http://cmssrv219.fnal.gov:5984': 1},
 'Running': {'http://cmssrv217.fnal.gov:5984': 1},
 'numberOfElements': 2}
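
Reading the summary: the one element still running belongs to cmssrv217, while the element handled by cmssrv219 is done. A minimal sketch for reading such a summary programmatically follows. It assumes the printed structure maps each status to a dictionary of agent URL and element count, as the output above suggests. The function names are hypothetical and are not part of getGQByWorkflow.py.

```python
def agents_with_status(summary, status):
    """Return the sorted agent URLs that hold elements in `status`."""
    return sorted(summary.get(status, {}))


def elements_accounted_for(summary):
    """True if the per-status counts add up to 'numberOfElements'."""
    total = sum(
        count
        for key, per_agent in summary.items()
        if key != "numberOfElements"
        for count in per_agent.values()
    )
    return total == summary.get("numberOfElements")


# The summary printed above, copied as a Python dictionary.
summary = {
    "Done": {"http://cmssrv219.fnal.gov:5984": 1},
    "Running": {"http://cmssrv217.fnal.gov:5984": 1},
    "numberOfElements": 2,
}
```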

Workflows not fully injected

MariaDB [wmagent]> select distinct name from wmbs_workflow where injected = 0;     
Empty set (0.00 sec)

Subscriptions not finished

MariaDB [wmagent]> select distinct wmbs_workflow.name AS wfName
   FROM wmbs_subscription
   INNER JOIN wmbs_fileset ON wmbs_subscription.fileset = wmbs_fileset.id
   INNER JOIN wmbs_workflow ON wmbs_workflow.id = wmbs_subscription.workflow
   where wmbs_subscription.finished = 0 ORDER BY wmbs_workflow.name;
Empty set (0.00 sec)

Files available in WMBS (waiting for job creation)

SQL> SELECT wmbs_workflow.name, wmbs_sub_files_available.subscription, count(wmbs_sub_files_available.fileid)
  FROM wmbs_sub_files_available
  INNER JOIN wmbs_subscription ON wmbs_sub_files_available.subscription = wmbs_subscription.id
  INNER JOIN wmbs_workflow ON wmbs_subscription.workflow = wmbs_workflow.id
  GROUP BY wmbs_sub_files_available.subscription, wmbs_workflow.name;
Empty set (0.00 sec)

Getting distinct workflow names with files available:

SQL> SELECT DISTINCT wmbs_workflow.name
  FROM wmbs_sub_files_available
  INNER JOIN wmbs_subscription ON wmbs_sub_files_available.subscription = wmbs_subscription.id
  INNER JOIN wmbs_workflow ON wmbs_subscription.workflow = wmbs_workflow.id;
Empty set (0.00 sec)

Files acquired in WMBS (waiting for job to finish)

SQL> SELECT wmbs_workflow.name, wmbs_sub_files_acquired.subscription, count(wmbs_sub_files_acquired.fileid)
  FROM wmbs_sub_files_acquired
  INNER JOIN wmbs_subscription ON wmbs_sub_files_acquired.subscription = wmbs_subscription.id
  INNER JOIN wmbs_workflow ON wmbs_subscription.workflow = wmbs_workflow.id
  GROUP BY wmbs_sub_files_acquired.subscription, wmbs_workflow.name;
Empty set (0.00 sec)

Getting distinct workflow names with files acquired:

SQL> SELECT DISTINCT wmbs_workflow.name
  FROM wmbs_sub_files_acquired
  INNER JOIN wmbs_subscription ON wmbs_sub_files_acquired.subscription = wmbs_subscription.id
  INNER JOIN wmbs_workflow ON wmbs_subscription.workflow = wmbs_workflow.id;
Empty set (0.00 sec)

Files and Blocks in PhEDEx and DBS

Blocks open in DBS

SQL> SELECT * FROM dbsbuffer_block WHERE status!='Closed';
Empty set (0.02 sec)

Files not uploaded to DBS

SQL> SELECT * from dbsbuffer_file where status = 'NOTUPLOADED';
Empty set (0.02 sec)

Files not injected in PhEDEx, with parent block id (can be recovered)

SQL> SELECT * FROM dbsbuffer_file
WHERE in_phedex=0
AND block_id IS NOT NULL
AND lfn NOT LIKE '%unmerged%'
AND lfn NOT LIKE 'MCFakeFile%'
AND lfn NOT LIKE '%BACKFILL%'
AND lfn NOT LIKE '/store/user%';
Empty set (0.24 sec)

Files not in PhEDEx and without a parent block id (cannot be recovered). These are possibly input files.

SQL> SELECT count(*) FROM dbsbuffer_file
WHERE in_phedex=0
AND block_id IS NULL
AND lfn NOT LIKE '%unmerged%' 
AND lfn NOT LIKE 'MCFakeFile%'
AND lfn NOT LIKE '%BACKFILL%'
AND lfn NOT LIKE '/store/backfill%'
AND lfn NOT LIKE '/store/user%';
+----------+
| count(*) |
+----------+
|   139678 |
+----------+
1 row in set (0.26 sec)
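
The LFN filters in this query can be mirrored in Python, for example to classify a list of LFNs outside the database. This is a sketch only: `counts_as_missing` is a hypothetical helper, and it assumes case-sensitive matching. Whether `LIKE` is case-sensitive in MariaDB depends on the column collation, so results may differ from the SQL on a case-insensitive column.

```python
# Patterns of the form '%text%' in the query: match anywhere in the LFN.
EXCLUDED_SUBSTRINGS = ("unmerged", "BACKFILL")

# Patterns of the form 'text%' in the query: match only at the start.
EXCLUDED_PREFIXES = ("MCFakeFile", "/store/backfill", "/store/user")


def counts_as_missing(lfn):
    """True if the LFN passes all NOT LIKE filters of the query above.

    Assumption: matching is case-sensitive.
    """
    if any(text in lfn for text in EXCLUDED_SUBSTRINGS):
        return False
    return not lfn.startswith(EXCLUDED_PREFIXES)
```

For instance, a hypothetical LFN under `/store/unmerged/` or `/store/user/` is filtered out, while one under `/store/data/` is counted.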

So we run the fix-PhEDEx script (newFixPhEDEx.py) to update the files not marked as being in PhEDEx:

cmst1@vocms0309:/data/srv/wmagent/current $ curl https://raw.githubusercontent.com/amaltaro/ProductionTools/master/newFixPhEDEx.py > newFixPhedex.py
cmst1@vocms0309:/data/srv/wmagent/current $ python newFixPhedex.py 

Shutting down PhEDExInjector...
Checking 24 dataset in both PhEDEx and DBS ...
There are no files to be updated in the buffer. Contact a developer.
Starting PhEDExInjector now ...

started with pid 1093729

And we check again afterwards:

SQL> SELECT lfn FROM dbsbuffer_file
WHERE in_phedex=0
AND block_id IS NULL
AND lfn NOT LIKE '%unmerged%' 
AND lfn NOT LIKE 'MCFakeFile%'
AND lfn NOT LIKE '%BACKFILL%'
AND lfn NOT LIKE '/store/user%';

Topic revision: r28 - 2016-08-01 - AlanMalta
 