https://indico.cern.ch/conferenceDisplay.py?confId=254677

Attending:

Personnel:
Oct 22 --> Oct 29 Julian (+Adli)
Oct 29 --> Nov 4 Adli
  • Adli's shift (let's keep an eye on him)
    • By the way, thanks for the help during the week (Julian)
  • Edgar is on vacation until October 23
  • John is gone Oct 23-Nov 11
  • FNAL is at normal operations; last week's crisis was averted :-).
Infrastructure
  • vocms202 (reprocessing agent): if it is not working by Tuesday, consider adding another agent to the reprocessing team.
    • Why is FNAL not getting jobs for reprocessing?
    • Glidein issues; they do not appear to be completely solved as of end of day here.
    • Apparently a problem on the Condor side.
  • vocms201: keep an eye on disk space. Periodically redeploy WMAgent to clear space?
    • Confirm that disk space is being properly cleared out after upgrades.
    • We need to find some view that can be deleted.
    • Log flushing?
  • WMAgent issues:
    • v0.9.82 deployed in vocms202 and vocms235. Next: vocms216, vocms234 and cmssrv98 are in drain (almost done).
    • vocms201 and vocms216 have a full disk at /data1/
    • Two jobs with the same output. Partially fixed through Diego's script. Take care when doing ACDCs.
  • Workflow issues:
    • Stuck workflows:
      • An unnoticed site problem left LogCollect and Cleanup jobs stuck
        • Adding information to the stuck-requests script as we discover problems, to make debugging easier
      • Aborted workflow has many jobs to archive (vocms85)
        • Consider changing MySQL to Oracle (ask Edgar).
      • Rejected batch due to request setup.
        • Aborted workflows are not moving to aborted-completed
        • Workflows shouldn't go from aborted to aborted-completed
  • Workflow that was causing massive failed jobs:
  • Oli's plot about workflows rejected, datasets invalidated, vs total requests.
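
The disk-space item above (vocms201 and vocms216 full at /data1/) could be watched with a small check run periodically on each agent machine. This is a minimal sketch, assuming Python 3 is available on the agents; the function names and the 90% threshold are hypothetical and not something agreed in the meeting. Only the /data1 path comes from the notes.

```python
import os
import shutil


def disk_usage_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path, threshold_percent=90.0):
    """True when usage is at or above the threshold (90% is an assumed default)."""
    return disk_usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    # /data1 is the partition named in the notes; fall back to / elsewhere.
    path = "/data1" if os.path.isdir("/data1") else "/"
    percent = disk_usage_percent(path)
    state = "NEARLY FULL" if is_nearly_full(path) else "ok"
    print(f"{path}: {percent:.1f}% used ({state})")
```

Run from cron, a check like this would flag a filling partition before the agent stops; it does not free any space itself.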

Site Problems
cms-comp-ops-site-support-team (Site Support Team) <cms-comp-ops-site-support-team@cern.ch>

Status on WebDAV_CMS deployment:

Status of the New Subsite Mechanism update (sites number):

CMS Site Contacted Replied Updated
T1_DE_KIT Yes Yes Yes
T1_IT_CNAF Yes Yes Yes
T2_DE_RWTH No No No
T2_FR_GRIF_IRFU No No No
T2_FR_GRIF_LLR No No No
T2_UK_London_Brunel No No No
T2_US_Florida No No No
T2_US_Nebraska No No No
T2_US_Purdue No No No

Status of the site-local-config.xml update:

CMS Site Contacted Replied Updated
T1_DE_KIT Yes Yes Yes
T1_ES_PIC Yes Yes Not yet
T1_FR_CCIN2P3 Yes Yes Yes
T1_IT_CNAF Yes Yes Yes
T1_RU_JINR Yes Yes Yes
T1_UK_RAL Yes Yes Yes
T1_US_FNAL Yes Not yet Not yet

Sites currently not in the enabled LifeStatus state:

Updated 26/Jul/2021

SITE Status Duration Reason
T2_EE_Estonia Waiting Room 1 week(s) SAM + FTS Evals
T2_ES_IFCA Waiting Room 2 day(s) FTS Evals
T2_GR_Ioannina Morgue 1+ month(s) SAM + HC + FTS Evals
T2_PK_NCP Waiting Room 1+ month(s) SAM + HC + FTS Evals
T2_RU_INR Waiting Room 3 day(s) SAM + FTS Evals
T2_RU_ITEP Morgue 3+ week(s) SAM + FTS Evals
T2_RU_SINP Morgue 1+ month(s) No change

SITE Update
T2_ES_IFCA Entering WR
T2_RU_INR Entering WR

Ticket journal of last week:

Ticket CMS Site Last update Status Subject
150482 T1_UK_RAL Thursday, 07/22 in progress enabling AREX service at ARC-C
150882 T2_IT_Pisa Friday, 07/23 solved Unable to find GlideinWMS at T
151347 T1_US_FNAL Wednesday, 07/21 closed Add WebDAV endpoint to storage
151920 T2_BR_UERJ Friday, 07/23 closed HC test failing at T2_BR_UERJ
151932 T2_IN_TIFR Tuesday, 07/20 assigned Transfers failing to T2_IN_TIF
151972 T2_US_UCSD Wednesday, 07/21 in progress Checksum errors with files at
151996 T1_FR_CCIN2P3 Tuesday, 07/20 closed WebDAV protocol deployed (T1_F
152033 T2_CH_CSCS Wednesday, 07/21 closed Erroneous consistency check en
152034 T2_FR_IPHC Wednesday, 07/21 closed Erroneous consistency check en
152037 T2_US_Nebraska Thursday, 07/22 assigned Erroneous consistency check en
152048 T2_FR_GRIF_IRFU Friday, 07/23 in progress Erroneous consistency check en
152167 T2_US_Vanderbilt Tuesday, 07/20 assigned JobSubmit errors at T2_US_Vand
152328 T2_ES_IFCA Tuesday, 07/20 waiting for reply WebDAV endpoint deployment for
152338 T2_IT_Pisa Wednesday, 07/21 reopened WN-mc tests timing out at T2_I
152417 T2_US_Nebraska Wednesday, 07/21 solved TLS Errors between SoCal and N
152658 T1_UK_RAL Monday, 07/19 closed Active limit config in FTS3@RA
152722 T3_US_Minnesota Friday, 07/23 assigned CMS Frontier squids at T3_US_M
152745 T3_US_UMiss Monday, 07/19 verified Pilots at T3_US_UMiss
152805 T2_IT_Rome Wednesday, 07/21 solved WebDAV endpoint deployment for
152806 T2_RU_INR Monday, 07/19 closed WebDAV endpoint deployment for
152807 T2_TR_METU Monday, 07/19 closed WebDAV endpoint deployment for
152819 T2_CH_CSCS Friday, 07/23 closed SAM tests for CE failing at T2
152820 T1_DE_KIT Tuesday, 07/20 closed SAM tests not executing for tw
152825 T2_FR_GRIF_IRFU Friday, 07/23 closed SAM tests failing at T2_FR_GRI
152834 T2_FR_GRIF_IRFU Thursday, 07/22 assigned Jobs not running at T2_FR_GRIF
152851 T2_UK_London_IC Monday, 07/19 closed SAM tests failing for one CE a
152853 T2_IT_Bari Monday, 07/19 closed SAM test failing for CEs at T2
152867 T2_TR_METU Tuesday, 07/20 closed VOPut tests failing at T2_TR_M
152872 T2_US_Caltech Tuesday, 07/20 closed Transfer issues with your site
152873 T1_IT_CNAF Friday, 07/23 in progress Jobs assigned to T1_IT_CNAF ar
152885 T2_DE_RWTH Friday, 07/23 closed SAM tests failing intermittent
152892 T1_ES_PIC Wednesday, 07/21 closed SAM tests for CEs failing at T
152896 T2_BR_UERJ Wednesday, 07/21 assigned Perfsonar down/not working
152897 T2_US_Florida Wednesday, 07/21 closed Perfsonar down/not working
152898 T2_RU_IHEP Thursday, 07/22 closed SAM tests for CE failing at T2
152899 T1_DE_KIT Wednesday, 07/21 closed Traceroute4/6, Tracepath4/6 fr
152900 T2_UK_SGrid_Bristol Thursday, 07/22 closed Traceroute4/6, Tracepath4/6 fr
152916 T2_ES_IFCA Friday, 07/23 closed LoadTest Transfers are failing
152918 T2_RU_IHEP Thursday, 07/22 closed Misconfigured statistics-desti
152919 T2_RU_INR Thursday, 07/22 closed Misconfigured statistics-desti
152920 T2_TR_METU Thursday, 07/22 closed Misconfigured statistics-desti
152922 T2_ES_IFCA Thursday, 07/22 waiting for reply Misconfigured statistics-desti
152924 T2_UK_London_Brunel Thursday, 07/22 closed T2_UK_London_Brunel Site Statu
152935 T2_BE_UCL Tuesday, 07/20 solved Jobs are not running at T2_BE_
152937 T2_TR_METU Friday, 07/23 closed SAM tests warning for one CE a
152938 T2_BE_IIHE Tuesday, 07/20 solved SAM tests for one CE are not b
152954 T2_BR_UERJ Tuesday, 07/20 solved SAM tests failing at T2_BR_UER
152972 T1_IT_CNAF Tuesday, 07/20 on hold Start manual cleaning of /stor
152989 T2_IT_Bari Monday, 07/19 solved SAM tests failing for one CE a
152990 T2_EE_Estonia Friday, 07/23 in progress SAM tests not being executed f
153006 T2_US_Florida Monday, 07/19 solved Start manual cleaning of /stor
153010 T2_FR_GRIF_IRFU Tuesday, 07/20 assigned Problems accessing files at T2
153013 T1_US_FNAL Monday, 07/19 solved CMS prod job failure at T3_US_
153015 T2_IT_Rome Tuesday, 07/20 verified SAM tests failing for CE at T2
153031 T2_BE_UCL Thursday, 07/22 assigned LoadTest transfers failing fro
153032 T0_CH_CERN Thursday, 07/22 in progress Destination Overwrite error at
153033 T1_US_FNAL Friday, 07/23 assigned Destination Overwrite error at
153039 T2_DE_DESY Monday, 07/19 waiting for reply Start manual cleaning of /stor
153048 T1_US_FNAL Tuesday, 07/20 solved SAM tests no executing for one
153053 T2_IT_Legnaro Monday, 07/19 solved SAM tests for CEs failing at T
153059 T2_BR_SPRACE Wednesday, 07/21 assigned Errors in accessing unmerged f
153064 T2_FR_GRIF_IRFU Monday, 07/19 solved Frontier squid down at T2_FR_G
153066 T2_PT_NCG_Lisbon Monday, 07/19 solved Frontier squid down at T2_PT_N
153074 T1_US_FNAL Tuesday, 07/20 solved Transfers are failing from T1_
153075 T2_BR_SPRACE Tuesday, 07/20 assigned WebDAV test transfer issues (T
153076 T2_UK_London_IC Monday, 07/19 assigned Update storage.json FTS sectio
153077 T2_ES_CIEMAT Tuesday, 07/20 solved SAM XRootD tests failing at T2
153078 T2_US_Wisconsin Saturday, 07/24 assigned perfsonar01.hep.wisc.edu down
153079 T2_US_UCSD Monday, 07/19 assigned perfsonar-2.t2.ucsd.edu not co
153080 T2_US_MIT Wednesday, 07/21 assigned T2_US_MIT perfSONAR hosts
153086 T2_DE_DESY Wednesday, 07/21 waiting for reply Transfers failing to T2_DE_DES
153087 T2_RU_INR Wednesday, 07/21 solved SAM tests are failing at T2_RU
153089 T2_EE_Estonia Tuesday, 07/20 assigned WebDAV protocol deployed (T2_E
153090 T2_IT_Legnaro Wednesday, 07/21 solved WebDAV protocol deployed (T2_I
153091 T2_IT_Rome Wednesday, 07/21 solved WebDAV protocol deployed (T2_I
153092 T2_RU_INR Wednesday, 07/21 solved WebDAV protocol deployed (T2_R
153093 T2_TR_METU Thursday, 07/22 in progress WebDAV protocol deployed (T2_T
153095 T2_AT_Vienna Thursday, 07/22 solved SAM tests not being executed a
153099 T2_BR_UERJ Thursday, 07/22 assigned Transfers timing out and SRM f
153101 T2_ES_IFCA Thursday, 07/22 solved SRM tests and transfers failin
153103 T2_DE_RWTH Wednesday, 07/21 in progress Transfer fails from T2_DE_RWTH
153104 T2_ES_IFCA Friday, 07/23 reopened Transfers from T2_ES_IFCA are
153105 T2_IN_TIFR Wednesday, 07/21 assigned Transfers from T2_IN_TIFR are
153109 T1_ES_PIC Thursday, 07/22 assigned Destination Overwrite error at
153110 T1_IT_CNAF Thursday, 07/22 in progress Destination Overwrite error at
153112 T2_TR_METU Thursday, 07/22 in progress LoadTest Transfers are failing
153114 T2_RU_INR Friday, 07/23 in progress SRM-VOPut tests failing at T2_
153121 T1_US_FNAL Thursday, 07/22 assigned Archiving Issues (T1_US_FNAL_T
153141 T2_KR_KISTI Saturday, 07/24 in progress Transfers are failing from T2_
153142 T2_RU_INR Saturday, 07/24 in progress SAM tests for one CE failing a
153143 T2_BR_UERJ Saturday, 07/24 assigned XRootD tests failing at T2_BR

Number of tickets: 91, Generated on 26/Jul/2021 (GMT)

Sites with open GGUS tickets:

Generated on 26/Jul/2021 (GMT), Total number of tickets: 73
CMS Site Number of Tickets Tickets
T0_CH_CERN 2 153032 151871
T1_DE_KIT 1 151904
T1_ES_PIC 1 152979
T1_FR_CCIN2P3 1 151228
T1_IT_CNAF 3 152972 152873 152952
T1_UK_RAL 3 152788 150399 150482
T1_US_FNAL 2 153033 152183
T2_AT_Vienna 1 152189
T2_BE_UCL 2 153031 148354
T2_BR_SPRACE 2 152895 151903
T2_BR_UERJ 2 152896 150519
T2_CN_Beijing 1 152030
T2_DE_DESY 2 152316 153039
T2_DE_RWTH 1 152991
T2_EE_Estonia 4 152990 151268 150523 152917
T2_ES_IFCA 2 152328 152922
T2_FR_GRIF_IRFU 4 152048 152834 151272 153010
T2_GR_Ioannina 2 152029 150722
T2_IN_TIFR 1 151932
T2_IT_Bari 1 150724
T2_IT_Legnaro 3 150725 151275 153024
T2_IT_Pisa 4 152338 151168 152787 148240
T2_IT_Rome 1 153026
T2_PK_NCP 2 151063 150075
T2_PL_Swierk 1 151277
T2_PL_Warsaw 1 151169
T2_PT_NCG_Lisbon 2 152992 149275
T2_RU_INR 1 152692
T2_RU_ITEP 2 150486 150331
T2_TR_METU 1 152036
T2_UA_KIPT 1 150729
T2_UK_London_Brunel 1 152031
T2_UK_London_IC 1 153025
T2_UK_SGrid_Bristol 2 152644 150734
T2_US_MIT 1 152966
T2_US_Nebraska 2 152037 152741
T2_US_UCSD 1 151972
T2_US_Vanderbilt 3 153014 152167 152893
T3_TW_NCU 1 150488
T3_TW_NTU_HEP 1 151300
T3_US_Minnesota 1 152722
T3_US_TACC 1 152790
T3_US_TAMU 1 150296

Updated on: 2021-07-12 at 00:46:42 by HectorCamiloZambranoHernandez

  • For any problem, email the Site Support Team list (while John is on vacation).

AOB
Topic revision: r7 - 2016-07-22 - StephanLammel