Difference: DiracForShifters (1 vs. 23)

Revision 23 2015-11-05 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 17 to 17
 
Changed:
<
<
>
>
 
  • IT Status Board: Incidents, announcements, changes of CERN IT infrastructure

Grid Resource Sites

Revision 22 2015-08-21 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 53 to 53
 

Castor Diskpool, Unstaging Files

When the castor diskpool is getting full, one needs to unstage files from it.
Changed:
<
<
Clone the ilcdirac/ops git repository and run the script
>
>
Clone the ILCDiracOps git repository and run the script
 StagerScripts/UnstageProdFiles -P'prodID' -F'REC|SIM'
prodID is the first production to check, REC|SIM the file types to unstage.
Line: 65 to 65
 
Sometimes something goes wrong and files that should not exist do exist: jobs can fail while their output files are still picked up before they are removed, jobs are rescheduled for no good reason, or jobs fail between uploading output data and creating removal requests.
Changed:
<
<
Here are some scripts to deal with these outputfiles, these are also in the ilcdirac/ops repository:
>
>
Here are some scripts to deal with these outputfiles, these are also in the ILCDiracOps repository:
  Check if Successful jobs have used the same input data for a given production:
CheckProdJobs -P'ProdID'

Revision 21 2015-08-03 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 17 to 17
 
Changed:
<
<
>
>
 
  • IT Status Board: Incidents, announcements, changes of CERN IT infrastructure

Grid Resource Sites

Revision 20 2015-06-24 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 53 to 53
 

Castor Diskpool, Unstaging Files

When the castor diskpool is getting full, one needs to unstage files from it.
Changed:
<
<
Clone the ilcdirac/ops git repository and run the script
>
>
Clone the ilcdirac/ops git repository and run the script
 StagerScripts/UnstageProdFiles -P'prodID' -F'REC|SIM'
prodID is the first production to check, REC|SIM the file types to unstage.
Line: 65 to 65
 
Sometimes something goes wrong and files that should not exist do exist: jobs can fail while their output files are still picked up before they are removed, jobs are rescheduled for no good reason, or jobs fail between uploading output data and creating removal requests.
Changed:
<
<
Here are some scripts to deal with these outputfiles, these are also in the ilcdirac/ops repository:
>
>
Here are some scripts to deal with these outputfiles, these are also in the ilcdirac/ops repository:
  Check if Successful jobs have used the same input data for a given production:
CheckProdJobs -P'ProdID'

Revision 19 2015-05-25 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 17 to 17
 
Changed:
<
<
>
>
 
  • IT Status Board: Incidents, announcements, changes of CERN IT infrastructure

Grid Resource Sites

Revision 18 2015-05-22 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 35 to 35
 
Added:
>
>
 

Revision 17 2015-05-06 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 38 to 38
 
Changed:
<
<

Debugging

>
>

Debugging CEs and SEs

 
Changed:
<
<

Cream Sites

voms-proxy-init --voms ilc
glite-ce-job-submit -a -r  <ComputingElement/queue> TestSite.jdl

Get the CE/queues from lcg-infosites --vo ilc ce, and use the following TestSite.jdl:

[
JobName       = "CheckSite";
Executable    = "TestScript.sh";
StdOutput     = "StdOut";
StdError      = "StdErr";
InputSandbox  = "TestScript.sh";
]
Make something up for TestScript.sh (e.g., echo Hello World)

ARC Sites

arcproxy -S ilc
arcsub -j cejobs.xml -c arc-ce02.gridpp.rl.ac.uk arcJob.xrsl
with arcJob.xrsl containing:
&(executable="myscript.sh")
(inputFiles=(myscript.sh "/afs/cern.ch/user/s/sailer/TestSites/myscript.sh"))
(stdout="std.out")
(stderr="std.err")
(outputFiles=("std.out" "") ("std.err" ""))
And something for myscript.sh (e.g., echo Hello World)

Accessing the Underlying SE or LCG-FileCatalog

source /cvmfs/grid.cern.ch/emi-ui-3.7.3-1_sl6v2/etc/profile.d/setup-emi3-ui-example.sh
voms-proxy-init --voms ilc
export LFC_HOST=`lcg-infosites --vo ilc lfc`

#List files/folders
gfal-ls srm://dcache-se-desy.desy.de/pnfs/desy.de/ilc/fcal
srmls -l srm://dcache-se-desy.desy.de/pnfs/desy.de/ilc/fcal/
lfc-ls -l /grid/ilc/prod/clic/3tev/aa_e3e3_o/gen/00003121/000/aa_e3e3_o_gen_3121_67.stdhep

#list replicas
lcg-lr lfn:/grid/ilc/prod/clic/3tev/aa_e3e3_o/gen/00003121/000/aa_e3e3_o_gen_3121_67.stdhep

#give GUID
lcg-lg lfn:/grid/ilc/prod/clic/3tev/aa_e3e3_o/gen/00003121/000/aa_e3e3_o_gen_3121_67.stdhep


#unregister LFN
lcg-uf `lcg-lg $LFN` `lcg-lr $LFN`

>
>
How to access the underlying CEs directly to debug issues: Cream, ARC, Globus, and SEs/FileCatalogs
 
Added:
>
>
 

Checking the Status of Machines, Agents, and Services

Revision 16 2015-04-01 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 17 to 17
 
Changed:
<
<
>
>
 
  • IT Status Board: Incidents, announcements, changes of CERN IT infrastructure

Grid Resource Sites

Revision 15 2015-03-17 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 18 to 18
 
Changed:
<
<
>
>
  • IT Status Board: Incidents, announcements, changes of CERN IT infrastructure
 

Grid Resource Sites

Websites offering overview or portals for the grid:

Revision 14 2015-03-16 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 131 to 131
 Check if there are production files without ancestors:
CheckProductionAncestry -p'prodID' -t'filetype'
Added:
>
>
Also use the dirac-transformation-cli command to check and change files for productions.
 

Revision 13 2015-03-13 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 109 to 109
 
When the castor diskpool is getting full, one needs to unstage files from it.
Clone the ilcdirac/ops git repository and run the script
Changed:
<
<
StagerScripts/UnstageProdFiles -P -F<REC|SIM>
>
>
StagerScripts/UnstageProdFiles -P'prodID' -F'REC|SIM'
 prodID is the first production to check, REC|SIM the file types to unstage.

Added:
>
>

Production Output Data Checking

Sometimes something goes wrong and files that should not exist do exist: jobs can fail while their output files are still picked up before they are removed, jobs are rescheduled for no good reason, or jobs fail between uploading output data and creating removal requests.

Here are some scripts to deal with these outputfiles, these are also in the ilcdirac/ops repository:

Check if Successful jobs have used the same input data for a given production:
CheckProdJobs -P'ProdID'

Check if the output from failed jobs still exists:
CheckFailedJobsFourOutputData -P'ProdID'

Check if there are production files without ancestors:
CheckProductionAncestry -p'prodID' -t'filetype'

 
META TOPICMOVED by="sailer" date="1418045242" from="CLIC.DiracForAdmins" to="CLIC.DiracForShifters"

Revision 12 2015-03-12 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"
Changed:
<
<

ILCDIRAC monitoring pages

>
>

ILCDIRAC monitoring pages

JIRA

https://its.cern.ch/jira/browse/ILCDIRAC
 

Shifter pages to check

Check these pages for anything out of the ordinary.
Line: 52 to 58
  Make something up for TestScript.sh (e.g., echo Hello World)
Added:
>
>

ARC Sites

arcproxy -S ilc
arcsub -j cejobs.xml -c arc-ce02.gridpp.rl.ac.uk arcJob.xrsl
with arcJob.xrsl containing:
&(executable="myscript.sh")
(inputFiles=(myscript.sh "/afs/cern.ch/user/s/sailer/TestSites/myscript.sh"))
(stdout="std.out")
(stderr="std.err")
(outputFiles=("std.out" "") ("std.err" ""))
And something for myscript.sh (e.g., echo Hello World)
 

Accessing the Underlying SE or LCG-FileCatalog

Line: 78 to 99
 
Added:
>
>

Checking the Status of Machines, Agents, and Services

Castor Diskpool, Unstaging Files

When the castor diskpool is getting full, one needs to unstage files from it.
Clone the ilcdirac/ops git repository and run the script
StagerScripts/UnstageProdFiles -P -F<REC|SIM>
prodID is the first production to check, REC|SIM the file types to unstage.

 
META TOPICMOVED by="sailer" date="1418045242" from="CLIC.DiracForAdmins" to="CLIC.DiracForShifters"

Revision 11 2015-02-27 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 52 to 52
  Make something up for TestScript.sh (e.g., echo Hello World)
Added:
>
>

Accessing the Underlying SE or LCG-FileCatalog

source /cvmfs/grid.cern.ch/emi-ui-3.7.3-1_sl6v2/etc/profile.d/setup-emi3-ui-example.sh
voms-proxy-init --voms ilc
export LFC_HOST=`lcg-infosites --vo ilc lfc`

#List files/folders
gfal-ls srm://dcache-se-desy.desy.de/pnfs/desy.de/ilc/fcal
srmls -l srm://dcache-se-desy.desy.de/pnfs/desy.de/ilc/fcal/
lfc-ls -l /grid/ilc/prod/clic/3tev/aa_e3e3_o/gen/00003121/000/aa_e3e3_o_gen_3121_67.stdhep

#list replicas
lcg-lr lfn:/grid/ilc/prod/clic/3tev/aa_e3e3_o/gen/00003121/000/aa_e3e3_o_gen_3121_67.stdhep

#give GUID
lcg-lg lfn:/grid/ilc/prod/clic/3tev/aa_e3e3_o/gen/00003121/000/aa_e3e3_o_gen_3121_67.stdhep


#unregister LFN
lcg-uf `lcg-lg $LFN` `lcg-lr $LFN`

 
META TOPICMOVED by="sailer" date="1418045242" from="CLIC.DiracForAdmins" to="CLIC.DiracForShifters"

Revision 10 2015-01-22 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 29 to 29
 
Added:
>
>

Debugging

Cream Sites

voms-proxy-init --voms ilc
glite-ce-job-submit -a -r  <ComputingElement/queue> TestSite.jdl

Get the CE/queues from lcg-infosites --vo ilc ce, and use the following TestSite.jdl:

[
JobName       = "CheckSite";
Executable    = "TestScript.sh";
StdOutput     = "StdOut";
StdError      = "StdErr";
InputSandbox  = "TestScript.sh";
]
Make something up for TestScript.sh (e.g., echo Hello World)
 
META TOPICMOVED by="sailer" date="1418045242" from="CLIC.DiracForAdmins" to="CLIC.DiracForShifters"

Revision 9 2014-12-09 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Revision 8 2014-12-08 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 30 to 30
 

Added:
>
>
META TOPICMOVED by="sailer" date="1418045242" from="CLIC.DiracForAdmins" to="CLIC.DiracForShifters"

Revision 7 2014-11-17 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 10 to 10
 
Changed:
<
<
  • Pilot Summary: Check if sites has many aborted pilots, or not pilots at all
>
>
  • Pilot Summary: Check if sites has many aborted pilots, or no pilots at all
 
Changed:
<
<
>
>
 

Grid Resource Sites

Websites offering overview or portals for the grid:

Revision 6 2014-11-12 - AndreSailer

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Changed:
<
<
The following links point to useful pages to monitor the status of the system:

>
>

Shifter pages to check

Check these pages for anything out of the ordinary.
 
Changed:
<
<

>
>

Grid Resource Sites

Websites offering overview or portals for the grid:

  • GGus: Ticketing for LCG (and some OSG) sites
  • GStat: Information about Sites, their queues and resources
  • GOCDB: Downtime and Resource overview
 
Changed:
<
<
>
>

Overview sites

 
Changed:
<
<
>
>
The following links point to useful pages to monitor the status of the system:
 
Changed:
<
<
>
>
 
Deleted:
<
<

 
Deleted:
<
<
-- StephanePoss - 09-Jun-2011
 \ No newline at end of file

Revision 5 2013-05-31 - StephanePoss

Line: 1 to 1
 
META TOPICPARENT name="DiracUsage"

ILCDIRAC monitoring pages

Line: 26 to 26
 
Added:
>
>
  -- StephanePoss - 09-Jun-2011 \ No newline at end of file
 