Overview

This page contains a few instructions on how to use data from the grid.

Download a single file

Before submitting to the grid, it is useful to test your scripts locally. To do so, locate and download a single test file as described below.

With the xrdcp command:

xrdcp root://cms-xrd-global.cern.ch//store/path/to/file /some/local/path

From the US, use root://cmsxrootd.fnal.gov//store/path/to/file. If the global redirector is not working, try xrootd-cms.infn.it.

Other ways of getting a file

Download from a specific site
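The redirector choice above can be sketched in Python. This is only an illustration: the helper name xrootd_urls is hypothetical, and the host list simply repeats the three redirectors named in the text.

```python
# Redirector hosts taken from the text above; the helper name is hypothetical.
REDIRECTORS = [
    "cms-xrd-global.cern.ch",  # global redirector
    "cmsxrootd.fnal.gov",      # US redirector
    "xrootd-cms.infn.it",      # fallback named in the text
]


def xrootd_urls(lfn, redirectors=REDIRECTORS):
    """Return one root:// URL per redirector for a logical file name."""
    if not lfn.startswith("/store/"):
        raise ValueError("expected an LFN starting with /store/")
    # "root://<host>/" followed by the LFN yields the double slash used by xrdcp
    return ["root://{}/{}".format(host, lfn) for host in redirectors]


# Print the xrdcp commands to try, in order of preference
for url in xrootd_urls("/store/path/to/file"):
    print("xrdcp", url, "/some/local/path")
```

The commands are only printed, so the list can be inspected before running any of them by hand.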

Locate the file in DAS by searching for file dataset=DATASET

file dataset=/SingleMuon/Run2017H-17Nov2017-v2/MINIAOD

Choose one site (T2_US_MIT in this example) and get the file PFN by executing the following commands (replace the site and file name with the ones you need):

site=T2_US_MIT
lfn=/store/data/Run2017H/SingleMuon/MINIAOD/17Nov2017-v2/90000/FA9FA831-8B34-E811-BA1D-008CFAC93CFC.root
pfl=`curl -ks "https://cmsweb.cern.ch/phedex/datasvc/perl/prod/lfn2pfn?node=${site}&lfn=${lfn}&protocol=srmv2" | grep PFN | cut -d "'" -f4`

Then create a user proxy:

voms-proxy-init -voms cms

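The proxy is written to /tmp/x509up_u{UID}, where UID is your numeric user ID (58751 in this example). Point X509_USER_PROXY at that file and copy the file:

# replace 58751 with your own UID (the output of id -u); bash's built-in UID variable is read-only and cannot be reassigned
proxy_uid=58751
env -i X509_USER_PROXY=/tmp/x509up_u$proxy_uid gfal-copy -n 1 $pfl "file:///`pwd`/miniAOD.root"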
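To avoid typing the UID by hand, the proxy location can be derived from the current user. The sketch below assumes the /tmp/x509up_u{UID} pattern mentioned above; the function name default_proxy_path is hypothetical.

```python
import os


def default_proxy_path(uid=None):
    """Build the /tmp/x509up_u<UID> proxy path described in the text."""
    if uid is None:
        uid = os.getuid()  # numeric ID of the current user
    return "/tmp/x509up_u{}".format(uid)


print(default_proxy_path())       # path for the current user
print(default_proxy_path(58751))  # path for the UID used in the example
```

The returned string can be exported as X509_USER_PROXY before calling gfal-copy.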

Locating a PFN (physical file name)

Alternatively, you can look up the physical file name of a file:

edmFileUtil -d /store/relval/CMSSW_10_6_4/RelValZMM_13/MINIAODSIM/PUpmx25ns_106X_upgrade2018_realistic_v9-v1/10000/DBE18AD9-E36D-B449-B659-A71362DAC57A.root

Submit jobs using crab

Crab operations

After submitting a job via the CRAB3ConfigurationFile, a folder PROJECTFOLDER will appear. You can follow the submission and operate on the task using the crab commands. The full list of crab commands can be found here: CRAB3Commands

Here is the list of the most common commands:

  • Inspect how the submission process proceeds: crab status -d PROJECTFOLDER
    • In case of errors, use crab status --verboseErrors for details
    • To resubmit failed jobs with extra options: crab resubmit -d PROJECTFOLDER --maxmemory=4000 --maxjobruntime=360 --numcores=1 --jobids=1,2
    • To kill a project: crab kill -d PROJECTFOLDER
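  • The results will appear in the directory set by config.Data.outLFNDirBase in the CRAB configuration file (crab_cfg.py in the example further down). To print the corresponding EOS path (the sed expressions assume the value is written in double quotes): grep config.Data.outLFNDirBase crab_cfg.py | awk '{print $3}' | sed -e 's/"\//\/eos\/cms\//g' | sed -e 's/\/group//g' | sed -e 's/"//g'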
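The path conversion done by the sed expressions in the last item can also be written in Python. This is a minimal sketch that mirrors those substitutions (prefix /eos/cms, drop /group, strip quotes); the function name outlfn_to_eos is hypothetical, and whether the resulting path exists depends on your storage site.

```python
def outlfn_to_eos(out_lfn_dir_base):
    """Mirror the sed pipeline: strip quotes, drop '/group', prefix /eos/cms."""
    path = out_lfn_dir_base.strip("\"'")  # remove surrounding quotes, single or double
    path = path.replace("/group", "")     # same effect as sed 's/\/group//g'
    return "/eos/cms" + path


print(outlfn_to_eos("/store/user/mpitt/PROPOG/Alignment/v1"))
```

Unlike the sed version, this helper also accepts a value written in single quotes.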

Production

To reprocess data (or MC) do the following:

  • Set up CMSSW:
cmsrel CMSSW_12_4_3
cd CMSSW_12_4_3/src
cmsenv
  • Produce the config file with cmsDriver.py:
cmsDriver.py RECO -s RAW2DIGI,L1Reco,RECO --data --era Run3 --scenario pp --conditions 124X_dataRun3_Prompt_Candidate_2022_07_26_15_08_24 --eventcontent RECO --datatier RECO --filein file:RAW.root --customise Configuration/DataProcessing/Utils.addMonitoring --python_filename=pset_rereco.py --no_exec -n -1
  • Create the CRAB configuration file (submitted below as crab_cfg.py):
import CRABClient
from CRABClient.UserUtilities import config

config = config()

config.General.requestName = 'Alignment_rereco'
config.General.workArea = 'crab_projects'
config.General.transferOutputs = True

config.JobType.pluginName = 'Analysis'
config.JobType.psetName = 'pset_rereco.py'

config.Data.inputDataset = '/ZeroBias/Run2022A-v1/RAW'
config.Data.inputDBS = 'global'
config.Data.splitting = 'LumiBased'
config.Data.unitsPerJob = 20
config.Data.runRange = '354329-354332'
config.Data.publication = True
config.Data.outLFNDirBase = '/store/user/mpitt/PROPOG/Alignment/v1'

config.Site.storageSite = "T2_CH_CERN"
  • Submit the job:
crab submit -c crab_cfg.py

View jobs

To view running jobs, go to https://monit-grafana.cern.ch/ and open JOBS, then CMS Task Monitoring - Task View.

Debug failed jobs

If some of the jobs have errors, you can rerun the job locally using the following commands:

  • Inspect job ID with crab status --long -d PROJECTFOLDER. If all jobs have errors, then look at the first few jobs. Here we rerun for --jobid=0
  • To see the job log, run crab getlog --short -d PROJECTFOLDER --jobid=0 and inspect the PROJECTFOLDER/crab.log file

To rerun the job locally:

  • Run crab preparelocal -d PROJECTFOLDER.
  • From PROJECTFOLDER/local, execute run_job.sh 1 to run the first job. The job is executed locally after the CMSSW setup is unpacked.
  • Kill the process with Ctrl+C.
  • Run cmsRun -j FrameworkJobReport.xml PSet.py to inspect the output.

To inspect memory usage (you are limited to 2GB by default), execute ps aux in a different shell.

cd JOBNAME/inputs
cmsRun PSet.py
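As an alternative to reading ps aux by eye, the resident memory of the running cmsRun process can be polled from /proc. The sketch below is Linux-only and the function names are hypothetical; it reads the VmRSS field, which the kernel reports in kB.

```python
import os


def parse_vmrss_mb(status_text):
    """Extract VmRSS (reported in kB) from /proc/<pid>/status text, in MB."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1]) / 1024.0
    return None  # no VmRSS line, e.g. for kernel threads


def rss_megabytes(pid):
    """Resident memory of a process in MB (Linux only)."""
    with open("/proc/{}/status".format(pid)) as status:
        return parse_vmrss_mb(status.read())


# Demonstrate on this Python process; pass the cmsRun PID instead
if os.path.exists("/proc/self/status"):
    print(rss_megabytes(os.getpid()))
```

Calling rss_megabytes in a loop with the PID of cmsRun shows how close the job gets to the 2GB default limit.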

Crabcache

Before executing these lines, run export X509_USER_PROXY=/tmp/x509up_u58751 (replace 58751 with your own UID).

  • To get all the files uploaded by a user to the crabcache and the amount of quota (in bytes) they are using:
curl -X GET 'https://cmsweb.cern.ch/crabcache/info?subresource=userinfo&username=mpitt' --key $X509_USER_PROXY --cert $X509_USER_PROXY -k
  • To get more information about one specific file (the file must be owned by the user who makes the query):
curl -X GET 'https://cmsweb.cern.ch/crabcache/info?subresource=fileinfo&hashkey=697a932e19bd2912710fe0322de3eff41a5553f1f9820117a8262f0ebcd3640a' --key $X509_USER_PROXY --cert $X509_USER_PROXY -k
  • To remove a specific file (currently you can only remove your own files; in the future, power users should be able to remove everything):
curl -X GET 'https://cmsweb.cern.ch/crabcache/info?subresource=fileremove&hashkey=697a932e19bd2912710fe0322de3eff41a5553f1f9820117a8262f0ebcd3640a' --key $X509_USER_PROXY --cert $X509_USER_PROXY -k
  • To get the quota each user has in MegaBytes:
curl -X GET 'https://cmsweb.cern.ch/crabcache/info?subresource=basicquota' --key $X509_USER_PROXY --cert $X509_USER_PROXY -k
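The four queries above differ only in their query string, so the URLs can be generated rather than copied. This is a small sketch: the endpoint and parameter names are taken from the curl commands above, while the helper name crabcache_url is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl commands above
CRABCACHE_INFO = "https://cmsweb.cern.ch/crabcache/info"


def crabcache_url(subresource, **params):
    """Build the query URL for one crabcache 'info' subresource."""
    query = {"subresource": subresource}
    query.update(params)  # e.g. username=... or hashkey=...
    return CRABCACHE_INFO + "?" + urlencode(query)


print(crabcache_url("userinfo", username="mpitt"))
print(crabcache_url("basicquota"))
```

The printed URL can be passed to curl together with the --key and --cert options shown above.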

Restoring task folders:

To get the full task list, execute: crab tasks

  • To restore a lost folder: crab remake --task=XXX
  • To clean the cache of a killed job: crab purge FOLDER

Obtaining Luminosity per dataset

The crab report command lists the location of the JSON-formatted report files. Copy the processed-lumis file to your working directory on lxplus: cp PROJECTFOLDER/results/processedLumis.json .

# set up BRIL (the first time, also run the pip install line below)
export PATH=$HOME/.local/bin:/cvmfs/cms-bril.cern.ch/brilconda/bin:$PATH
# pip install --install-option="--prefix=$HOME/.local" brilws
# get lumi from the crab submission:
brilcalc lumi -b "STABLE BEAMS" -i processedLumis.json -c /cvmfs/cms.cern.ch/SITECONF/T0_CH_CERN/JobConfig/site-local-config.xml -u /fb
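Before running brilcalc, it can be useful to check how many lumisections the report actually covers. The sketch below assumes the usual CMS lumi-mask layout, a mapping from run number to a list of [first, last] lumisection ranges; the function names and the example ranges are made up for illustration.

```python
import json


def count_lumis(lumi_mask):
    """Count lumisections in a {run: [[first, last], ...]} mapping."""
    return sum(last - first + 1
               for ranges in lumi_mask.values()
               for first, last in ranges)


def count_lumis_in_file(path="processedLumis.json"):
    """Load a lumi JSON file and count its lumisections."""
    with open(path) as handle:
        return count_lumis(json.load(handle))


# Made-up example: two ranges in a single run
example = {"354329": [[1, 20], [25, 30]]}
print(count_lumis(example))  # 26
```

A count far below what the run range should contain usually points to failed or unfinished jobs.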

Using DAS

Several options exist to retrieve information about a dataset. Here is an example of finding the AOD parent file of a MINIAOD file:

for f in `dasgoclient --query="parent file=/store/data/Run2017D/SingleElectron/MINIAOD/09Aug2019_UL2017-v1/50000/FD85D6D5-1095-EE44-9BDF-202A69E0F25C.root"`; do
dasgoclient --query="child file=$f" | grep AOD/09Aug2019_UL
done
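The same parent/child lookup can be scripted in Python by calling dasgoclient through subprocess. This is a sketch under the assumption that dasgoclient is on the PATH and a valid proxy exists; the function names are hypothetical, and only the --query option shown above is used.

```python
import subprocess


def das_command(query):
    """Argument list for one dasgoclient call."""
    return ["dasgoclient", "--query={}".format(query)]


def das_query(query):
    """Run dasgoclient and return its non-empty output lines."""
    result = subprocess.run(das_command(query), capture_output=True,
                            text=True, check=True)
    return [line for line in result.stdout.splitlines() if line.strip()]


def parents_with_children(lfn, pattern):
    """For each parent of lfn, list its children whose name contains pattern."""
    matches = {}
    for parent in das_query("parent file={}".format(lfn)):
        children = das_query("child file={}".format(parent))
        matches[parent] = [child for child in children if pattern in child]
    return matches
```

Calling parents_with_children with the MINIAOD file name and the string "AOD/09Aug2019_UL" reproduces the shell loop and keeps the parent name next to each match.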

Note: the additional brilcalc option --normtag /afs/cern.ch/user/l/lumipro/public/Normtags/normtag_DATACERT.json did not work for the author.

Accessing grid files in condor

To use the local condor batch system to analyze files located at remote sites, add use_x509userproxy = true to the condor JDL file and set up the proxy in your run file (it is recommended to set the proxy path first):

export X509_USER_PROXY=${HOME}/private/.x509up_${UID}
echo YOURPASSWORD | voms-proxy-init -voms cms -rfc -out ${HOME}/private/.x509up_${UID} -valid 192:00

Debug condor jobs

Full list of jobs: condor_q -nobatch
To connect to a running job: condor_ssh_to_job JobId
If jobs are on hold: condor_q -hold -af HoldReason

Update SSH key in GitHub

On Linux, run:
ssh-keygen -t rsa
cat /afs/cern.ch/user/m/mpitt/.ssh/id_rsa.pub
Go to GitHub, Settings, SSH keys, add a new key, and paste the content of the /afs/cern.ch/user/m/mpitt/.ssh/id_rsa.pub file (use the path to your own public key).

-- MichaelPitt - 2019-12-08

Topic revision: r21 - 2022-08-02 - MichaelPitt
 