WorkBookCRAB2Tutorial
(2016-04-19,
FedericaFanzago
)
<style type="text/css" media="all"> pre { text-align: left; padding: 10px; color: black; font-size: 12px; } pre.command {background-color: lightgrey;} pre.cfg {background-color: lightblue;} pre.code {background-color: lightpink;} pre.output {background-color: lightgreen;} div{ font-family:arial,verdana,sans-serif; font-size:13px; margin-top:15px; margin-bottom:15px; width: 100%; white-space: pre-wrap; /* css-3 */ white-space: -moz-pre-wrap; /* Mozilla, since 1999 */ white-space: -pre-wrap; /* Opera 4-6 */ white-space: -o-pre-wrap; /* Opera 7 */ word-wrap: break-word; /* Internet Explorer 5.5+ */ } div.command {background-color: lightgrey;} div.cfg {background-color: lightblue;} div.code {background-color: lightpink;} div.output {background-color: lightgreen;} </style> ---+ 5.6.1 Running CMSSW code on the Grid using !CRAB2 (%RED%for the CRAB3 tutorial please click [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookCRAB3Tutorial][HERE]] %ENDCOLOR%) %COMPLETE5% %BR% [[#ReviewStatus][Detailed Review status]] ---++ WARNING * *You should always use the [[https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrab#How_to_get_CRAB][latest production CRAB version]]* * %RED%This tutorial *is* outdated%ENDCOLOR%: it was prepared for a live lesson at a specific time and thus refers to a particular dataset and CMSSW version that may not be available when (and where) you try it. 
* *as of 2014 you should be able to kickstart your CRAB work using CMSSW_5_3_11 and the dataset /GenericTTbar/HC-CMSSW_5_3_1_START53_V5-v1/GEN-SIM-RECO as MC data and /SingleMu/Run2012B-13Jul2012-v1/AOD as real data.* Contents: %TOC% #PreRequisites ---++ Prerequisites to run the tutorial * to have a valid Grid certificate * to be registered to the CMS virtual organization * *to get the Grid certificate and to register to the VO CMS please follow the [[https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideRunningGridPrerequisites][CRAB howto instructions]]* * to be registered in SiteDB * *please follow 
the instructions at [[https://twiki.cern.ch/twiki/bin/view/CMS/SiteDBForCRAB][siteDB registration for CRAB]]* * to have access to lxplus machines or to an SLC5 User Interface #SetUpEnv1 ---++ Recipe for the tutorial For this tutorial we will refer to !CMS software: * *CMSSW_5_3_11* and we will use an already prepared CMSSW analysis code to analyze the sample: * The tutorial will focus on the basic workflow using the dataset: _/RelValProdTTbar/JobRobot-MC_3XY_V24_JobRobot-v1/GEN-SIM-DIGI-RECO_ (MC dataset) and _/SingleMu/Run2012B-13Jul2012-v1/AOD_ (real data): [[#RealData][CRAB configuration file for real data with lumi mask]] We will use the central installation of CRAB available at CERN: * *CRAB_2_9_1* The example is written for the _csh_ shell family. If you want to use the Bourne shell, replace _csh_ with _sh_. *Legend of colors for this tutorial* <verbatim class="command"> BEIGE background for the commands to execute (cut&paste) </verbatim> <verbatim style="font-size: 13px" class="output"> GREEN background for the output sample of the executed commands (nearly what you should see in your terminal) </verbatim> %SYNTAX{ syntax="sh" style="width:200"}% BLUE background for the configuration files (cut&paste) %ENDSYNTAX% #SetUpEnv ---++ Setup local Environment and prepare user analysis code In order to submit jobs to the Grid, you *must* have access to an LCG User Interface (LCG UI). It will allow you to access WLCG-affiliated resources in a fully transparent way. LXPLUS users can get an LCG UI via AFS by: <verbatim class="command"> source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.csh </verbatim> Install a CMSSW project in a directory of your choice. 
In this case we create a "TEST" directory: <verbatim class="command"> mkdir TEST cd TEST cmsrel CMSSW_5_3_11 #cmsrel is an alias of scramv1 project CMSSW CMSSW_5_3_11 cd CMSSW_5_3_11/src/ cmsenv #cmsenv is an alias for scramv1 runtime -csh </verbatim> For this tutorial we are going to use the following CMSSW configuration file, tutorial.py: %SYNTAX{ syntax="python"}% import FWCore.ParameterSet.Config as cms process = cms.Process('Slurp') process.source = cms.Source("PoolSource", fileNames = cms.untracked.vstring()) process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(10) ) process.options = cms.untracked.PSet( wantSummary = cms.untracked.bool(True) ) process.output = cms.OutputModule("PoolOutputModule", outputCommands = cms.untracked.vstring("drop *", "keep recoTracks_*_*_*"), fileName = cms.untracked.string('outfile.root'), ) process.out_step = cms.EndPath(process.output) %ENDSYNTAX% #SetUpCRABEnv ---++ !CRAB setup %BLUE%Setup on lxplus:%ENDCOLOR% In order to set up and use !CRAB from any directory, source the script =crab.(c)sh= located in =/afs/cern.ch/cms/ccs/wm/scripts/Crab/=, which always points to the latest version of !CRAB. After sourcing the script it is possible to use !CRAB from any directory (typically your CMSSW working directory). <verbatim class="command"> source /afs/cern.ch/cms/ccs/wm/scripts/Crab/crab.csh </verbatim> *Warning*: in order to have the correct environment, the environment files must always be sourced in this order: * source of UI env * setup of CMSSW software * source of !CRAB env #LocateCfg ---++ Locate the dataset and prepare !CRAB submission In order to run our analysis over a whole dataset, we first have to find the dataset name and then put it in the =crab.cfg= configuration file. #SelectData ---+++ Data selection To select the data you want to access, use the *DAS* web page, where available datasets are listed: [[https://cmsweb.cern.ch/das/][Data Aggregation Service (DAS)]]. 
For this tutorial we'll use: %SYNTAX{ syntax="sh" style="width:200"}% /RelValProdTTbar/JobRobot-MC_3XY_V24_JobRobot-v1/GEN-SIM-DIGI-RECO (MC data) %ENDSYNTAX% * Beware: dataset availability at sites changes with time; if you are following this tutorial after the date it was given, you may need to use a different dataset #SetConfiguration ---++ !CRAB configuration Modify the !CRAB configuration file =crab.cfg= according to your needs: a fully documented template is available at =$CRABPATH/full_crab.cfg=, and a template with the essential parameters is available at =$CRABPATH/crab.cfg=. The default name of the configuration file is =crab.cfg=, but you can rename it as you wish. *Copy one of these files into your local area*. For guidance, see the list and description of configuration parameters in the on-line [[http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/Docs/crab-online-manual.html][CRAB manual]]. For this tutorial, the only relevant sections of the file are =[CRAB]=, =[CMSSW]= and =[USER]=. #MainConfiguration ---+++ Configuration parameters The main parameters you need to specify in your =crab.cfg=: * *pset*: the CMSSW configuration file name; * *output_file*: the output file name produced by your pset; if in the !CMSSW pset the output is defined in a !TFileService, the file is automatically handled by !CRAB and there is no need to specify it in this parameter; * *datasetpath*: the full name of the dataset you want to analyze; * <i><b>Job splitting</b></i>: * By event: only for MC data. 
You need to specify 2 of these parameters: *total_number_of_events*, *number_of_jobs*, *events_per_job* * specify _total_number_of_events_ and _number_of_jobs_: this will assign to each job a number of events equal to _total_number_of_events/number_of_jobs_ * specify _total_number_of_events_ and _events_per_job_: this will assign _events_per_job_ events to each job and will calculate the number of jobs as _total_number_of_events/events_per_job_; * or you can specify _number_of_jobs_ and _events_per_job_; * By lumi: required for real data. You need to specify 2 of these parameters: *total_number_of_lumis*, *lumis_per_job*, *number_of_jobs* * because jobs in split-by-lumi mode process entire files rather than partial files, you will often end up with fewer jobs processing more lumis than expected. Additionally, a single job cannot analyze files from multiple blocks in DBS. So these parameters are "advice" to CRAB rather than strict requirements. * specify _lumis_per_job_ and _number_of_jobs_: the total number of lumis processed will be _number_of_jobs_ x _lumis_per_job_ * or you can specify _total_number_of_lumis_ and _number_of_jobs_ * *lumi_mask*: the filename of a JSON file that describes which runs and lumis to process. CRAB will skip luminosity blocks not listed in the file. 
* *return_data*: this can be 0 or 1; if it is 1 you will retrieve your output files to your local working area; * *copy_data*: this can be 0 or 1; if it is 1 you will copy your output files to a remote Storage Element; * *local_stage_out*: this can be 0 or 1; if it is 1 and the copy to the SE specified in your crab.cfg fails, your produced output is copied to the close SE; * *publish_data*: this can be 0 or 1; if it is 1 you can publish your produced data in a local !DBS; * *scheduler*: the name of the scheduler you want to use; * *jobtype*: the type of the jobs. #SeCopy ---++ Run CRAB on CMS.MonteCarlo data copying the output to a Storage Element Copying the output to an existing *Storage Element* allows you to bypass the output size limit, to publish the data in a local !DBS and then to easily re-run over the published data. In order to make !CRAB copy the output to a Storage Element, you have to add the following information to the CRAB configuration file: * specify that we want to copy our results by setting *copy_data=1* and *return_data=0* (it is not allowed to have both set to 1); * add the *official CMS site name* where we are going to copy our results; the names of official CMS sites can be found in [[https://cmsweb.cern.ch/sitedb/sitelist/][siteDB]] <!--#Conf1--> ---+++ !CRAB configuration file for CMS.MonteCarlo data You can find more details on this at the corresponding link on the [[https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrabFaq#How_to_store_output_with_CRAB_2][CRAB FAQ page]]. 
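To make the by-event splitting arithmetic described above concrete, here is an illustrative Python sketch. This is not CRAB code, and the function name is invented for this example; it only mirrors the relations between the three parameters.

```python
# Illustrative sketch (not CRAB code): how the by-event splitting
# parameters relate. Given two of the three, the third follows.
def split_by_event(total_number_of_events=None, number_of_jobs=None,
                   events_per_job=None):
    if total_number_of_events is not None and number_of_jobs is not None:
        events_per_job = total_number_of_events // number_of_jobs
    elif total_number_of_events is not None and events_per_job is not None:
        # round up so all requested events are covered
        number_of_jobs = -(-total_number_of_events // events_per_job)
    elif number_of_jobs is not None and events_per_job is not None:
        total_number_of_events = number_of_jobs * events_per_job
    else:
        raise ValueError("specify two of the three parameters")
    return total_number_of_events, number_of_jobs, events_per_job

# The crab.cfg used in this tutorial sets total_number_of_events = 10
# and number_of_jobs = 5, so each job processes 2 events:
print(split_by_event(total_number_of_events=10, number_of_jobs=5))
```

Note that, as the tutorial's creation output shows, CRAB "may not create the exact number_of_jobs requested", so this arithmetic is indicative rather than exact.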
The !CRAB configuration file (default name crab.cfg) should be located in the same directory as the !CMSSW parameter-set to be used by !CRAB, with the following content: %SYNTAX{ syntax="sh"}% [CMSSW] total_number_of_events = 10 number_of_jobs = 5 pset = tutorial.py datasetpath = /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO output_file = outfile.root [USER] return_data = 0 copy_data = 1 storage_element = T2_xx_yyyy (replace with the CMS name of a site where you can write outputs) user_remote_dir = TutGridSchool [CRAB] scheduler = remoteGlidein jobtype = cmssw %ENDSYNTAX% #SetRunCrab ---+++ Run Crab Once your =crab.cfg= is ready and the whole underlying environment is set up, you can start running !CRAB. !CRAB provides command line help, which can be useful the first time. You can get it via: <verbatim class="command"> crab -h </verbatim> #JobCreation ---+++ Job Creation The job creation step checks the availability of the selected dataset and prepares *all* the jobs for submission according to the job splitting specified in the crab.cfg: * By default the creation process creates a !CRAB project directory (default: crab_0_date_time) in the current working directory, where the related crab configuration file is cached for further usage, avoiding interference with other (already created) projects * Using the [USER] _ui_working_dir_ parameter in the configuration file, !CRAB allows the user to choose the project name, so that it can be used later to distinguish multiple !CRAB projects in the same directory. <verbatim class="command"> crab -create </verbatim> This by default uses the configuration file called crab.cfg, associated in this tutorial with MC data. The creation command may ask for proxy/myproxy passwords the first time you use it, and should produce screen output similar to: <!-- 
Working options: scheduler glite job type CMSSW server ON (use_server) working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_120724_175934/ Enter GRID pass phrase: Your identity: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=fanzago/CN=610896/CN=Federica Fanzago Creating temporary proxy .............................. Done Contacting lcg-voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "cms" Failed Error: cms: Problems in DB communication. Trying next server for cms. Creating temporary proxy ................................................................. Done Contacting voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch] "cms" Done Creating proxy ........................ Done Your proxy is valid until Wed Aug 1 17:59:40 2012 crab: Contacting Data Discovery Services ... crab: Accessing DBS at: http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet crab: Requested dataset: /RelValBeamHalo/CMSSW_5_2_5_cand1-START52_V9-v1/GEN-SIM-DIGI-RAW has 9000 events in 1 blocks. crab: May not create the exact number_of_jobs requested. crab: 5 job(s) can run on 10 events. crab: List of jobs and available destination sites: Block 1: jobs 1-5: sites: T2_CH_CERN crab: Checking remote location crab: Creating 5 jobs, please wait... crab: Total of 5 jobs created. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_120724_175934/log/crab.log * the project directory called crab_0_120724_175934 is created --> <verbatim style="font-size: 13px" class="output"> $ crab -create crab: Version 2.9.1 running on Fri Oct 11 15:33:18 2013 CET (13:33:18 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/ crab: error detecting glite version crab: error detecting glite version crab: Contacting Data Discovery Services ... 
crab: Accessing DBS at: http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet crab: Requested dataset: /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO has 9513 events in 1 blocks. crab: SE black list applied to data location: ['srm-cms.cern.ch', 'srm-cms.gridpp.rl.ac.uk', 'T1_DE', 'T1_ES', 'T1_FR', 'T1_IT', 'T1_RU', 'T1_TW', 'cmsdca2.fnal.gov', 'T3_US_Vanderbilt_EC2'] crab: May not create the exact number_of_jobs requested. crab: 5 job(s) can run on 10 events. crab: List of jobs and available destination sites: Block 1: jobs 1-5: sites: T2_CH_CERN, T1_US_FNAL_MSS crab: Checking remote location crab: Creating 5 jobs, please wait... crab: Total of 5 jobs created. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/log/crab.log </verbatim> * the project directory called crab_0_131011_153317 is created <!-- $ crab -create crab: Version 2.8.5 running on Wed Feb 20 17:39:32 2013 CET (16:39:32 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/ Enter GRID pass phrase: Your identity: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=fanzago/CN=610896/CN=Federica Fanzago Creating temporary proxy .................................. Done Contacting voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch] "cms" Done Creating proxy ............................................... Done Your proxy is valid until Thu Feb 28 17:40:02 2013 verify if user DN is mapped in CERN's SSO OK. user ready for SiteDB switchover on March 12, 2013 crab: Contacting Data Discovery Services ... crab: Accessing DBS at: http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet crab: Requested dataset: /RelValProdTTbar/JobRobot-MC_3XY_V24_JobRobot-v1/GEN-SIM-DIGI-RECO has 300000 events in 1 blocks. crab: May not create the exact number_of_jobs requested. crab: 5 job(s) can run on 50 events. 
crab: List of jobs and available destination sites: Block 1: jobs 1-5: sites: T2_HU_Budapest, T2_CH_CSCS, T2_ES_IFCA, T2_FR_CCIN2P3, T2_IT_Bari, T2_RU_SINP, T3_IT_Bologna, T2_KR_KNU, T2_UK_SGrid_Bristol, T2_FR_GRIF_LLR, T2_RU_INR, T2_CN_Beijing, T2_US_MIT, T2_RU_PNPI, T2_TR_METU, T2_UK_London_IC, T2_DE_DESY, T2_TW_Taiwan, T2_US_UCSD, T2_RU_RRC_KI, T2_PL_Warsaw, T2_PT_LIP_Lisbon, T2_US_Caltech, T2_PT_NCG_Lisbon, T2_BR_SPRACE, T2_IT_Rome, T2_US_Purdue, T2_BE_IIHE, T2_IT_Legnaro, T2_ES_CIEMAT, T2_DE_RWTH, T2_RU_JINR, T2_CH_CERN, T2_FR_GRIF_IRFU, T2_UA_KIPT, T2_UK_SGrid_RALPP, T2_PK_NCP, T2_UK_London_Brunel, T2_RU_IHEP, T2_IT_Pisa, T2_IN_TIFR, T2_US_Vanderbilt, T2_US_Florida, T2_RU_ITEP, T2_FR_IPHC, T2_BE_UCL, T2_US_Wisconsin, T2_US_Nebraska, T3_UK_London_RHUL, T2_FI_HIP, T2_EE_Estonia crab: Checking remote location crab: Creating 5 jobs, please wait... crab: Total of 5 jobs created. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/log/crab.log --> #JobSubmission ---+++ Job Submission With the submission command it's possible to specify a combination of jobs and job-ranges separated by comma (e.g.: =1,2,3-4), the default is all. To submit all jobs of the last created project with the default name, it's enough to execute the following command: <verbatim class="command"> crab -submit </verbatim> to submit a specific project: <verbatim class="command"> crab -submit -c <dir name> </verbatim> which should produce a similar screen output like: <verbatim class="output" style="font-size: 13px"> $ crab -submit crab: Version 2.9.1 running on Fri Oct 11 15:33:34 2013 CET (13:33:34 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/ crab: error detecting glite version crab: error detecting glite version crab: Checking available resources... 
crab: Found compatible site(s) for job 1 crab: 1 blocks of jobs will be submitted crab: remotehost from Avail.List = vocms83.cern.ch crab: contacting remote host vocms83.cern.ch crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: COPY FILES TO REMOTE HOST crab: SUBMIT TO REMOTE GLIDEIN FRONTEND Submitting 5 jobs 100% [====================================================================================================================================================] please wait crab: Total of 5 jobs submitted. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/log/crab.log </verbatim> <!-- $ crab -submit crab: Version 2.8.5 running on Wed Feb 20 17:42:10 2013 CET (16:42:10 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/ crab: Checking available resources... crab: Found compatible site(s) for job 1 crab: 1 blocks of jobs will be submitted crab: remotehost from Avail.List = submit-2.t2.ucsd.edu crab: contacting remote host submit-2.t2.ucsd.edu crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: COPY FILES TO REMOTE HOST crab: SUBMIT TO REMOTE GLIDEIN FRONTEND Submitting 5 jobs 100% [=================================================================================================] please wait crab: Total of 5 jobs submitted. 
Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/log/crab.log --> #JobStatusCheck ---+++ Job Status Check Check the status of the jobs in the latest !CRAB project with the following command: <verbatim class="command"> crab -status </verbatim> to check a specific project: <verbatim class="command"> crab -status -c <dir name> </verbatim> which should produce a similar screen output like: <verbatim style="font-size: 13px" class="output"> $ crab -status crab: Version 2.9.1 running on Fri Oct 11 15:42:49 2013 CET (13:42:49 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/ crab: error detecting glite version crab: error detecting glite version crab: Checking the status of all jobs: please wait crab: contacting remote host vocms83.cern.ch crab: ID END STATUS ACTION ExeExitCode JobExitCode E_HOST ----- --- ----------------- ------------ ---------- ----------- --------- 1 N Running SubSuccess cmsosgce.fnal.gov 2 N Running SubSuccess cmsosgce.fnal.gov 3 N Running SubSuccess cmsosgce.fnal.gov 4 N Running SubSuccess cmsosgce.fnal.gov 5 N Running SubSuccess cmsosgce.fnal.gov crab: 5 Total Jobs >>>>>>>>> 5 Jobs Running List of jobs Running: 1-5 crab: You can also follow the status of this task on : CMS Dashboard: http://dashb-cms-job-task.cern.ch/taskmon.html#task=fanzago_crab_0_131011_153317_hg41w0 Your task name is: fanzago_crab_0_131011_153317_hg41w0 Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/log/crab.log </verbatim> <!-- $ crab -status crab: Version 2.8.5 running on Wed Feb 20 17:43:04 2013 CET (16:43:04 UTC) crab. 
Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/ crab: Checking the status of all jobs: please wait crab: contacting remote host submit-2.t2.ucsd.edu crab: ID END STATUS ACTION ExeExitCode JobExitCode E_HOST ----- --- ----------------- ------------ ---------- ----------- --------- 1 N Submitted SubSuccess 2 N Submitted SubSuccess 3 N Submitted SubSuccess 4 N Submitted SubSuccess 5 N Submitted SubSuccess crab: 5 Total Jobs >>>>>>>>> 5 Jobs Submitted List of jobs Submitted: 1-5 crab: You can also follow the status of this task on : CMS Dashboard: http://dashb-cms-job-task.cern.ch/taskmon.html#task=fanzago_crab_0_130220_173930_68zw1c Your task name is: fanzago_crab_0_130220_173930_68zw1c Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/log/crab.log --> #JobOutputRetrieval ---+++ Job Output Retrieval For the jobs which are in the "Done" status it is possible to retrieve the log files of the jobs (just the log files, because the output files are copied to the Storage Element associated to the T2 specified on the crab.cfg and infact return_data is 0). The following command retrieves the log files of all "Done" jobs of the last created !CRAB project: <verbatim class="command"> crab -getoutput </verbatim> to get the output of a specific project: <verbatim class="command"> crab -getoutput -c <dir name> </verbatim> the job results (CMSSW_n.stdout, CMSSW_n.stderr and crab_fjr_n.xml) will be copied in the =res= subdirectory of your crab project: <verbatim style="font-size: 13px" class="output"> $ crab -get crab: Version 2.9.1 running on Fri Oct 11 16:17:23 2013 CET (14:17:23 UTC) crab. 
Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/ crab: error detecting glite version crab: error detecting glite version crab: contacting remote host vocms83.cern.ch crab: Preparing to rsync 2 files crab: Results of Jobs # 1 are in /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/TUTORIAL/TUT_5_3_11/SLC6/crab_0_131011_153317/res/ crab: contacting remote host vocms83.cern.ch crab: Preparing to rsync 8 files crab: Results of Jobs # 2 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/crab_0_131011_153317/res/ crab: Results of Jobs # 3 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/crab_0_131011_153317/res/ crab: Results of Jobs # 4 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/crab_0_131011_153317/res/ crab: Results of Jobs # 5 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/crab_0_131011_153317/res/ Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/crab_0_131011_153317/log/crab.log </verbatim> <!-- $ crab -get crab: Version 2.8.5 running on Wed Feb 20 20:17:02 2013 CET (19:17:02 UTC) crab. 
Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/ crab: contacting remote host submit-2.t2.ucsd.edu crab: RETRIEVE FILE out_files_1.tgz for job #1 crab: RETRIEVE FILE crab_fjr_1.xml for job #1 crab: Results of Jobs # 1 are in /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/res/ crab: contacting remote host submit-2.t2.ucsd.edu crab: RETRIEVE FILE out_files_2.tgz for job #2 crab: RETRIEVE FILE crab_fjr_2.xml for job #2 crab: RETRIEVE FILE out_files_3.tgz for job #3 crab: RETRIEVE FILE crab_fjr_3.xml for job #3 crab: RETRIEVE FILE out_files_4.tgz for job #4 crab: RETRIEVE FILE crab_fjr_4.xml for job #4 crab: RETRIEVE FILE out_files_5.tgz for job #5 crab: RETRIEVE FILE crab_fjr_5.xml for job #5 crab: Results of Jobs # 2 are in /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/res/ crab: Results of Jobs # 3 are in /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/res/ crab: Results of Jobs # 4 are in /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/res/ crab: Results of Jobs # 5 are in /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/res/ Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/log/crab.log --> The stderr is an empty file, the stdout is the output of the wrapper of your analysis code (the output of CMSSW.sh script created by CRAB) and the crab_fjr.xml is the FrameworkJobReport created by your analysis code. #CrabReport ---+++ Use the -report option Print a short report about the task, namely the total number of events and files processed/requested/available, the name of the dataset path, a summary of the status of the jobs, and so on. A summary file of the runs and luminosity sections processed is written to res/. In principle -report should generate all the info needed for an analysis. 
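The luminosity summary file written to res/ is plain JSON mapping each run number to a list of [first_lumi, last_lumi] ranges. As an illustrative sketch (plain Python, not a CRAB command; the =count_lumis= helper is invented for this example), it can be inspected like this:

```python
import json

# Count the luminosity sections recorded in a lumiSummary.json written
# by "crab -report" under res/. The file maps run numbers (as strings)
# to lists of [first_lumi, last_lumi] ranges.
def count_lumis(summary):
    return sum(last - first + 1
               for ranges in summary.values()
               for first, last in ranges)

# Same structure as the lumiSummary.json shown later in this tutorial:
summary = json.loads('{"1": [[666666, 666666]]}')
print(count_lumis(summary))  # a single lumi section in run 1
```

The same JSON structure is used by lumi_mask files, so a file like this can also be fed back to a later CRAB task.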
Command to execute: <verbatim class='command'> crab -report </verbatim> Example of execution: <verbatim style="font-size: 13px" class="output"> $ crab -report crab: Version 2.9.1 running on Fri Oct 11 17:02:17 2013 CET (15:02:17 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/ crab: error detecting glite version crab: error detecting glite version crab: -------------------- Dataset: /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO Remote output : SE: T2_CH_CERN srm-eoscms.cern.ch srmPath: srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms/store/user/fanzago/TutGridSchool_test/ Total Events read: 10 Total Files read: 5 Total Jobs : 5 Luminosity section summary file: /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/res/lumiSummary.json # Jobs: Retrieved:5 ---------------------------- crab: The summary file inputLumiSummaryOfTask.json about input run and lumi isn't created crab: No json file to compare </verbatim> <!-- $ crab -report crab: Version 2.8.5 running on Thu Feb 21 02:17:06 2013 CET (01:17:06 UTC) crab. 
Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/ crab: -------------------- Dataset: /RelValProdTTbar/JobRobot-MC_3XY_V24_JobRobot-v1/GEN-SIM-DIGI-RECO Remote output : SE: T2_IT_Legnaro t2-srm-02.lnl.infn.it srmPath: srm://t2-srm-02.lnl.infn.it:8443/srm/managerv2?SFN=/pnfs/lnl.infn.it/data/cms/store/user/fanzago/TutGridSchool/ Total Events read: 50 Total Files read: 5 Total Jobs : 5 Luminosity section summary file: /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/res/lumiSummary.json # Jobs: Retrieved:5 ---------------------------- crab: The summary file inputLumiSummaryOfTask.json about input run and lumi isn't created crab: No json file to compare Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/log/crab.log --> The message "The summary file inputLumiSummaryOfTask.json about input run and lumi isn't created" is not an error: it means the input data did not provide luminosity section info, as expected for MC data. The full srm path allows you to know where your data has been stored and to perform operations on it by hand. For example, you can delete the data using the *srmrm* command and check the content of the remote directory with *srmls*. In this case the remote directory is: %SYNTAX{ syntax="sh"}% srm://srm-eoscms.cern.ch:8443/srm/v2/server?SFN=/eos/cms/store/user/fanzago/TutGridSchool_test %ENDSYNTAX% Depending on the shell you are using, it may be necessary to quote the =?= in the srm path (i.e. write it as ="?"=). Additional srm commands include =srmrm=, =srmrmdir=, and =srmmv= for moving files within an srm system, and =srmcp=, which can copy files locally. Note that to copy files locally, =srmcp= may require the additional flag "-2" to ensure that the version 2 client is used. 
Here is the content of the file containing the luminosity summary, _/crab_0_130220_173930/res/lumiSummary.json_: <verbatim style="font-size: 13px" class="output"> {"1": [[666666, 666666]]} </verbatim> #CopyData ---+++ Copy the output from the SE to the local User Interface This option can be used only if your output has previously been copied by CRAB to a remote SE. By default, -copyData copies your output from the remote SE to the local CRAB working directory (under res). Otherwise you can copy the output from the remote SE to another one, specifying either -dest_se=<the remote SE official name> or -dest_endpoint=<the complete endpoint of the remote SE>. If dest_se is used, CRAB finds the correct path where the output can be stored. The command to execute in order to retrieve the remote output files to your local User Interface is: <verbatim class='command'> crab -copyData ## or crab -copyData -c <dir name> </verbatim> An example of execution: <verbatim style="font-size: 13px" class="output"> $ crab -copyData crab: Version 2.9.1 running on Fri Oct 11 17:08:38 2013 CET (15:08:38 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/ crab: error detecting glite version crab: error detecting glite version crab: Copy file locally. Output dir: /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/res/ crab: Starting copy... 
directory/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/res/already exists crab: Copy success for file: outfile_4_1_Jlr.root crab: Copy success for file: outfile_3_1_MsR.root crab: Copy success for file: outfile_1_1_HF3.root crab: Copy success for file: outfile_2_1_cVA.root crab: Copy success for file: outfile_5_1_gAw.root Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131011_153317/log/crab.log </verbatim> <!-- $ crab -copyData crab: Version 2.8.5 running on Thu Feb 21 02:49:18 2013 CET (01:49:18 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/ crab: Copy file locally. Output dir: /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/res/ crab: Starting copy... directory/afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/res/already exists crab: Copy success for file: outfile_1_1_aOu.root crab: Copy failed for file: outfile_4_1_Pi9.root Copy failed because : Problem copying outfile_4_1_Pi9.root file'Permission denied!' crab: Copy success for file: outfile_2_1_bC1.root crab: Copy success for file: outfile_5_1_yna.root crab: Copy success for file: outfile_3_1_96A.root Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130220_173930/log/crab.log --> #CrabPub ---+++ Publish your results in DBS The publication of the produced data to !DBS allows you (and others) to re-run over the data that has been published. The instructions to follow are below; here is the link to the [[https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrabForPublication][how to]]. You have to add more information to the Crab configuration file, specifying that you want to publish and the name under which to publish the data. %SYNTAX{ syntax="sh"}% [USER] .... publish_data = 1 publish_data_name = what_you_want ....
%ENDSYNTAX% Warning: * All the parameters related to publication have to be added to the configuration file before job creation, even though the publication step is executed after the job output has been retrieved. * Publication is done in the phys03 instance of DBS3. If you belong to a !PAG group, you have to publish your data to the !DBS associated with your group, checking at the [[https://twiki.cern.ch/twiki/bin/view/CMS/DBSInstanceAccessList][DBS access twiki page]] the correct !DBS url and which voms role you need to be an allowed user. * Remember to change the _ui_working_dir_ value in the configuration file to create a new project (if you don't use the default name of the crab project), otherwise the creation step will fail with the error message "project already exists, please remove it before create new task ". ---+++ Run Crab publishing your results You can also run your analysis code and publish the results copied to a remote Storage Element. Below is an example of the !CRAB configuration file, consistent with this tutorial: *For MC data* (crab.cfg) %SYNTAX{ syntax="sh"}% [CMSSW] total_number_of_events = 50 number_of_jobs = 10 pset = tutorial.py datasetpath = /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO output_file = outfile.root [USER] return_data = 0 copy_data = 1 storage_element = T2_xx_yyyy publish_data = 1 publish_data_name = FanzagoTutGrid [CRAB] scheduler = remoteGlidein jobtype = cmssw %ENDSYNTAX% With this =crab.cfg= you can re-do the complete workflow as described before, plus the publication step: * creation * submission * status progress monitoring * output retrieval * publish the results #PublishData ---+++ Use the -publish option After having run the previous workflow up to the retrieval of your jobs, you can publish the output data that have been stored in the Storage Element indicated in the =crab.cfg= file using: <verbatim class="command"> crab -publish </verbatim> or, to publish the outputs of a specific project: <verbatim class="command">
crab -publish -c <dir_name> </verbatim> It is not necessary that all the jobs are done and retrieved. You can publish your output at a different time. It will look for all the CMS.FrameworkJobReport files ( crab-project-dir/res/crab_fjr_*.xml ) produced by each job and will extract from there the information (i.e. number of events, LFN, etc.) to publish. ---++++ Publication output example %RED%The output shown below corresponds to an old output using DBS2.%ENDCOLOR% <verbatim style="font-size: 13px" class="output"> $ crab -publish crab: Version 2.9.1 running on Mon Oct 14 14:35:56 2013 CET (12:35:56 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/ crab: <dbs_url_for_publication> = https://cmsdbsprod.cern.ch:8443/cms_dbs_ph_analysis_02_writer/servlet/DBSServlet file_list = ['/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_1.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_2.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_3.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_4.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_5.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_6.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_7.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_8.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_9.xml', '/afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/res//crab_fjr_10.xml'] crab: --->>> Start dataset publication crab: --->>> Importing parent dataset in the dbs: /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO crab: --->>> Importing all parents level 
----------------------------------------------------------------------------------- Transferring path /RelValZMM/CMSSW_5_2_1-START52_V4-v1/GEN-SIM block /RelValZMM/CMSSW_5_2_1-START52_V4-v1/GEN-SIM#24e1effb-0f0c-4557-bb46-3d5ecae691b8 ----------------------------------------------------------------------------------- ----------------------------------------------------------------------------------- Transferring path /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-DIGI-RAW-HLTDEBUG block /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-DIGI-RAW-HLTDEBUG#13e93136-29ed-11e2-9c63-00221959e7c0 ----------------------------------------------------------------------------------- ----------------------------------------------------------------------------------- Transferring path /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO block /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO#43683124-29f6-11e2-9c63-00221959e7c0 ----------------------------------------------------------------------------------- crab: --->>> duration of all parents import (sec): 552.62570405 crab: Import ok of dataset /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO crab: PrimaryDataset = RelValZMM crab: ProcessedDataset = fanzago-FanzagoTutGrid-f30a6bb13f516198b2814e83414acca1 crab: <User Dataset Name> = /RelValZMM/fanzago-FanzagoTutGrid-f30a6bb13f516198b2814e83414acca1/USER crab: --->>> End dataset publication crab: --->>> Start files publication crab: --->>> End files publication crab: --->>> Check data publication: dataset /RelValZMM/fanzago-FanzagoTutGrid-f30a6bb13f516198b2814e83414acca1/USER in DBS url https://cmsdbsprod.cern.ch:8443/cms_dbs_ph_analysis_02_writer/servlet/DBSServlet === dataset /RelValZMM/fanzago-FanzagoTutGrid-f30a6bb13f516198b2814e83414acca1/USER === dataset description = ===== File block name: /RelValZMM/fanzago-FanzagoTutGrid-f30a6bb13f516198b2814e83414acca1/USER#787d164e-b485-4a23-b334-a8abde3fe146 File block located at: ['t2-srm-02.lnl.infn.it'] File block status: 0 Number 
of files: 10 Number of Bytes: 33667525 Number of Events: 50 total events: 50 in dataset: /RelValZMM/fanzago-FanzagoTutGrid-f30a6bb13f516198b2814e83414acca1/USER Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_123645/log/crab.log </verbatim> Warning: Some versions of CMSSW switch off the debug mode of crab, so a lot of duplicated info can be reported at screen level. ---++++ Analyze your published data First note that: * !CRAB by default publishes all files finished correctly, including files with 0 events * !CRAB by default imports all dataset parents of your dataset You have to modify your =crab.cfg= file specifying the datasetpath name of your dataset and the dbs_url where data are published (we will assume phys03 instance of DBS3): %SYNTAX{ syntax="sh"}% [CMSSW] .... datasetpath = your_dataset_path dbs_url = phys03 %ENDSYNTAX% The creation output will be something similar to: <verbatim style="font-size: 13px" class="output"> $ crab -create crab: Version 2.9.1 running on Mon Oct 14 15:49:31 2013 CET (13:49:31 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_154931/ crab: error detecting glite version crab: error detecting glite version crab: Contacting Data Discovery Services ... crab: Accessing DBS at: https://cmsweb.cern.ch/dbs/prod/phys03/DBSReader crab: Requested dataset: /RelValZMM/fanzago-FanzagoTutGrid-f30a6bb13f516198b2814e83414acca1/USER has 50 events in 1 blocks. crab: SE black list applied to data location: ['srm-cms.cern.ch', 'srm-cms.gridpp.rl.ac.uk', 'T1_DE', 'T1_ES', 'T1_FR', 'T1_IT', 'T1_RU', 'T1_TW', 'cmsdca2.fnal.gov', 'T3_US_Vanderbilt_EC2'] crab: May not create the exact number_of_jobs requested. crab: 10 job(s) can run on 50 events. crab: List of jobs and available destination sites: Block 1: jobs 1-10: sites: T2_IT_Legnaro crab: Checking remote location crab: WARNING: The stageout directory already exists. 
Be careful not to accidentally mix outputs from different tasks crab: Creating 10 jobs, please wait... crab: Total of 10 jobs created. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_154931/log/crab.log </verbatim> The jobs will run at the site where your USER data has been stored. <!-- $ crab -create crab: Version 2.8.5 running on Tue Mar 5 12:19:06 2013 CET (11:19:06 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_121906/ verify if user DN is mapped in CERN's SSO OK. user ready for SiteDB switchover on March 12, 2013 crab: Contacting Data Discovery Services ... crab: Accessing DBS at: https://cmsdbsprod.cern.ch:8443/cms_dbs_ph_analysis_02_writer/servlet/DBSServlet crab: Requested dataset: /RelValProdTTbar/fanzago-FedeTutGrid-c8295e0370df515614ca6812ce2cfe77/USER has 50 events in 1 blocks. crab: May not create the exact number_of_jobs requested. crab: 5 job(s) can run on 50 events. crab: List of jobs and available destination sites: Block 1: jobs 1-5: sites: T2_IT_Legnaro crab: Creating 5 jobs, please wait... crab: Total of 5 jobs created. --> #RealData ---++ Run !CRAB on real data copying the output to an SE Running !CRAB on real data is not very different from running !CRAB on CMS.MonteCarlo data. The main difference lies in the preparation of the configuration for the !CRAB workflow, as shown in the next section. <!--#Conf2--> ---+++ !CRAB configuration file for real data with lumi mask You can find more details on this at the corresponding link on the [[https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrabFaq#How_to_store_output_with_CRAB_2][Crab FAQ page]]. The !CRAB configuration file (default name crab.cfg) should be located at the same location as the !CMSSW parameter-set to be used by !CRAB.
<!-- The dataset used is: _/Mu/Run2010A-Nov4ReReco_v1/RECO_ --> The dataset used is: _/SingleMu/Run2012B-13Jul2012-v1/AOD_ <!-- In this example it is specified the user working directory name _crab_lumi_. Here it is an example for this tutorial: --> *For real data* (crab_lumi.cfg) %SYNTAX{ syntax="sh"}% [CMSSW] lumis_per_job = 50 number_of_jobs = 10 pset = tutorial.py datasetpath = /SingleMu/Run2012B-13Jul2012-v1/AOD lumi_mask = Cert_190456-208686_8TeV_PromptReco_Collisions12_JSON.txt output_file = outfile.root [USER] return_data = 0 copy_data = 1 publish_data = 1 publish_data_name = FanzagoTutGrid_data [CRAB] scheduler = remoteGlidein jobtype = cmssw %ENDSYNTAX% <!-- lumi_mask = Cert_190456-195947_8TeV_PromptReco_Collisions12_JSON_v2.txt --> where the lumi_mask file can be downloaded with <verbatim class="command"> wget --no-check-certificate https://cms-service-dqm.web.cern.ch/cms-service-dqm/CAF/certification/Collisions12/8TeV/Prompt/Cert_190456-208686_8TeV_PromptReco_Collisions12_JSON.txt </verbatim> For the tutorial we are using a subset of runs and lumis (via a lumiMask.json file). The lumi_mask file (Cert_190456-208686_8TeV_PromptReco_Collisions12_JSON.txt) contains: %SYNTAX{ syntax="sh"}% {"190645": [[10, 110]], "190704": [[1, 3]], "190705": [[1, 5], [7, 76], [78, 336], [338, 350], [353, 384]], ... "208551": [[119, 193], [195, 212], [215, 300], [303, 354], [356, 554], [557, 580]], "208686": [[73, 79], [82, 181], [183, 224], [227, 243], [246, 311], [313, 463]]} %ENDSYNTAX% #JobCreation2 ---+++ Job Creation Creating jobs for real data is analogous to Monte Carlo data. To avoid overwriting the previous runs of this tutorial, it is suggested to use a dedicated cfg: <verbatim class="command"> crab -create -cfg crab_lumi.cfg </verbatim> which takes as configuration file the one specified with the -cfg option, in this case the crab_lumi.cfg associated with real data for this tutorial.
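The lumi mask shown above maps run numbers to inclusive [first, last] lumi-section ranges, and a "subset of runs and lumis" can be produced by filtering such a file. A minimal Python sketch, not a CRAB tool (the =select_runs= helper name is made up for illustration):

```python
import json

def select_runs(mask, first_run, last_run):
    """Keep only the runs within [first_run, last_run] from a lumi mask.

    mask: {"run": [[first_lumi, last_lumi], ...]}, as in certification JSONs.
    """
    return {run: ranges for run, ranges in mask.items()
            if first_run <= int(run) <= last_run}

# A few entries from the certification file shown above:
mask = json.loads('{"190645": [[10, 110]], "190704": [[1, 3]], '
                  '"208686": [[73, 79], [82, 181]]}')
subset = select_runs(mask, 190000, 200000)
print(sorted(subset))  # -> ['190645', '190704']
```

Writing the filtered dictionary back out with =json.dump= gives a lumiMask.json usable as the lumi_mask parameter.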
<verbatim style="font-size: 13px" class="output"> $ crab -create -cfg crab_lumi.cfg crab: Version 2.9.1 running on Mon Oct 14 16:05:18 2013 CET (14:05:18 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/ crab: error detecting glite version crab: error detecting glite version crab: Contacting Data Discovery Services ... crab: Accessing DBS at: http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet crab: Requested (A)DS /SingleMu/Run2012B-13Jul2012-v1/AOD has 14 block(s). crab: SE black list applied to data location: ['srm-cms.cern.ch', 'srm-cms.gridpp.rl.ac.uk', 'T1_DE', 'T1_ES', 'T1_FR', 'T1_IT', 'T1_RU', 'T1_TW', 'cmsdca2.fnal.gov', 'T3_US_Vanderbilt_EC2'] crab: Requested number of lumis reached. crab: 9 jobs created to run on 500 lumis crab: Checking remote location crab: Creating 9 jobs, please wait... crab: Total of 9 jobs created. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/log/crab.log </verbatim> <!-- crab: Version 2.8.5 running on Tue Mar 5 14:47:56 2013 CET (13:47:56 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/ verify if user DN is mapped in CERN's SSO OK. user ready for SiteDB switchover on March 12, 2013 crab: Contacting Data Discovery Services ... crab: Accessing DBS at: http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet crab: Requested (A)DS /SingleMu/Run2012B-TOPMuPlusJets-PromptSkim-v1/AOD has 13 block(s). crab: Requested number of lumis reached. crab: 8 jobs created to run on 500 lumis crab: Checking remote location crab: Creating 8 jobs, please wait... crab: Total of 8 jobs created. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/log/crab.log --> * The project directory called crab_0_131014_160518 is created. 
* As explained, the number of created jobs may not match the number of jobs requested in the configuration file (9 created but 10 requested). #CMSJobSubmission2 ---+++ Job Submission Job submission is analogous to before: <verbatim class="output" style="font-size: 13px"> $ crab -submit crab: Version 2.9.1 running on Mon Oct 14 16:07:59 2013 CET (14:07:59 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/ crab: error detecting glite version crab: error detecting glite version crab: Checking available resources... crab: Found compatible site(s) for job 1 crab: 1 blocks of jobs will be submitted crab: remotehost from Avail.List = submit-4.t2.ucsd.edu crab: contacting remote host submit-4.t2.ucsd.edu crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: COPY FILES TO REMOTE HOST crab: SUBMIT TO REMOTE GLIDEIN FRONTEND Submitting 9 jobs 100% [====================================================================================================================================================] please wait crab: Total of 9 jobs submitted. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/log/crab.log </verbatim> <!-- $ crab -submit crab: Version 2.8.5 running on Tue Mar 5 14:54:39 2013 CET (13:54:39 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/ crab: Checking available resources... crab: Found compatible site(s) for job 1 crab: 1 blocks of jobs will be submitted crab: remotehost from Avail.List = submit-2.t2.ucsd.edu crab: contacting remote host submit-2.t2.ucsd.edu crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: Establishing gsissh ControlPath. Wait 2 sec ...
crab: COPY FILES TO REMOTE HOST crab: SUBMIT TO REMOTE GLIDEIN FRONTEND Submitting 8 jobs 100% [=================================================================================================================] please wait crab: Total of 8 jobs submitted. Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/log/crab.log --> #JobStatusCheck2 ---+++ Job Status Check Check the status of the jobs in the latest !CRAB project with the following command: <verbatim class="command"> crab -status </verbatim> to check a specific project: <verbatim class="command"> crab -status -c <dir name> </verbatim> which should produce a similar screen output like: <verbatim style="font-size: 13px" class="output"> [fanzago@lxplus0445 SLC6]$ crab -status crab: Version 2.9.1 running on Mon Oct 14 16:23:52 2013 CET (14:23:52 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/ crab: error detecting glite version crab: error detecting glite version crab: Checking the status of all jobs: please wait crab: contacting remote host submit-4.t2.ucsd.edu crab: ID END STATUS ACTION ExeExitCode JobExitCode E_HOST ----- --- ----------------- ------------ ---------- ----------- --------- 1 N Running SubSuccess ce208.cern.ch 2 N Submitted SubSuccess 3 N Running SubSuccess cream03.lcg.cscs.ch 4 N Running SubSuccess t2-ce-01.lnl.infn.it 5 N Running SubSuccess cream01.lcg.cscs.ch 6 N Running SubSuccess cream01.lcg.cscs.ch 7 N Running SubSuccess ingrid.cism.ucl.ac.be 8 N Running SubSuccess ingrid.cism.ucl.ac.be 9 N Running SubSuccess ce203.cern.ch crab: 9 Total Jobs >>>>>>>>> 1 Jobs Submitted List of jobs Submitted: 2 >>>>>>>>> 8 Jobs Running List of jobs Running: 1,3-9 crab: You can also follow the status of this task on : CMS Dashboard: http://dashb-cms-job-task.cern.ch/taskmon.html#task=fanzago_crab_0_131014_160518_582igd Your task name is: 
fanzago_crab_0_131014_160518_582igd Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/log/crab.log </verbatim> and then ... <verbatim style="font-size: 13px" class="output"> $ crab -status crab: Version 2.9.1 running on Tue Oct 15 10:53:33 2013 CET (08:53:33 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/TUTORIAL/TUT_5_3_11/SLC6/crab_0_131014_160518/ crab: error detecting glite version crab: error detecting glite version crab: Checking the status of all jobs: please wait crab: contacting remote host submit-4.t2.ucsd.edu crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: Establishing gsissh ControlPath. Wait 2 sec ... crab: ID END STATUS ACTION ExeExitCode JobExitCode E_HOST ----- --- ----------------- ------------ ---------- ----------- --------- 1 N Done Terminated 0 0 ce208.cern.ch 2 N Done Terminated 0 60317 cream03.lcg.cscs.ch 3 N Done Terminated 0 60317 cream03.lcg.cscs.ch 4 N Done Terminated 0 0 t2-ce-01.lnl.infn.it 5 N Done Terminated 0 60317 cream01.lcg.cscs.ch 6 N Done Terminated 0 60317 cream01.lcg.cscs.ch 7 N Done Terminated 0 0 ingrid.cism.ucl.ac.be 8 N Done Terminated 0 0 ingrid.cism.ucl.ac.be 9 N Done Terminated 0 0 ce203.cern.ch crab: ExitCodes Summary >>>>>>>>> 4 Jobs with Wrapper Exit Code : 60317 List of jobs: 2-3,5-6 See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning crab: ExitCodes Summary >>>>>>>>> 5 Jobs with Wrapper Exit Code : 0 List of jobs: 1,4,7-9 See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning crab: 9 Total Jobs crab: You can also follow the status of this task on : CMS Dashboard: http://dashb-cms-job-task.cern.ch/taskmon.html#task=fanzago_crab_0_131014_160518_582igd Your task name is: fanzago_crab_0_131014_160518_582igd Log file is 
/afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/TUTORIAL/TUT_5_3_11/SLC6/crab_0_131014_160518/log/crab.log </verbatim> <!-- $ crab -status crab: Version 2.8.5 running on Tue Mar 5 14:59:36 2013 CET (13:59:36 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/ crab: Checking the status of all jobs: please wait crab: contacting remote host submit-2.t2.ucsd.edu crab: ID END STATUS ACTION ExeExitCode JobExitCode E_HOST ----- --- ----------------- ------------ ---------- ----------- --------- 1 N Running SubSuccess cream02.iihe.ac.be 2 N Running SubSuccess cream02.iihe.ac.be 3 N Running SubSuccess cream02.iihe.ac.be 4 N Running SubSuccess cream02.iihe.ac.be 5 N Submitted SubSuccess 6 N Running SubSuccess cream02.iihe.ac.be 7 N Running SubSuccess cream02.iihe.ac.be 8 N Running SubSuccess red-gw2.unl.edu crab: 8 Total Jobs 1 Jobs Submitted List of jobs Submitted: 5 7 Jobs Running List of jobs Running: 1-4,6-8 crab: You can also follow the status of this task on : CMS Dashboard: http://dashb-cms-job-task.cern.ch/taskmon.html#task=fanzago_crab_0_130305_144756_db2r51 Your task name is: fanzago_crab_0_130305_144756_db2r51 --> #JobOutputRetrieval2 ---+++ Job Output Retrieval For the jobs which are in the "Done" status it is possible to retrieve the log files of the jobs (just the log files, because the output files are copied to the Storage Element associated with the T2 specified in the crab.cfg; in fact return_data is 0).
The following command retrieves the log files of all "Done" jobs of the last created !CRAB project: <verbatim class="command"> crab -getoutput </verbatim> to get the output of a specific project: <verbatim class="command"> crab -getoutput -c <dir name> </verbatim> the job results will be copied in the =res= subdirectory of your crab project: <verbatim style="font-size: 13px" class="output"> $ crab -get crab: Version 2.9.1 running on Tue Oct 15 10:53:53 2013 CET (08:53:53 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/ crab: error detecting glite version crab: error detecting glite version crab: contacting remote host submit-4.t2.ucsd.edu crab: Preparing to rsync 2 files crab: Results of Jobs # 1 are in /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/TUTORIAL/TUT_5_3_11/SLC6/crab_0_131014_160518/res/ crab: contacting remote host submit-4.t2.ucsd.edu crab: Preparing to rsync 16 files crab: Results of Jobs # 2 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/ crab: Results of Jobs # 3 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/ crab: Results of Jobs # 4 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/ crab: Results of Jobs # 5 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/ crab: Results of Jobs # 6 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/ crab: Results of Jobs # 7 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/ crab: Results of Jobs # 8 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/ crab: Results of Jobs # 9 are in /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/ Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/log/crab.log </verbatim> <!-- crab: Version 
2.8.5 running on Tue Mar 5 15:15:32 2013 CET (14:15:32 UTC) crab. Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/ crab: contacting remote host submit-2.t2.ucsd.edu crab: RETRIEVE FILE out_files_1.tgz for job #1 crab: RETRIEVE FILE crab_fjr_1.xml for job #1 crab: Results of Jobs # 1 are in /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/crab_0_130305_144756/res/ ... Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/log/crab.log --> #CrabReport ---+++ Use the -report option As for the CMS.MonteCarlo data example, it is possible to run the report command: <verbatim class='command'> crab -report -c <dir name> </verbatim> the report command returns info about correctly finished jobs, that means jobs with !JobExitCode = 0 _and_ !ExeExitCode = 0 <verbatim style="font-size: 13px" class="output"> $ crab -report crab: Version 2.9.1 running on Tue Oct 15 15:55:10 2013 CET (13:55:10 UTC) crab. 
Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/ crab: error detecting glite version crab: error detecting glite version crab: -------------------- Dataset: /SingleMu/Run2012B-13Jul2012-v1/AOD Remote output : SE: T2_IT_Legnaro t2-srm-02.lnl.infn.it srmPath: srm://t2-srm-02.lnl.infn.it:8443/srm/managerv2?SFN=/pnfs/lnl.infn.it/data/cms/store/user/fanzago/SingleMu/FanzagoTutGrid_data/${PSETHASH}/ Total Events read: 264540 Total Files read: 21 Total Jobs : 9 Luminosity section summary file: /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/lumiSummary.json # Jobs: Retrieved:9 ---------------------------- crab: Summary file of input run and lumi to be analize with this task: /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/res/inputLumiSummaryOfTask.json crab: to complete your analysis, you have to analyze the run and lumi reported in the //afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/missingLumiSummary.json file Log file is /afs/cern.ch/user/f/fanzago/scratch0/TUTORIAL/crab_0_131014_160518/log/crab.log </verbatim> <!-- $ crab -report crab: Version 2.8.5 running on Tue Mar 5 15:18:00 2013 CET (14:18:00 UTC) crab. 
Working options: scheduler remoteGlidein job type CMSSW server OFF working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/ crab: -------------------- Dataset: /SingleMu/Run2012B-TOPMuPlusJets-PromptSkim-v1/AOD Remote output : SE: T2_IT_Legnaro t2-srm-02.lnl.infn.it srmPath: srm://t2-srm-02.lnl.infn.it:8443/srm/managerv2?SFN=/pnfs/lnl.infn.it/data/cms/store/user/fanzago/SingleMu/FedeTutGridGlide_data/${PSETHASH}/ Total Events read: 39942 Total Files read: 29 Total Jobs : 8 Luminosity section summary file: /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/crab_0_130305_144756/res/lumiSummary.json # Jobs: Retrieved:8 ---------------------------- crab: Summary file of input run and lumi to be analize with this task: /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/crab_0_130305_144756/res//inputLumiSummaryOfTask.json crab: to complete your analysis, you have to analyze the run and lumi reported in the /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/crab_0_130305_144756/res//missingLumiSummary.json file Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST/crab_0_130305_144756/log/crab.log --> where the content of the files containing the luminosity info about the task is the following: the original lumiMask.json file, specified in the crab.cfg file and used during the creation of your task <verbatim style="font-size: 13px" class="output"> $ cat Cert_190456-208686_8TeV_PromptReco_Collisions12_JSON.txt {"190645": [[10, 110]], "190704": [[1, 3]], "190705": [[1, 5], [7, 65], [81, 336], ....
"208686": [[73, 79], [82, 181], [183, 224], [227, 243], [246, 311], [313, 463]]} </verbatim> the lumi sections that your created jobs have to analyze (that are info used as arguments of your jobs) <verbatim style="font-size: 13px" class="output"> $ cat crab_0_131014_160518/res/inputLumiSummaryOfTask.json {"194305": [[84, 85]], "194108": [[95, 96], [117, 120], [123, 126], [149, 152], [154, 157], [160, 161], [166, 169], [172, 174], [176, 176], [185, 185], [187, 187], [190, 191], [196, 197], [200, 201], [206, 209], [211, 212], [216, 221], [231, 232], [234, 235], [238, 243], [249, 250], [277, 278], [305, 306], [333, 334], [438, 439], [520, 520], [527, 527]], "194120": [[13, 14], [22, 23], [32, 33], [43, 44], [57, 57], [67, 67], [73, 74], [88, 89], [105, 105], [110, 111], [139, 139], [144, 144], [266, 266]], "194224": [[94, 94], [111, 111], [257, 257], [273, 273], [324, 324]], "194896": [[35, 35], [68, 69]], "194424": [[63, 63], [92, 92], [121, 121], [123, 123], [168, 173], [176, 177], [184, 185], [187, 187], [199, 200], [202, 203], [207, 207], [213, 213], [220, 221], [256, 256], [557, 557], [559, 559], [562, 562], [564, 564], [599, 599], [602, 602], [607, 607], [609, 609], [639, 639], [648, 649], [656, 656], [658, 658], [660, 660]], "194631": [[222, 222]], "193998": [[66, 113], [115, 119], [124, 124], [126, 127], [132, 137], [139, 154], [158, 159], [168, 169], [172, 172], [174, 176], [180, 185], [191, 192], [195, 196], [233, 234], [247, 247]], "194027": [[93, 93], [109, 109], [113, 115]], "194778": [[127, 127], [130, 130]], "195947": [[27, 27], [36, 36]], "195099": [[77, 77], [106, 106]], "196200": [[66, 67]], "194711": [[1, 4], [11, 17], [19, 19], [25, 30], [33, 38], [46, 49], [54, 55], [62, 62], [64, 64], [70, 71], [82, 83], [90, 91], [98, 99], [102, 103], [106, 107], [112, 115], [123, 124], [129, 130], [140, 140], [142, 142], [614, 617]], "195552": [[256, 256], [263, 263]], "195013": [[133, 133], [144, 144]], "195868": [[16, 16], [20, 20]], "194912": [[130, 131]], 
"194699": [[38, 39], [253, 253], [256, 256]], "194050": [[353, 354], [1881, 1881]], "194075": [[82, 82], [101, 101], [103, 103]], "194076": [[3, 6], [9, 9], [16, 17], [20, 21], [29, 30], [33, 34], [46, 47], [58, 59], [84, 87], [93, 94], [100, 101], [106, 107], [130, 131], [143, 143], [154, 155], [228, 228], [239, 240], [246, 246], [268, 269], [284, 285], [376, 377], [396, 397], [490, 491], [718, 719]], "195970": [[77, 77], [79, 79]], "195919": [[5, 6]], "194644": [[8, 9], [19, 20], [34, 35], [58, 59], [78, 79], [100, 100], [106, 106], [128, 129]], "196250": [[73, 74]], "195164": [[62, 62], [64, 64]], "194199": [[114, 115], [124, 125], [148, 148], [156, 157], [159, 159], [207, 208], [395, 395], [401, 402]], "194480": [[621, 622], [630, 631], [663, 664], [715, 716], [996, 997], [1000, 1001], [1010, 1011], [1020, 1021], [1186, 1187], [1190, 1193]], "196531": [[284, 284], [289, 289]], "195774": [[150, 150], [159, 159]], "196027": [[150, 151]], "193834": [[1, 35]], "193835": [[1, 20], [22, 26]], "193836": [[1, 2]]} </verbatim> the lumi sections really analyzed by your correctly terminated jobs <verbatim style="font-size: 13px" class="output"> $ cat crab_0_131014_160518/res/lumiSummary.json {"195947": [[27, 27], [36, 36]], "194108": [[95, 96], [119, 120], [123, 126], [154, 157], [160, 161], [166, 167], [172, 174], [176, 176], [185, 185], [187, 187], [196, 197], [211, 212], [231, 232], [238, 241], [249, 250], [277, 278], [305, 306], [333, 334], [438, 439], [520, 520], [527, 527]], "193998": [[66, 66], [69, 70], [87, 88], [90, 100], [103, 105], [108, 109], [112, 113], [115, 119], [124, 124], [126, 126], [132, 135], [139, 140], [142, 142], [144, 154], [158, 159], [168, 169], [172, 172], [174, 176], [180, 185], [191, 192], [195, 196], [233, 234]], "194224": [[94, 94], [111, 111], [257, 257]], "194424": [[63, 63], [92, 92], [121, 121], [123, 123], [168, 173], [176, 177], [184, 185], [187, 187], [207, 207], [213, 213], [220, 221], [256, 256], [599, 599], [602, 602], [607, 
607], [609, 609], [639, 639], [656, 656]], "194631": [[222, 222]], "196250": [[73, 74]], "194027": [[93, 93], [109, 109], [113, 115]], "194778": [[127, 127], [130, 130]], "195099": [[77, 77], [106, 106]], "194711": [[140, 140], [142, 142]], "195552": [[256, 256], [263, 263]], "195868": [[16, 16], [20, 20]], "194912": [[130, 131]], "194699": [[253, 253], [256, 256]], "195970": [[77, 77], [79, 79]], "194076": [[3, 6], [29, 30], [33, 34], [58, 59], [84, 87], [93, 94], [106, 107], [130, 131], [154, 155], [228, 228], [239, 240], [246, 246], [268, 269], [284, 285], [718, 719]], "194050": [[353, 354], [1881, 1881]], "195919": [[5, 6]], "194644": [[34, 35], [78, 79]], "195164": [[62, 62], [64, 64]], "194199": [[114, 115], [124, 125], [148, 148], [156, 157], [159, 159], [207, 208]], "196531": [[284, 284], [289, 289]], "196027": [[150, 151]], "193834": [[1, 24], [27, 30], [33, 34]], "193835": [[19, 20], [22, 23], [26, 26]], "193836": [[1, 2]]}
</verbatim>
and the missing lumis (the difference between the original lumiMask and the lumiSummary), which you can analyze by creating a new task that uses this file as its new lumi-mask file
<verbatim style="font-size: 13px" class="output">
$ cat crab_0_131014_160518/res/missingLumiSummary.json
{"190645": [[10, 110]], "190704": [[1, 3]], "190705": [[1, 5], [7, 65], [81, 336], [338, 350], [353, 383]], "190738": [[1, 130], [133, 226], [229, 355]],
.....
"208541": [[1, 57], [59, 173], [175, 376], [378, 417]], "208551": [[119, 193], [195, 212], [215, 300], [303, 354], [356, 554], [557, 580]], "208686": [[73, 79], [82, 181], [183, 224], [227, 243], [246, 311], [313, 463]]}
</verbatim>
To create a task that analyzes the missing lumis of the original lumiMask, you can use the missingLumiSummary.json file as the new lumiMask.json file in your crab.cfg.
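The missingLumiSummary.json file is simply the per-run set difference between the original lumi mask and the lumis your finished jobs actually processed. The sketch below (plain Python; the helper names are illustrative, not part of CRAB) makes that relation explicit, using run 193834 from the listings above:

```python
def expand(ranges):
    """Turn [[first, last], ...] lumi ranges into a flat set of lumi numbers."""
    return {l for first, last in ranges for l in range(first, last + 1)}

def compress(lumis):
    """Turn a set of lumi numbers back into sorted [[first, last], ...] ranges."""
    out = []
    for l in sorted(lumis):
        if out and l == out[-1][1] + 1:
            out[-1][1] = l          # extend the current contiguous range
        else:
            out.append([l, l])      # start a new range
    return out

def missing_lumis(mask, summary):
    """Per run: lumis present in the original mask but absent from the summary."""
    result = {}
    for run, ranges in mask.items():
        left = expand(ranges) - expand(summary.get(run, []))
        if left:
            result[run] = compress(left)
    return result

# Run 193834 from the listings above: the mask requested lumis 1-35,
# while the finished jobs processed [[1, 24], [27, 30], [33, 34]].
mask    = {"193834": [[1, 35]]}
summary = {"193834": [[1, 24], [27, 30], [33, 34]]}
print(missing_lumis(mask, summary))   # {'193834': [[25, 26], [31, 32], [35, 35]]}
```

`crab -report` does this bookkeeping for you; the sketch is only meant to show how the three JSON files relate.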
As before, you can decide the splitting you want, and by using the same publish_data_name the new outputs will be published in the same dataset as the previous task:
%SYNTAX{ syntax="sh"}%
[CMSSW]
lumis_per_job = 50
number_of_jobs = 4
pset = tutorial.py
datasetpath = /SingleMu/Run2012B-13Jul2012-v1/AOD
lumi_mask = crab_0_131014_160518/res/missingLumiSummary.json
output_file = outfile.root
[USER]
return_data = 0
copy_data = 1
publish_data = 1
storage_element = T2_xx_yyyy
publish_data_name = FanzagoTutGrid_data
[CRAB]
scheduler = remoteGlidein
jobtype = cmssw
%ENDSYNTAX%
<verbatim style="font-size: 13px" class="output">
$ crab -create -cfg crab_missing.cfg
crab: Version 2.9.1 running on Tue Oct 15 17:10:16 2013 CET (15:10:16 UTC)
crab. Working options:
scheduler remoteGlidein
job type CMSSW
server OFF
working directory /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/TUTORIAL/TUT_5_3_11/SLC6/crab_0_131015_171016/
crab: error detecting glite version
crab: error detecting glite version
crab: Contacting Data Discovery Services ...
crab: Accessing DBS at: http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet
crab: Requested (A)DS /SingleMu/Run2012B-13Jul2012-v1/AOD has 14 block(s).
crab: SE black list applied to data location: ['srm-cms.cern.ch', 'srm-cms.gridpp.rl.ac.uk', 'T1_DE', 'T1_ES', 'T1_FR', 'T1_IT', 'T1_RU', 'T1_TW', 'cmsdca2.fnal.gov', 'T3_US_Vanderbilt_EC2']
crab: Requested number of jobs reached.
crab: 4 jobs created to run on 200 lumis
crab: Checking remote location
crab: WARNING: The stageout directory already exists. Be careful not to accidentally mix outputs from different tasks
crab: Creating 4 jobs, please wait...
crab: Total of 4 jobs created.
Log file is /afs/cern.ch/user/f/fanzago/scratch0/TEST_RELEASE/TEST_PATC2/TEST_2_8_2/TUTORIAL/TUT_5_3_11/SLC6/crab_0_131015_171016/log/crab.log
</verbatim>
and submit them as usual. The created jobs will analyze part of the missing lumis of the original lumiMask.json file.
   * If you select total_number_of_lumis = -1 instead of lumis_per_job or number_of_jobs, the new task will analyze all the missing lumis.
#JustSe
---++ Run Crab retrieving your output (without copying to a Storage Element)
You can also run your analysis code without interacting with a remote Storage Element, retrieving the outputs instead to your workspace area (under the res directory of the project).
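The splitting parameters (lumis_per_job, number_of_jobs, or total_number_of_lumis = -1) all act on the set of lumis selected by the lumi mask, so you can estimate how many jobs a task will create by counting those lumis yourself. A small sketch (the function name is illustrative, not a CRAB API), fed with a few runs taken from the listings above:

```python
def count_lumis(mask):
    """Total number of lumi sections covered by a lumi-mask dictionary."""
    return sum(last - first + 1
               for ranges in mask.values()
               for first, last in ranges)

# A few runs copied from inputLumiSummaryOfTask.json above:
mask = {"193834": [[1, 35]], "193835": [[1, 20], [22, 26]], "193836": [[1, 2]]}
print(count_lumis(mask))   # 62 lumis -> 2 jobs with lumis_per_job = 50
```

This is consistent with the creation log above, where lumis_per_job = 50 together with number_of_jobs = 4 capped the task at 4 x 50 = 200 lumis ("4 jobs created to run on 200 lumis").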
Below is an example of the !CRAB configuration file, consistent with this tutorial:
%SYNTAX{ syntax="sh"}%
[CMSSW]
total_number_of_events = 100
number_of_jobs = 10
pset = tutorial.py
datasetpath = /RelValZMM/CMSSW_5_3_6-START53_V14-v2/GEN-SIM-RECO
output_file = outfile.root
[USER]
return_data = 1
[CRAB]
scheduler = remoteGlidein
jobtype = cmssw
%ENDSYNTAX%
With this crab.cfg in place you can redo the workflow as described before (apart from the publication step):
   * creation
   * submission
   * status progress monitoring
   * output retrieval (in this step you will directly retrieve the actual output produced by your pset file)
#MoreDoc
---++ Where to find more on !CRAB
   * [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCrab][CRAB Home]]
   * [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookCRAB3Tutorial][CRAB3 Tutorial]]
   * [[SWGuideCrabHowTo][HowTos]]
   * [[https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrabFaq][CRAB FAQ]]
   * [[https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookGridJobDiagnosisTemplate][WorkBookGridJobDiagnosisTemplate]]: Steps to identify the problems you experience with your grid analysis jobs.
   * [[https://hypernews.cern.ch/HyperNews/CMS/get/crabFeedback.html][CRAB mailing list]] where to send feedback and ask for support in case of job problems (please send us your crab.cfg file and the job stderr, stdout and log files, otherwise we are not able to provide support)
Note also that all CMS members using the Grid must subscribe to the [[https://hypernews.cern.ch/HyperNews/CMS/get/gridAnnounce.html][Grid Announcements CMS.HyperNews forum]].
#ReviewStatus
---++ Review status
| *Reviewer/Editor and Date (copy from screen)* | *Comments* | | Main.JohnStupak - 4-June-2013 | Review, minor revisions, updated real data dataset to an existing dataset | | Main.NitishDhingra - 2012-04-07 | See detailed comments below.
| | Main.MattiaCinquilli - 2010-04-15 | Update for tutorial | | Main.FedericaFanzago - 18 Feb 2009 | Update for tutorial | | Main.AndriusJuodagalvis - 2009-08-21 | Added an instance of url_local_dbs | %TWISTY{mode="div" showlink="Detailed comments 07-Apr-2012 " hidelink="Hide " firststart="hide" showimgright="%ICONURLPATH{toggleopen-small}%" hideimgright="%ICONURLPATH{toggleclose-small}%"}% Complete Review, Minor Changes. Page gives a good idea of doing a physics analysis using CRAB %ENDTWISTY% %RESPONSIBLE% Main.FedericaFanzago %BR%