WorkBookStartingGrid (2021-03-11, MargueriteTonjes)
---+ 5.1 Chapter Overview -- Getting Started

%COMPLETE5% %BR%
[[#ReviewStatus][Detailed Review status]]

#GoAls
---++ Goals of this page:

This page is intended to provide you with an overview of this entire Chapter, pointing out which parts are required reading to get physics analysis done on the CMS distributed analysis infrastructure, and which are meant to provide intellectual stimuli and broader context.

---++ Contents
   * [[#InTro][Introduction]]
   * [[#BasicGrid][Basic requirements for using the Grid]]
   * [[#ObtainingCert][Obtaining and Installing your Certificate]]
   * [[#DnDnMapping][Connecting your certificate to your account]]
   * [[#UsingGridCert][Using your grid certificate]]
   * [[#LcgUi][Accessing an LCG User Interface]]
      * [[#PreInst][Preinstalled]]
   * [[#ReviewStatus][Review status]]

#InTro
---++ Introduction

CMS uses a globally distributed computing system for data analysis. The present Chapter has two objectives:
   1 Provide you with all the information required to use the global system for physics data analysis.
   1 Provide you with background information and context, so that you start gaining some appreciation of the complexity of this system.

Those who really don't care about how things work, and just want to get their analysis off the ground, may want to skip all the material provided in the interest of the second objective. The present section is meant to make this easy for you by providing guidance on what to skip. However, let us warn you upfront that you will eventually need that more detailed background knowledge in order to understand, and react to, the failures of the distributed system that you will inevitably be exposed to while using it.
*The complexity of this global system guarantees that an educated and intelligent user will often be more effective in getting stuff done than somebody who knows nothing but the basics.*

---+++ Roadmap for Chapter 5

As a new user, you should *read the "must read" chapters in the order listed*, as concepts introduced in one will often be used in the next. This is especially true for Chapters 5.4, 5.5, and 5.6.
   * Chapter 5.1 is a *must read*. It provides not only this roadmap, but also a discussion of the requirements to get started.
   * Chapter 5.2 "Grid Computing Context" can be skipped by the impatient. It provides a general introduction to "grid" computing terms.
   * Chapter 5.3 "Analysis Workflow" can be skipped, except for the very beginning of it. It explains how CRAB works under the hood, at least conceptually.
   * Chapter 5.4 "Locating Data" is a *must read*. It explains how to find the datasets to run on and how to pull a single file to your desktop, so you can try out your executable interactively and do the bulk of your debugging.
   * Chapter 5.5 "Data Quality Monitor" can be skipped initially. It explains how to refine the data finding process to include Data Quality information.
   * Chapter 5.6 "Data Analysis with CRAB" is a *must read*. It explains how to use CRAB, the tool for doing data analysis on the globally distributed CMS data analysis infrastructure.
   * Chapter 5.7 "Data Analysis with CMS Connect" is a *must read*. It explains how to use CMS Connect, the service complementary to CRAB that runs user-defined scripts via condor, for late-stage data analysis that does not depend on cmsRun (the CMSSW executable), e.g. making histograms and plots, analyzing trees, etc.
   * Chapter 5.8 "Dashboard Job Monitor" is a *must read*. It explains how to monitor the status of your jobs.
   * Chapter 5.9 "The role of the T2s" can be skipped initially. It provides essential background to understand the disk space organization at T2s in CMS. As T2s are the places where the vast majority of data analysis in CMS takes place, it will eventually be vital for you to read this chapter carefully.
   * Chapter 5.10 "Transferring Data" can be skipped initially. Once you have read Chapter 5.9, you will understand how disk space is managed, and can then graduate to using it in style. This Chapter explains how to request datasets to be moved to T2s and T3s. Anybody in CMS can make such requests.
   * Chapter 5.11 "Data Organization Explained" can be skipped initially. It explains a variety of terms that CMS uses to describe how data is organized and managed.
   * Chapter 5.12 "Processing by Physics Groups". It describes the privileges of priority users and the conveners' responsibilities regarding them.
   * Chapter 5.13 "cmssh tutorial". A very useful tool to easily find your favorite data from the command line, copy files transparently without knowing the Physical File Name location, etc.

#BasicGrid
---++ Basic requirements for using the Grid

The remainder of this page deals with the essentials you need before you can even start doing anything on the globally distributed CMS data analysis infrastructure. Note that initial testing and workbook exercises can be done on an LXPLUS machine (or [[WorkBookRemoteSiteSpecifics][another machine, properly configured]]), but proper analysis jobs and Monte Carlo production should be submitted to the globally distributed CMS data analysis infrastructure.

Note: We will sometimes use the word "Grid" as a synonym for "globally distributed CMS data analysis infrastructure", for obvious reasons of brevity.
The basic requirements for using the Grid resources are:
<!--
   * CERN CMS computing access ([[WorkBookGetAccount][LXPLUS account]] and [[http://cmsdoc.cern.ch/comp/comp_quick_guide.html#AccountCreation][CMS registration]])
-->
   * a local computer, either a personal computer or a work cluster;
   * being a member of CMS ([[http://cmsdoc.cern.ch/comp/comp_quick_guide.html#AccountCreation][CMS registration]]);
   * a digital certificate from one of the Certification Authorities (CA) recognised by WLCG (for example the [[https://ca.cern.ch/ca/][CERN CA]]);
   * being a member of the [[SWGuideLcgAccess][CMS Virtual Organisation (CMS VO)]];
   * correct mapping of your grid certificate to your username, see [[#DnDnMapping][below]];
   * access to a [[#LcgUi][grid User Interface]].

#ObtainingCert
---++ Obtaining and installing your Certificate

To *obtain your certificate and join the CMS VO*, follow the steps on [[SWGuideLcgAccess][this page]]. <br> That same page also has pointers to troubleshooting help if needed.

Note that it can take a few days for the certificate to be issued. The CA will give you instructions on how to load your certificate into your browser.

To *set up the certificate* on the user interface you will work from, you should:
   * Export the certificate from your browser to a file in p12 format. How to export the certificate is very browser dependent. It will be something like Edit or Tools -> Preferences or (Internet) Options -> Advanced -> Security or Encryption -> View Certificates -> Your Certificates. In modern Firefox you should “backup” rather than “export” the certificate. You can find more instructions and hints for various browsers in [[https://ca.cern.ch/ca/][this CERN CA help page]]. You can give any name to your p12 file (in the example below the name is =mycert.p12=).
   * Place the p12 certificate file in the =.globus= directory of your home area. If the =.globus= directory doesn't exist, create it (=mkdir -p= does nothing if it is already there).
<verbatim>
cd ~
mkdir -p .globus
cd ~/.globus
mv /path/to/mycert.p12 .
</verbatim>
   * Execute the following shell commands:
<verbatim>
rm -f usercert.pem
rm -f userkey.pem
openssl pkcs12 -in mycert.p12 -clcerts -nokeys -out usercert.pem
openssl pkcs12 -in mycert.p12 -nocerts -out userkey.pem
chmod 400 userkey.pem
chmod 400 usercert.pem
</verbatim>
   * When the =openssl= commands prompt for a password, enter the *same password* that you chose while importing the certificate in your browser. The command that writes =userkey.pem= will also ask you to "Enter PEM pass phrase". You may choose to keep it the same, so as to avoid password confusion :-)
   * Verify that it all works by executing (n.b. you may need to set up a [[#LcgUi][grid UI]] to execute this command, see below):
<verbatim>
voms-proxy-init --rfc --voms cms -valid 192:00
</verbatim>
   * Ignore a (possible) message about not being able to find a =.glite/vomses= directory.

Some CAs provide the =usercert.pem= and =userkey.pem= files, and the user then has to produce the p12 file to be imported into the browser. To convert the =usercert.pem= and =userkey.pem= files into a browser certificate =mycert.p12=, do the following:
<verbatim>
openssl pkcs12 -export -in usercert.pem -inkey userkey.pem -out mycert.p12 -name "my browser cert for 2014"
</verbatim>

To do CMS analysis on WLCG Grid resources, you will further require:
   * [[WorkBookSetComputerNode][A CMS analysis software environment]] set up on your local computer.
   * Some sample datasets with local access (on a hard disk or other mass data storage system) so you can test your analysis code interactively before submitting your jobs on the grid. These local datasets are frequently subsets of one of the main CMS datasets resulting from a first-pass analysis job ([[WorkBookDataFormats#Reconstructed_RECO_Data_and_Anal][RECO or AOD]]).
   * To stage user data back to CERN with a non-CERN certificate you need to [[https://ca.cern.ch/ca/Certificates/MapCertificate.aspx][map it to your CERN account]] (not yet enforced).

All CMS members using the Grid may benefit from subscribing to the [[https://hypernews.cern.ch/HyperNews/CMS/get/gridAnnounce.html][Grid Announcements CMS.HyperNews forum]].

#DnDnMapping
---++ Connecting your certificate to your account

CMS analysis with CRAB requires that the user's authentication credential is mapped to a globally unique username. Currently authentication is based on the grid certificate: the user is identified by the so-called !DN, and the CERN primary computing account is used as the username. If you use a certificate from CERN, this operation is fully transparent and you need do nothing but be aware of what your CERN username is. If you are using a grid certificate issued by a Certification Authority other than the CERN CA, then read and follow the instructions in the [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/UsernameForCRAB#Adding_your_DN_to_your_profile][Username for CRAB]] page to make sure your certificate is correctly mapped to your account.

#UsingGridCert
---++ Using your grid certificate

Each day you wish to use [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookXrootdService][xrootd]], [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCrab][CRAB]], [[http://docs.uscms.org][CMS Connect]], or similar technologies, you will need to authenticate with your grid certificate using the command:
<verbatim>
voms-proxy-init --rfc --voms cms -valid 192:00
</verbatim>

#LcgUi
---++ Grid User Interface

The recommended way to submit jobs on the Grid is to use !CRAB. It will allow you to access both EGI and OSG Grid resources in a fully transparent way. The minimal client as distributed by OSG, or the one pre-installed on LXPLUS, will do.

#PreInst
---+++ Preinstalled
   * At CERN:
      * LXPLUS already has the grid commands needed for CRAB; there is no need to issue any setup command.
   * Other affiliated sites and institutions may provide generally available WLCG/OSG software for grid tools (see WorkBookRemoteSiteSpecifics to look for information for your institution).

#ReviewStatus
---++ Review status

<!-- Add your review status in this table structure with 2 columns delineated by three vertical bars; please DON'T remove the spaces between the vertical bars and the asterisks in the first line; it would break the PDF rendering! -->

| *Reviewer/Editor and Date (copy from screen)* | *Comments* |
| Main.MargueriteBeltTonjes - 2020-06019 | remove GLiteUI/Install Your Own because it's SL5 and obsolete, added -valid to first use of voms-proxy-init |
| Main.StefanoBelforte - 2020-02-28 | remove references to SiteDB, remove some old obsolete things|
| Main.StefanoBelforte - 20 Sep 2014 | review and update grid documentation, remove duplications|
| Main.StefanoBelforte - 14 Sep 2014 | update reference to CERN Grid CA page|
| Main.StefanoBelforte - 20 Aug 2014 | remove reference to gLite UI |
| Main.JohnStupak - Mar 2013 | review with minor changes |
| Main.NitishDhingra - 28-Mar-2012 | See detailed comments below |
| Main.StefanoBelforte - 22-Dec-2009 | Complete Expert Review, minor changes|
| Main.FrankWuerthwein - 04-Dec-2009 | Complete Reorganization 1st draft ready for review|
| Main.AndreaSciaba - 30 Nov 2009 | Minor corrections (removed or replaced broken links) |
| Main.SimonMetson - 30 Apr 2009 | Updated the link to request a certificate (after a question from a user advice from Andrea Sciaba |
| Main.MattiaCinquilli - 24 Nov 2008 | added explicit commands to setup the certificate |
| Main.AndreaSciaba - 24 Jan 2008 | review with updated links and minor changes |
| Main.StefanoLacaprara - 16 Nov 2006 | review with minor changes |
| Main.AnneHeavey - 03 Aug 2006 | fairly substantial edits to Grid info |

%TWISTY{mode="div" showlink="Detailed comments 28-Mar-2012 " hidelink="Hide " firststart="hide" showimgright="%ICONURLPATH{toggleopen-small}%" hideimgright="%ICONURLPATH{toggleclose-small}%"}%
Review with minor additions in the grid certificate set-up instructions. The page accomplishes its goal.
%ENDTWISTY%

%RESPONSIBLE% Main.StefanoBelforte %BR%
%REVIEW% Main.David L Evans - fill in date when done -
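
As a recap of the certificate installation steps earlier on this page, the presence and permissions of the two PEM files can be checked before asking for a proxy. The following is only a sketch: =check_grid_cert_files= is a hypothetical helper name, not part of any grid toolkit, and it assumes the files should have exactly the mode 400 set by the =chmod= commands in the certificate section.

```shell
# Hypothetical helper (not part of any grid toolkit): succeeds only if
# usercert.pem and userkey.pem both exist in the given directory
# (default: ~/.globus) with exactly the 400 permissions used on this page.
check_grid_cert_files() {
    dir="${1:-$HOME/.globus}"
    for f in usercert.pem userkey.pem; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
        # find prints the path only when the mode is exactly 400
        if [ -z "$(find "$dir/$f" -perm 400)" ]; then
            echo "wrong permissions (expected 400): $dir/$f" >&2
            return 1
        fi
    done
    return 0
}
```

It could be used, for example, as =check_grid_cert_files && voms-proxy-init --rfc --voms cms -valid 192:00=, so that the proxy request is only attempted once both files are in place.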
Topic revision: r74 - 2021-03-11 - MargueriteTonjes
Copyright © 2008-2021 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.