---+ *Neutrino Platform EHN1 Cluster Computing Model For Use By !ProtoDUNEs*

---++ Overview

---+++ Definitions

   * *Data Taking Operations* - running during beam time (beam-generated and cosmic-ray-generated data are taken at the same time)
   * *Cosmic Ray Operations* - sustained running taking cosmic rays
   * *Commissioning* - taking ready software and hardware and getting it ready for Operations
   * *Analysis Operations Phase* - after data taking and cosmic ray operations

Offline processing, operations, and analysis activities take place in each of these phases.

---++ Introduction to the EHN1 Neutrino Platform Cluster

---+++ Compute resources

For more information please follow this [[https://twiki.cern.ch/twiki/bin/view/CENF/EHN1Computing#Computing_Infrastructure][link]].

Goal:

| | *# of Racks* | *# of Nodes/Rack* | *# of Nodes* |
| Odd Numbered | 6 | 23 | 138 |
| Even Numbered Full | 4 | 24 | 96 |
| Rack08 | 1 | 18 | 18 |
| *Total* | 11 | | 252 (2016 cores) |

Dec 2018:

| Cores total now | 1847 | |
| Hosts up | 232 | |
| Hosts down | 20 | |
| Cores final | 2016 | some trays in Rack08 are missing nodes |

---+++ Validation and Metrics

Validation and metrics for use of the cluster are useful and need to be defined and collected:

   * Nagios and [[https://twiki.cern.ch/twiki/bin/view/CENF/EHN1Computing#Ganglia_Cluster_Monitor][Ganglia]] plots are presented on the web regularly.
   * !ProtoDUNE-DP/NP02 runs benchmarks after any change in the software.
   * !ProtoDUNE-SP/NP04 depends on the centralized DUNE Continuous Integration system. It would be useful if the [[https://twiki.cern.ch/twiki/bin/view/CENF/EHN1Computing#Specs][EHN1 Cluster]] were included as a test site for this.

---++ EHN1 Cluster Configuration and Usage Model

---+++ Accounts

During Beam Time and Cosmic Ray operations there will be an =np04-dataprod= service account (and e-group) as a privileged account, with only a few administrative users. Each account will have an associated description of its use. Outside of these times, job queues can be added to allow additional users, mapped through the DUNE VO, to use the resources.

---++ NP02 Model of Use

---++ NP04 Model of Use

NP04 will let NP02 use its share of the NP EHN1 Cluster during NP02 Commissioning, Beam, and Cosmic data taking. (An agreement is in progress between the two experiments for an equivalent number of slots on the Tier-0 to be made available from the NP02 share for use by NP04.) Given current input from NP02, the dates during which NP04 relinquishes use of the EHN1 NP cluster will be from June 1 to Dec 1, 2018. (PLEASE CHECK/ADD here..)

Given current HEPSPEC benchmarks, we expect to discuss a ratio of about 1 Tier-0 core-day to 3 EHN1 NP Cluster core-days for the equivalent capacity made available from the DP share on the Tier-0; a back-of-envelope sketch of this conversion follows below.

Outside of these dates, NP04 has asked the DUNE Software and Computing (S&C) group to include the EHN1 Computing Cluster as part of the transparently usable distributed offline facility. NP04 will work with NP and DUNE S&C on how best to accomplish this and make use of the EHN1 NP Cluster resource. (PLEASE CHECK if this is true: at the moment the Torque job management system currently preferred by NP02 is not supported as part of the DUNE S&C distributed offline facility.)
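As an illustration of the proposed exchange rate, the Python sketch below re-derives the cluster totals from the rack layout quoted above and converts EHN1 NP Cluster core-days into Tier-0 core-day equivalents. This is a minimal sketch, not an agreed accounting tool: the 1:3 ratio is still under discussion, and the 8 cores/node figure is an assumption inferred from the 252-node / 2016-core totals in the table.

<verbatim>
# Back-of-envelope accounting for the EHN1 NP Cluster numbers on this page.
# ASSUMPTIONS: 8 cores/node (inferred from 252 nodes -> 2016 cores) and a
# 1 Tier-0 : 3 EHN1 core-day ratio, which is still under discussion.

CORES_PER_NODE = 8
TIER0_TO_EHN1 = 3  # 1 Tier-0 core-day ~ 3 EHN1 NP Cluster core-days

# Rack layout as listed in the "Goal" table above.
racks = {
    "Odd Numbered":       {"n_racks": 6, "nodes_per_rack": 23},
    "Even Numbered Full": {"n_racks": 4, "nodes_per_rack": 24},
    "Rack08":             {"n_racks": 1, "nodes_per_rack": 18},
}

total_nodes = sum(r["n_racks"] * r["nodes_per_rack"] for r in racks.values())
total_cores = total_nodes * CORES_PER_NODE
print(f"nodes: {total_nodes}, cores: {total_cores}")  # nodes: 252, cores: 2016

def tier0_equivalent(ehn1_core_days: float) -> float:
    """Convert EHN1 NP Cluster core-days into Tier-0 core-day equivalents."""
    return ehn1_core_days / TIER0_TO_EHN1

# Example: the full cluster running for one day.
print(f"{total_cores} EHN1 core-days ~ "
      f"{tier0_equivalent(total_cores):.0f} Tier-0 core-days")
</verbatim>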
---++ Operations and Training - initial thoughts

The Joint Data Challenge (currently scheduled for April 9, 2018) includes a component for Operations. The EHN1 NP Computing Cluster is expected to be part of that team, which will work to define and then exercise a model for support and operations for the data taking.

The team is currently coordinated by the DUNE S&C Coordinators, Andrew Norman and Heidi Schellman. DUNE S&C provides training materials and sessions at Collaboration meetings; input to them is needed for the EHN1 NP Computing Cluster. As input: training is needed on how to use the overall distributed computing infrastructure, including all clusters. This group and task will be responsible for release preparation and for supporting data preparation and physics analysis - e.g. some extra cores and/or nodes would be useful for analysis, and it must be decided which release to use and how to include it in the input path, along with similar individual working-order items:

   * Where is the s/w infrastructure - where are the releases?
   * Where are the validations?
   * Everything needs to be automatic, so it can be done with the press of a button.

%ICON{arrowbup}% [[https://twiki.cern.ch/twiki/bin/view/CENF/EHN1Computing][Back to EHN1 Computing Main Page]]

%ICON{arrowbup}% [[https://twiki.cern.ch/twiki/bin/view/CENF/Computing][Back to CENF-Computing Main Page]]