ARC site installation for ALICE (and non-ALICE) and WLCG

Requirements

  • Standard requirements for a WLCG and ALICE site
  • CentOS 7 on the head node and the worker nodes
  • Disk space shared between the worker nodes and the main server
  • Torque batch system (or another LRMS; Torque is used as the example here)

Links to ARC manuals:

Welcome to ARC Version 6!

ARC Data Services Technical Description

ARC Configuration Reference Document

Packages

NorduGrid ARC 6.5.0

Package installation

For ARC-CE

 # yum -y install epel-release \
     https://centos7.iuscommunity.org/ius-release.rpm \
     https://download.nordugrid.org/packages/nordugrid-release/releases/6/centos/el7/x86_64/nordugrid-release-6-1.el7.noarch.rpm \
     http://repository.egi.eu/sw/production/umd/4/centos7/x86_64/updates/umd-release-4.1.3-1.el7.centos.noarch.rpm
 # yum -y install http://linuxsoft.cern.ch/wlcg/centos7/x86_64/wlcg-repo-1.0.0-1.el7.noarch.rpm
 # yum -y install nordugrid-arc6-compute-element nordugrid-arc6-arex nordugrid-arc6-plugins-gridftpjob \
     ca-policy-egi-core wlcg-voms-alice wlcg-voms-atlas wlcg-voms-ops \
     nordugrid-arc6-plugins-lcas-lcmaps nordugrid-arc6-plugins-globus nordugrid-arc6-gridftpd nordugrid-arc6-arcctl

For Site-BDII

 # yum install bdii-config-site.noarch bdii

Additional patches

  • Memory request: use the PBS mem resource instead of vmem in the generated job script
 # diff -u /usr/share/arc/submit-pbs-job.save /usr/share/arc/submit-pbs-job
--- /usr/share/arc/submit-pbs-job.save  2020-04-28 01:20:43.012725899 +0300
+++ /usr/share/arc/submit-pbs-job       2020-04-28 01:21:44.942806477 +0300
@@ -217,7 +217,7 @@
 fi

 if [ ! -z "$memreq" ] ; then
-  echo "#PBS -l vmem=${memreq}mb" >> $LRMS_JOB_SCRIPT
+  echo "#PBS -l mem=${memreq}mb" >> $LRMS_JOB_SCRIPT
 fi

 gate_host=`uname -n`
  • Torque log scan filter (pull request submitted upstream)
In the file /usr/share/arc/scan-pbs-job, find this line:
exited_killed_jobs=`egrep '^[^;]*;0010;[^;]*;Job;|^[^;]*;0008;[^;]*;Job;[^;]*;Exit_status=|^[^;]*;0008;[^;]*;Job;[^;]*;Job deleted' ${lname} | tail -n+$(( $lines_skip + 1 ))`
and change it to:
exited_killed_jobs=`egrep '^[^;]*;(0010|16);[^;]*;Job;|^[^;]*;(00)?08;[^;]*;Job;[^;]*;Exit_status=|^[^;]*;(00)?08;[^;]*;Job;[^;]*;Job deleted' ${lname} | tail -n+$(( $lines_skip + 1 ))`
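
To see what the modified expression changes, the sketch below runs both patterns against a few made-up log lines. The sample lines, the host name and the count_matches helper are assumptions for illustration only; the two patterns are copied from the lines above.

```shell
#!/bin/bash
# Hypothetical Torque log lines (illustrative only, not taken from a real log).
# Assumed field layout: timestamp;record code;server;Job;job id;message
sample_lines='04/28/2020 01:20:43;0008;PBS_Server;Job;101.ce.example.org;Exit_status=0 resources_used.cput=00:00:01
04/28/2020 01:21:44;08;PBS_Server;Job;102.ce.example.org;Exit_status=271 resources_used.cput=00:10:00
04/28/2020 01:22:10;16;PBS_Server;Job;103.ce.example.org;Exit_status=0
04/28/2020 01:23:00;0008;PBS_Server;Job;104.ce.example.org;Job Run at request of root'

# Patterns as they appear in scan-pbs-job before and after the change.
old_pattern='^[^;]*;0010;[^;]*;Job;|^[^;]*;0008;[^;]*;Job;[^;]*;Exit_status=|^[^;]*;0008;[^;]*;Job;[^;]*;Job deleted'
new_pattern='^[^;]*;(0010|16);[^;]*;Job;|^[^;]*;(00)?08;[^;]*;Job;[^;]*;Exit_status=|^[^;]*;(00)?08;[^;]*;Job;[^;]*;Job deleted'

# Count how many sample lines a given extended regular expression selects.
count_matches() {
    printf '%s\n' "$sample_lines" | grep -Ec "$1"
}

echo "old pattern matches: $(count_matches "$old_pattern")"
echo "new pattern matches: $(count_matches "$new_pattern")"
```

With these sample lines the old pattern selects only the line with record code 0008, while the new pattern also selects the lines with the shorter codes 08 and 16. The "Job Run" line is ignored by both.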
 

ARC.conf

example arc.conf

Shared folders

  • shared_scratch
  • sessiondir
  • cachedir

Define the ARC account pool:

vo=alice
mkdir -p /etc/grid-security/pool/$vo
for u in ${vo}{001..50}; do echo $u >> /etc/grid-security/pool/$vo/pool; done  
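
Because the loop above appends with >>, running it twice duplicates the entries. A minimal sketch of a re-runnable variant follows. POOL_ROOT and write_pool are hypothetical names introduced here; by default the sketch writes to a temporary directory so it can be tried without root.

```shell
#!/bin/bash
# POOL_ROOT is a hypothetical override; set it to /etc/grid-security/pool
# to write the real location. The default is a throwaway temporary directory.
POOL_ROOT=${POOL_ROOT:-$(mktemp -d)}
vo=alice

# write_pool VO COUNT: (re)create the pool file with VO001 ... VO<COUNT>.
write_pool() {
    local vo=$1 count=$2 dir="$POOL_ROOT/$1"
    mkdir -p "$dir"
    # seq -f zero-pads the index to three digits; ">" overwrites the file.
    seq -f "${vo}%03g" 1 "$count" > "$dir/pool"
}

write_pool "$vo" 50
write_pool "$vo" 50   # a second run overwrites instead of appending
echo "entries in $POOL_ROOT/$vo/pool: $(wc -l < "$POOL_ROOT/$vo/pool")"
```

The resulting file holds the same names as the loop above (alice001 to alice050), one per line.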

Configure and start services (ARC-CE)

arcctl service enable -a
arcctl rte enable ENV/PROXY
arcctl rte enable -d ENV/GLITE
arcctl service start -a
systemctl start fetch-crl-cron
For ATLAS only:
arcctl rte enable -d APPS/HEP/ATLAS-SITE-LCG

Configure site-BDII (example files)

/etc/bdii/gip/glite-info-site-defaults.conf

/etc/bdii/gip/site-urls.conf

/etc/glite-info-static/site/site.cfg

Start BDII service

systemctl enable bdii
systemctl restart bdii

Firewall

  • Main server
    • 2811/tcp, 2170/tcp, 2135/tcp, 6443/tcp, 9000-12000/tcp
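
One way to open these ports, assuming the main server uses firewalld (the CentOS 7 default; this page does not say which firewall is in use), is sketched below. The firewall_commands helper and the APPLY switch are hypothetical names; by default the script only prints the commands.

```shell
#!/bin/bash
# Ports copied from the list above.
PORTS="2811/tcp 2170/tcp 2135/tcp 6443/tcp 9000-12000/tcp"

# Emit one firewall-cmd call per port, followed by a reload.
firewall_commands() {
    local port
    for port in $PORTS; do
        echo "firewall-cmd --permanent --add-port=$port"
    done
    echo "firewall-cmd --reload"
}

# Dry run by default; set APPLY=1 and run as root to execute the commands.
if [ "${APPLY:-0}" = "1" ]; then
    firewall_commands | sh -e
else
    firewall_commands
fi
```

Printing first makes it easy to review the rules before they are applied.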

Troubleshooting

Quick checks for common problems:

  • RTE: check that the RTEs required by your VO are enabled.
  • LRMS limits: check the limits (CPU time, memory, etc.) configured in your LRMS.
  • CVMFS: check that CVMFS works on the nodes.
  • Permissions and access for the work folders (on the nodes and the server).
  • Open ports for GridFTP (9000-12000/tcp, for example).
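
The folder permission check can be scripted roughly as follows. This is a sketch only: check_dir and DIRS are hypothetical names, and the default /tmp is a placeholder for the sessiondir, cachedir and shared_scratch paths from your own arc.conf. Run it as a pool account rather than root, since root passes the write test on most filesystems.

```shell
#!/bin/bash
# check_dir DIR: report whether DIR exists and is writable by the current user.
check_dir() {
    local dir=$1
    if [ ! -d "$dir" ]; then
        echo "MISSING  $dir"
        return 1
    elif [ ! -w "$dir" ]; then
        echo "READONLY $dir"
        return 1
    fi
    echo "OK       $dir"
}

# Placeholder list; replace with the work folders configured at your site.
DIRS=${DIRS:-/tmp}
for d in $DIRS; do
    check_dir "$d"
done
```

Running the same script on the server and on a worker node shows quickly whether a shared folder is missing or read-only on one side.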

-- AndreyZarochentsev - 2020-07-06

Topic revision: r14 - 2021-04-28 - AndreyZarochentsev
 