Documentation for using Abisko cluster for the SNIC 2019/3-68 computing project

Introduction

This page contains instructions for how to use the computing resources granted via the SNIC 2019/3-68 project "Particle Physics at the High Energy Frontier". The project grants 100k CPUh/month for the KTH, Lund, Stockholm and Uppsala ATLAS groups, and the Lund ALICE group. Below is the ATLAS-specific documentation for how to use these resources on the Abisko cluster, which is part of HPC2N.

Collecting this documentation is a work in progress and contributions are most welcome, in particular after gaining experience with the job submission tools, etc. Please feel free to use the egroup mentioned below for discussions.

For old info about the previous project that was active for a year from Feb 1, 2018, see SNIC 2018/3-38.

Getting started

This section explains what you need to do to get set up with an account at HPC2N, check your affiliation with the project, and set up your environment. The subsections below give instructions specific to ATLAS and ALICE users.

  • Check your affiliation with the project by running projinfo on a login node, which should show something like:
        t-an01:~ > projinfo
        Project info for all projects for user: cohm
        Information for project SNIC2019-3-68:
            Christian Ohm: Particle Physics at the High Energy Frontier
            Active from 20180130 to 20190201
            SUPR: https://supr.snic.se/project/SNIC2019-3-68
        Allocations:
            abisko:         100000 CPUhours/month
        Usage on abisko:
            N/A

  • Add the following to your login script (e.g. .bashrc) to automatically have setupATLAS available when you log in:
        # ATLASLocalRootBase
        export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
        alias setupATLAS='. ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'

  • TODO:
        ALICE commands

After you log out and back in, you should be set up to use the resources and experiment-specific tools.

Quick test of the setup on the interactive node

After logging out and back in, you can call setupATLAS to set up all the standard ATLAS software from cvmfs, just like on lxplus. For example, to set up the analysis release 21.2.16:

        setupATLAS -c slc6
        asetup AnalysisBase,21.2.16,here

The -c slc6 switch directs setupATLAS to start up an SLC6 Singularity container. Once in the container, all the standard ATLAS software will work as if running on SLC6. NB! Once inside the container, the batch system commands no longer work, as they are not compatible with SLC6.

In order to use the batch system from within the Singularity container, use -c slc6+batch instead, which also enables the use of the batch system tools inside the container. For more information, see here.
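For example, a quick test combining the container with batch access might look like this (a minimal sketch, assuming the same analysis release as above):

        setupATLAS -c slc6+batch
        asetup AnalysisBase,21.2.16,here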

  • TODO:
        ALICE commands

Submitting jobs to the batch system

The interactive login nodes should not be used for heavy computing jobs; these should instead be run via the SLURM-based batch system on Abisko at HPC2N. More complete instructions are available here, but below are some basic commands and simple examples of how to run jobs interactively and how to submit jobs to the batch system.

Examples of useful commands:

  • Show your jobs: squeue -u cohm (replace cohm with your user name)
  • Cancel a running job: scancel <jobid> (the job ID is listed by squeue)
  • Submit a job to the batch system: sbatch yourjob.slurm, where yourjob.slurm is a file describing your job (see below)
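A typical minimal workflow then looks like the sketch below (job.sh is the example batch script described further down):

        sbatch job.sh          # prints "Submitted batch job <jobid>"
        squeue -u $USER        # check the status of your jobs
        scancel <jobid>        # cancel the job if something went wrong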

Running jobs interactively: reserve one batch node with four processors to work on for one hour:

  1. Reserve the node: salloc -A SNIC2019-3-68 -N 1 -n 4 --time=1:00:00
  2. Run a program on the allocated node interactively: srun -n 1 my_program (this blocks the shell until the program is done)
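Putting these together, an interactive session might look like the following sketch (my_program is just a placeholder for your own executable):

        salloc -A SNIC2019-3-68 -N 1 -n 4 --time=1:00:00
        srun -n 4 my_program   # run on the allocated node using all four tasks
        exit                   # leave the allocation and release the node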

Submit a program to the batch system: in order to submit a job to a node using ATLAS software, a script like the following can be used, with example SBATCH parameters:

  #!/bin/bash
  # project account to charge
  #SBATCH -A SNIC2019-3-68
  # name of the job (makes it easier to find in the status lists)
  #SBATCH -J TestJob
  # name of the output file
  #SBATCH --output=test.out
  # name of the error file
  #SBATCH --error=test.err
  # number of tasks
  #SBATCH -n 1
  # the job can use up to 30 minutes to run
  #SBATCH --time=00:30:00

  export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase

  # scratch and home areas on PFS (replace <initial>/<username> with your own)
  export ALRB_SCRATCH="/pfs/nobackup/home/<initial>/<username>/"
  export HOME="/pfs/nobackup/home/<initial>/<username>/"

  # commands to run inside the container (see below)
  export ALRB_CONT_RUNPAYLOAD="what you want to do"
  source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh -c slc6

where the ALRB_CONT_RUNPAYLOAD variable contains the commands to be executed after the Singularity container has started (needed since we don't get an interactive prompt on the worker nodes); a concrete example payload is sketched after the submission command below. Assuming that the above script is saved as job.sh, it can be submitted by running:

  sbatch job.sh
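As an illustration, the payload below would set up the analysis release used earlier and then run a script of your own (runMyAnalysis.sh is just a placeholder name, not something provided by the setup):

  export ALRB_CONT_RUNPAYLOAD="asetup AnalysisBase,21.2.16,here; source runMyAnalysis.sh"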

Storage

For more exhaustive info about the file systems at HPC2N, see this page. The most important parts are summarized here:
  • AFS: Your home directory is on AFS and backed up regularly, and is accessible under /afs/hpc2n.umu.se/home/u/user/ (very similar to the CERN AFS; user should be replaced with your user name, and u with its first letter). One difference is that this is accessible under /home/u/user/ on the nodes (and that's also where $HOME points to). Your AFS area is not accessible from batch jobs, so for those you should use PFS. You can check your quota using fs lq.
  • PFS: This is the file system you should use for anything that needs to be read or written by your batch jobs; your area can be found under /pfs/nobackup/home/u/user (see the short sketch after this list).
  • Quotas: you can check your quotas on the above two file systems from any directory like this:
        t-an01:~ > quota
        Disk quotas for cohm:
        Filesystem            usage       quota   remark
        /home/c/cohm       176.00MB      1.91GB   9.0% used
        /pfs/nobackup           4KB      2.00TB   0.0% used
          file count              1     1000000   0.0% used
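A minimal sketch of preparing a work area on PFS and submitting a job from there (the directory name myanalysis is just an example, and c/cohm should be replaced with your own initial and user name):

        mkdir -p /pfs/nobackup/home/c/cohm/myanalysis
        cd /pfs/nobackup/home/c/cohm/myanalysis
        sbatch job.sh   # the test.out/test.err files from the example script will appear here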
For info about options for more substantial storage, please see Swestore (Christian has not investigated this further yet but is happy to discuss).

Getting help

The egroup atlas-sweden-analysis should be used to report and discuss problems using these resources. If you haven't already, please join here. If you have issues registering with SNIC or joining the project, please contact Christian Ohm directly (PI of the project).

Known issues

There are no known issues with using ATLAS software at Abisko at the moment, but since the cluster doesn't have an OS based on SLC6 there could be problems (see ATLAS s/w readiness for CentOS7 here). Please contact the egroup above if you have problems.

Documentation

-- ChristianOhm - 2019-02-01
