ATLAS Northeast Tier 2/3

The ATLAS Northeast Tier 2 (NET2) cluster is housed at the Massachusetts Green High Performance Computing Center (MGHPCC). The Tier 3 cluster (NET3) used at UMass is a collaboration with Boston University and Harvard, and it benefits from sharing much of the same infrastructure as NET2.

New Users

First read the details and rules found at http://egg.bu.edu/net3/. For shared computing resources such as NET3 to function smoothly, it is important for all users to be responsible and to understand how their usage can impact others on the cluster.

To get access to the Slack workspace (https://northeasttier3.slack.com/), contact Rafael. Troubleshooting happens in a public way on Slack, providing an evolving FAQ for all NET3 users. This is also how we communicate with BU/Harvard, and you will need access to request an account.

In order to receive an account on the cluster, you must provide a public ssh key. If you do not have one, it is easy to generate.
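For example, on a Linux or macOS machine with OpenSSH installed, one common way to generate a key pair and print the public half (the part you send) is:

ssh-keygen -t ed25519          # accept the default location; optionally set a passphrase
cat ~/.ssh/id_ed25519.pub      # this is the public key to send; never share the private key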

Contact Augustine or Saul on the NET3 Slack workspace with your ssh public key and desired account name to receive an account.

Once you have an account you may connect with the following command: ssh username@umN.net3.mghpcc.org, where N specifies the node (a number between 1 and 3). No password is required, as authentication is handled via the ssh key. You may connect to the cluster from any network, as long as the machine you connect from holds the matching private key.
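Optionally, you can add an entry to ~/.ssh/config so that a short alias works from any machine holding the private key. The alias name, node number, and key filename below are just illustrative choices:

Host net3
    HostName um1.net3.mghpcc.org
    User username
    IdentityFile ~/.ssh/id_ed25519

After that, ssh net3 is enough to log in.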

Setting Up Your Environment

The first time you log in, you will need to edit the .bash_profile file in your home directory. To have the ATLAS software available to you upon login, add the following lines:

export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase

alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'

Each time you log in, you must then run the command setupATLAS.
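With that in place, a typical session might look like the following; which lsetup tools you need depends on your work, so treat these as illustrative:

setupATLAS
lsetup root        # a recent ROOT release from cvmfs
lsetup rucio       # rucio client for data management
lsetup git         # a recent git version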

Batch System

Manual (with qsub)

Batch jobs are submitted with the command qsub -q tier3 script.sh, where script.sh contains the commands you wish to be executed.
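A minimal script.sh might look like the sketch below. Batch jobs start in a fresh shell on the worker node, so the script sets up the ATLAS environment itself; the work directory and ROOT macro names are placeholders:

#!/bin/bash
# set up the ATLAS environment on the worker node
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
lsetup root
# run the actual workload (replace with your own commands)
cd /path/to/your/work/area
root -l -b -q myMacro.C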

More detailed information can be found in the brains of your fellow UMass ATLAS group members, as well as the qsub man pages.

Automatic (with EventLoop)

If part of your work uses EventLoop, you can use the pre-defined EL::GEDriver class to submit batch jobs with no extra effort. See this code for example usage.

IMPORTANT: you must include a line like the following to specify the 'tier3' queue:

job.options()->setString(EL::Job::optSubmitFlags, "-q tier3");

If you run into issues where code that runs interactively fails on the worker nodes, try adding -V to the above command, i.e.

job.options()->setString(EL::Job::optSubmitFlags, "-V -q tier3");

This will copy all of your current environment variables to the worker nodes.

Batch jobs with Parallel Processing

Submitting a batch job with parallel processing/multithreading may overwhelm the tier3 nodes if the number of threads specified is greater than the number of processors per node. Instead, such jobs must be submitted with a specified "parallel environment", which will distribute the threads over several nodes. This is done with the -pe smp option; a full example is

qsub -pe smp 40 -q tier3 MT_script.sh

Requesting a parallel environment will likely lead the user to create several jobs, so use discretion and do not submit an excessive number of jobs in total.
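Before submitting many (or very wide) jobs, it can help to check how busy the queue is. Assuming the cluster runs a Grid Engine-style scheduler (as the use of -pe and EL::GEDriver suggests), something like the following should work:

qstat -g c                 # per-queue summary of used/available slots
qstat -u '*' -q tier3      # all jobs currently in the tier3 queue
qstat -u $USER             # just your own jobs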

Sample Storage and RSEs

There are two NET2 rucio storage elements (RSEs) you may use: NET2_LOCALGROUPDISK and NET2_SCRATCHDISK. Use NET2_LOCALGROUPDISK for long-term storage of large samples. NET2_SCRATCHDISK may be used for short-term storage, but be aware that samples placed there will have a limited lifetime on the cluster.
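For example, to request a replica of a dataset at NET2 you can create a rucio rule from any machine with a grid proxy and the rucio client set up; the dataset name and account below are placeholders for your own:

lsetup rucio
voms-proxy-init -voms atlas
rucio add-rule user.yourname:user.yourname.mySample_v1 1 NET2_LOCALGROUPDISK   # one copy at NET2
rucio list-rules --account yourname                                            # check the rule status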

Finding requested samples locally

Samples requested to NET2 through rucio rules can be accessed locally under /gpfs3/{atlaslocalgroupdisk,atlasscratchdisk}, but they are sorted into a directory tree by md5sum, which is not human readable. A script (/home/net3/zmeadows/scripts/get_NET2_local_filepaths.py) exists to help with this. Given an RSE and a sample name, it will create a new folder in the current directory containing symlinks to all local NET2 files from the specified sample. Run the script without arguments for example usage. You may have to run 'lsetup rucio' and/or 'lsetup python' for this script to work properly.
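For example (the exact arguments are printed by the script itself):

lsetup rucio
lsetup python
python /home/net3/zmeadows/scripts/get_NET2_local_filepaths.py   # with no arguments, prints example usage
# then rerun it with an RSE and a sample name, as shown in that usage message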

Using Gitlab on NET3

To use git, simply run lsetup git. If you are trying to interact with a GitLab repository and encounter the error HTTP Basic: Access denied, you may need to generate a Kerberos token first. Run kinit and, when prompted, enter your CERN password. If you are having trouble cloning a repository, try selecting the KRB5 security protocol from the drop-down menu at the top of the repository page.
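A typical sequence is sketched below; it assumes your CERN username and a repository hosted on gitlab.cern.ch, and you should copy the actual KRB5 clone URL from the repository page rather than trusting the placeholder here:

lsetup git
kinit yourusername@CERN.CH                                          # enter your CERN password when prompted
git clone https://:@gitlab.cern.ch:8443/yourgroup/yourproject.git   # KRB5 clone URL from the repository page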

Python Virtual Environments

In order to work with custom python scripts/projects, one often needs to install and use libraries beyond the default python system libraries. To do so, begin by setting up a system-wide python installation using one of lsetup python, lsetup root, asetup, etc. To move from there to a user-specific setup, use virtualenv, a tool designed to solve exactly this problem of user-specific libraries and dependencies.

1. Make sure virtualenv is installed: python2.7 -m pip install --user virtualenv

2. Create the virtual environment inside your project directory (or wherever you like, but choose wisely): python2.7 -m virtualenv myenv, where myenv is your chosen name for the environment.

3. Activate the environment: source ./myenv/bin/activate

4. Now a special python environment exists within myenv and is activated. Try which python and which pip to confirm. You may now install python libraries with pip and import them. Just make sure you are using the python and pip from the virtual environment, and not a system-wide python2.7, for example.
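Once the environment is active, installs and imports stay inside it; for example (numpy is just an illustrative package):

which python                     # should point inside myenv/bin
pip install numpy                # installed into myenv, not the system python
python -c "import numpy; print(numpy.__version__)"
deactivate                       # leave the virtual environment when done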

