ADCContainersDeployment

Introduction

Containers are a solution for several operational problems on the grid WNs.

  • Installation of an OS different from SL/RHEL/CentOS
  • OS upgrades don't need coordination with experiments anymore
  • Minimal installation on the nodes if sites prefer
  • Allows experiment to run tests with specific software or setups
  • May offer another approach to software distribution to sites that don't support CVMFS
  • Simplifies running on HPC resources
  • Payload isolation
  • Data and Software preservation
  • User containers deployment

Operating System

  • Most of the features work only on the CentOS7 family of OSes, and not all features work in every version:
    • CentOS7.2 or earlier: can work with some particular site setups. Not recommended!
    • CentOS7.3: overlay (allows bind paths when the path doesn't exist in the image) is enabled (see Required Parameters)
    • CentOS7.4: possibility to run singularity without setuid. It requires some changes to the /etc/default/grub file that can be added at installation time. More will be added when the situation is better defined. Still to be tested by ATLAS (some testing done by CMS)
    • CentOS7.6: user namespaces become production-ready but still need to be enabled
    • CentOS8: user namespaces are enabled by default

Pilot setup

Containers will be enabled with Pilot2. Pilot2 is single-process and multi-threaded, so it is only possible to execute individual commands in containers: payloads, stage-in, stage-out and other commands will be executed in different containers. Pilot2 calls singularity through ALRB, which wraps the image and sets up the environment for the job to run. For user containers it instead simply runs the container without doing anything else; as of January 2019 this also works with pilot1 at CentOS7 sites with singularity installed.

14/8/2019: pilot2+singularity are currently being commissioned; about 230k slots are running containerized payloads.

AGIS setup

  • AGIS container parameters: container_type and container_options are used by the wrapper and pilot to decide what to do.

    Example

    container_type: "singularity:pilot" 
    container_options: "-B /mnt/lustre_2 --nv"
    
    These can be both left empty and the wrapper and the pilot will act accordingly running a standard job.

    If only container_type is filled, ALRB will try to guess the bind points from the environment ($PWD, $HOME, $TMPDIR). This is the best way for sites to run, as it doesn't require the site to edit the PQ with its internal details. However, if you want to add options you need to set container_options; the additional options will then be passed to singularity via ALRB (see the sketch below).
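
    For illustration only: with the example values above, the command eventually assembled via ALRB would be roughly of this kind (the exact invocation is built by ALRB/pilot and will differ; <image> and <payload command> are placeholders):

    singularity exec -B $PWD -B $HOME -B $TMPDIR -B /mnt/lustre_2 --nv <image> <payload command>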

Singularity configuration at sites

ATLAS can run with different singularity configurations; in particular, if underlay is enabled it can run with either setuid or non-setuid executables. Below are three recommended setups, a way to test them and, after that, some important distinctions in case you need an explanation.

Recommended setup to use singularity from /cvmfs with user name spaces and no setuid

This is the ATLAS preferred method, even if singularity is locally installed for other users. This setup also became a WLCG/EGI/OSG recommendation in May 2020.

  • CentOS7.6 (or kernels >=3.10-957) or later
  • Enable user namespaces
    echo "user.max_user_namespaces = 15000" > /etc/sysctl.d/90-max_user_namespaces.conf
    sysctl -p /etc/sysctl.d/90-max_user_namespaces.conf
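
A quick sanity check of this setup, as an unprivileged user, is to run the CVMFS-provided singularity directly (the PATH below is the same one used in the Test section further down):

export PATH="/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin:$PATH"
singularity exec docker://centos:7 cat /etc/redhat-release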

Recommended setup if you want to install singularity with user namespaces and no setuid

  • CentOS7.6 (or kernels >=3.10-957) or later
  • yum -y install singularity
  • Enable user namespaces
    echo "user.max_user_namespaces = 15000" > /etc/sysctl.d/90-max_user_namespaces.conf
    sysctl -p /etc/sysctl.d/90-max_user_namespaces.conf
  • Singularity config
    allow setuid = no
    enable overlay = no
    enable underlay = yes

Please DO NOT install only the runtime. Install the whole package: we need to build the sandbox for some workflows.
Please DO NOT limit the owner or the source of the images: it will cripple users who want to use their own images.
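
A quick way to check this configuration as an unprivileged user is to bind a host directory to a path that does not exist in the image, which exercises the underlay mechanism (the paths below are arbitrary examples):

mkdir -p /tmp/bindtest
singularity exec -B /tmp/bindtest:/srv/bindtest docker://centos:7 ls -d /srv/bindtest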

Recommended setup if you want to install singularity with setuid and no user namespaces

  • CentOS7.3 or later
  • yum -y install singularity
  • Singularity config: rpm default

Please DO NOT install only the runtime. Install the whole package: we need to build the sandbox for some workflows.
Please DO NOT limit the owner or the source of the images: it will cripple users who want to use their own images.

Test

You can check that everything is working by running the containers TRF with a hello world as a non-privileged user:

wget https://raw.githubusercontent.com/PanDAWMS/panda-wnscript/master/src/runcontainer/runcontainer
python ./runcontainer -p "echo 'Hello World'" --containerImage docker://centos:7
aforti@vm26>python ./runcontainer -p "echo 'Hello World'" --containerImage docker://centos:7
2019-09-20 15:02:50,159 | INFO     | runcontainer version: 1.0.22
2019-09-20 15:02:50,159 | INFO     | Start time: Fri Sep 20 15:02:50 2019
2019-09-20 15:02:50,160 | INFO     | Start container time: Fri Sep 20 15:02:50 2019
2019-09-20 15:02:50,160 | INFO     | Singularity command: singularity build --sandbox /home/aforti/docker_centos_7/image docker://centos:7
2019-09-20 15:03:11,443 | INFO     | 
2019-09-20 15:03:11,444 | INFO     | Deleting /home/aforti/docker_centos_7/cache.
2019-09-20 15:03:11,470 | INFO     | No input files requested
2019-09-20 15:03:11,470 | INFO     | User command: echo 'Hello World'
2019-09-20 15:03:11,470 | INFO     | Singularity command: singularity exec --pwd /data -B /home/aforti:/data  /home/aforti/docker_centos_7/image /data/_runcontainer.sh
2019-09-20 15:03:11,601 | INFO     | Hello World
2019-09-20 15:03:11,602 | INFO     | 
2019-09-20 15:03:11,602 | INFO     | End container time: Fri Sep 20 15:03:11 2019
2019-09-20 15:03:11,602 | INFO     | End time: Fri Sep 20 15:03:11 2019
rm -rf  docker_centos_7

If you want to test using the singularity in CVMFS, set the PATH before running the runcontainer script:

export PATH="/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin:$PATH"

You can also check that ALRB works with containers by doing the following:

export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase;
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh --quiet
setupATLAS -c sl6
Singularity>

ALRB picks the first singularity in the PATH, so you can test it with the local or the CVMFS singularity by changing the PATH as above.

Some explanation about different settings

Local singularity installation vs installation in /cvmfs

Since mid July 2019 (as presented at the ADC weekly 16/7/2019) ATLAS has the singularity executable in its CVMFS repository and can use it at sites that do not want to maintain a local installation. Singularity in CVMFS is configured to run in non-setuid mode and can only be used at sites that have enabled user namespaces (see below). The pilot will check for a local installation and, if it doesn't find one, will fall back to running singularity from CVMFS.
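
For illustration only (this is not the actual pilot code), the fallback is of this kind:

# use a locally installed singularity if present, otherwise the one shipped in CVMFS
if ! command -v singularity >/dev/null 2>&1; then
    export PATH="/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin:$PATH"
fi
singularity --version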

Singularity version

Supported: 3.5.3, 3.2.1, 2.6.1
Not supported: 3.4.x

Some explanations

  • 2.6.1 is more lightweight for certain use cases, i.e. it doesn't build a SIF image when executing from an OCI registry but builds a temporary sandbox instead
  • 3.2.1 supports nested containers and allows ATLAS to run pilot containers at sites that run singularity as part of the batch system.
    • To avoid building the SIF image when using v3, ATLAS now explicitly builds the temporary sandbox in its scripts when executing from an OCI registry
  • 3.4.0 FAILS and is not supported.
  • 3.5.3 allows the layers not to be cached, which is useful when building a sandbox on the fly (see the sketch below)
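
As a sketch of what this means in practice (not the exact ALRB/pilot invocation; --disable-cache is assumed to be available, it appeared in the 3.5 series):

# build a temporary unpacked sandbox from an OCI registry instead of a SIF image,
# without caching the pulled layers
singularity build --sandbox --disable-cache /tmp/centos7_sandbox docker://centos:7
singularity exec /tmp/centos7_sandbox echo 'Hello World'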

setuid vs non-setuid

ATLAS will not require one over the other because different sites have different requirements, and ATLAS can run with both. The caveats are the following.

User namespaces

  • Singularity without setuid requires user namespaces to be set up
  • Singularity will not use user namespaces if allow setuid = yes and the setuid executables are present, even if user namespaces are configured in the kernel and enabled with sysctl as indicated below.
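
To see which mode a local installation will actually end up using, a quick check (assuming the default config location) is:

grep "^allow setuid" /etc/singularity/singularity.conf
sysctl user.max_user_namespaces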

Overlay vs underlay

ATLAS needs the ability to mount directories that do not exist in the images for some of its workflows. Since 2.6.0 there are two ways of doing this in singularity: overlayfs and underlay.

  • overlay is a privileged operation that requires setuid and cannot be used if singularity uses user name spaces.
  • underlay is an unprivileged operation. It will work in either mode, ie. whether setuid is set or not.

It is recommended to enable underlay because it works in either mode. If you have singularity >=2.6.0 and underlay is not in the configuration, the configuration file is old (this typically happens when the config file has been turned into a template in puppet).
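
A simple way to spot such an outdated configuration file is:

# if this prints nothing on singularity >=2.6.0, the config file predates underlay support
grep -i "enable underlay" /etc/singularity/singularity.conf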

Enable user name spaces

If you want your site configured without setuid you need to do the following.

In CentOS7.6 (or kernels >=3.10-957) user namespaces are enabled by default in the kernel, i.e. you should find the following in /boot/config-*:

CONFIG_USER_NS=y

but they are not usable by default. To make user namespaces usable by singularity you need to do the following on ALL WNs

echo "user.max_user_namespaces = 15000" > /etc/sysctl.d/90-max_user_namespaces.conf
sysctl -p /etc/sysctl.d/90-max_user_namespaces.conf

If you don't have a local singularity installation and ATLAS uses the one in CVMFS this is all you have to do.
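
To verify that user namespaces are actually usable on a WN, a quick check (using util-linux unshare, plus the kernel config and sysctl values mentioned above) is:

grep CONFIG_USER_NS /boot/config-$(uname -r)
cat /proc/sys/user/max_user_namespaces
unshare --user --map-root-user id    # should report uid=0(root) inside the new namespace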

If you have a local singularity installation you also need to modify /etc/singularity/singularity.conf

allow setuid = no
enable overlay = no
enable underlay = yes

Without user name spaces

If you still want to use setuid - because maybe you have other users that need it - the standard configuration in 2.6.1 and 3.2.1 works for ATLAS.

Image creation

All ATLAS images are created using Docker. The ADC images are then put in CVMFS. User images may or may not be put in CVMFS depending on their popularity.

The definitions of the ADC containers (Dockerfiles) are available from GitHub, categorized by the source OS type and its major version. The Docker containers are automatically built and available from DockerHub.

The Singularity images are created on-demand on the CVMFS management machine at CERN, by importing the corresponding Docker container. The images are stored as monolithic files in /cvmfs/atlas.cern.ch/repo/containers/images/singularity and unpacked in /cvmfs/atlas.cern.ch/repo/containers/fs/singularity.

The images in CVMFS and the user images are going to be built from the same OS base image, as a hierarchy of docker layers.

Both the monolithic images and their unpacked version have a name following the convention [os_type]-[date]-[hash], where [hash] is the md5sum of the original image. For convenience the latest images are always linked to files with the convention [machine_type]-[os_type], for example x86_64-centos6. Users should always access the linked images, to make sure they use the latest version, unless there are good reasons to access an old image. The monolithic images are provided for limited usage only; users should always access the ATLAS Singularity containers via their unpacked version, in order to limit the load on the CVMFS servers.
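
For example, to run something in the latest unpacked image via its convenience link (using the x86_64-centos6 link mentioned above):

singularity exec /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos6 cat /etc/redhat-release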

Image distribution

Currently ATLAS distributes single file and unpacked images via CVMFS. The paths where they can be found are

  • Single file: /cvmfs/atlas.cern.ch/repo/containers/images/singularity/
  • Unpacked: /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/
  • WLCG common repo: /cvmfs/unpacked.cern.ch/ (work in progress, under construction)
  • docker hub: docker://<path-to-your-image>
  • gitlab: docker://gitlab-registry.cern.ch/<path-to-your-image>

HPC sites will likely still need *fat images*, i.e. images that contain not only the OS but also some of the ATLAS software. The OS part of the fat images comes from the above images in CVMFS. NERSC is currently running in this mode. ATLAS is working on building fat images as standalone images without pulling things from CVMFS. These images will be available in docker hub.

Restricting the source of the images at the site will cause jobs to fail. ATLAS is trying to group the images in well-defined areas, but physics groups might create other areas in the future; sites that want this restriction should make an effort to keep the ACLs updated. In the User images document we foresaw restricting to some areas. The current ones are these:

  • docker://gitlab-registry.cern.ch/foo/bar (CERN registry when available)
  • docker://library/* (Official images curated by Docker)
  • docker://atlas/* (ATLAS software base images)
  • docker://atlasadc/* (ATLAS ADC images)
  • docker://atlasml/* (ATLAS Machine Learning base images)
  • docker://<expert-users>/* (ATLAS expert users)

Work is also ongoing to get these images in the unpacked repository and to integrate registries with /cvmfs.

Standalone containers

Standalone containers, i.e. containers that don't need /cvmfs to work because all the software is contained in the image, have important use cases in ATLAS.

User containers

One of the most important use cases is user standalone images, which satisfy the analysis preservation requirements. Users create their images using Docker (related docs AtlasDocker) and can submit them to the grid using prun (or in the future pcontainer). The plan on how to run them is in the User images document.

Current status (18/07/2019)

  • runcontainer: container TRF development is tracked in ADCINFR-101. Now that it is a fully fledged TRF it works with both pilot1 and pilot2 at CentOS7 sites with singularity installed. runcontainer will take the options and build the singularity command line and take care of what needs to be fed in or taken out of the container (input, output, proxy, env vars,...).
  • prun: users can use prun to run this kind of container. Some options are currently needed to run containers with prun and runcontainer:
    • --noBuild: with containers there is no need for this stage and without this option the job will fail
    • --forceStage: runcontainer doesn't yet do directIO, which is the default for the analysis queues. This option will tell the pilot to download the input on disk.
    • --forceStageSecondary: same as above but for secondary datasets
    • --containerImage: this will select runcontainer TRF instead of runGen and will take the docker image as an argument.
      • Images from docker hub or public images from gitlab can be used see Image distribution above for the format.
    • dummy file: even if --noBuild is given prun still requires something to upload to panda. It is enough to touch an empty file to make it work

To run a simple Hello World! test

touch dummy
prun --containerImage docker://alpine --exec "echo 'Hello World\!'" --tmpDir /tmp --outDS user.turra.test.$(date +%Y%m%d%H%M%S) \
--noBuild --site ANALY_MANC_SL7

This should end with a stdout like this

https://aipanda167.cern.ch/media/filebrowser/023bc834-f8f3-4274-b45c-33039307e29a/user.aforti/tarball_PandaJob_4230475045_ANALY_MANC_SL7/athena_stdout.txt

The command above is currently used in Hammercloud to test basic functionality, not yet all functionality. You can select this particular Hammercloud test with this URL:

https://bigpanda.cern.ch/jobs/?processingtype=gangarobot-container

GPUs

The standalone containers above are also the basis for running on GPUs.

Current queues can be found in AGIS.

Further information is going to be in the GpuDeployment page.

HPC

Deployment at HPC sites is also a special case of standalone containers, though instead of being user containers they are production ones. The description of each needs to be documented in ContainersOnHPC. The ATLAS Release Containerization Task Force is developing the automated procedure for creation of standalone containers for ATLAS software releases, particularly for use at HPC sites.

Kubernetes

KubernetesDeployment page

Use cases sites matrix

Some use cases may overlap at some sites: for example, HPC resources don't support CVMFS and require a minimal installation on the nodes, while some other sites may have a minimal installation but support CVMFS (i.e. they can install CVMFS but not the grid middleware). So while the solutions might eventually be similar, they are not exactly the same.

| Id | Use case | Container (*) | Deployment place (*) | Sites | Number of sites |
| 1 | Installation of an OS different from SL/RHEL/CentOS | any | any | any | >100 |
| 2 | OS upgrades don't need coordination with experiments anymore | singularity (**) | experiment needs to run this either at wrapper level or at pilot level to maximise the benefit | any | >100 |
| 3 | Minimal installation on the nodes if sites prefer | any | any, but needs an image already set up by a common project (ATLAS or WLCG) | any minimal-WN site | few |
| 4 | Allows experiment to run tests with specific software or setups | singularity | has to run inside the pilot (or the wrapper) to get the right environment from sched config | any | >100 |
| 5 | May offer another approach to software distribution to sites that don't support CVMFS | singularity (**) | fat image | HPC | ~5 (number increasing) |
| 6 | Reduces the impact of ATLAS software on large shared file systems on HPC resources | singularity (**) | fat image for software distribution | HPC | ~5 (number increasing) |
| 7 | Payload isolation | any | needs to run inside the pilot to isolate multiple payloads and to isolate the directories where the payload runs, separating the payload from the pilot environment | any | >100 |
| 8 | User containers | any | they shouldn't run anything other than their content, so they can only run as part of the pilot | any | >100 |
| 9 | GPUs | singularity (**) | can run custom software | any | 4 |
| 10 | Benchmarking suite | any | similar to user containers; they can also be run by sys admins manually | any | >100 |

(*) "any" means either docker, shifter or singularity
(**) singularity or in future other rootless runtimes
(*) "any" means either batch system, wrapper or pilot (we assume users will not run containers as part of the payload).

Sites running containers as part of the batch system

It is possible to integrate containers in the batch system or as a complete WN. So far singularity within docker has been shown to work. The IBM Platform (aka LSF) batch system also provides seamless/transparent integration with Docker/Shifter/Singularity. Here are a couple of presentations about two ways of running docker at two sites using condor, and one about singularity integrated with slurm.

Since version 3.0.0 singularity can do nested containers and ATLAS can run at sites deploying singularity as part of the batch system. Currently (18/07/2019) this is being tested at some sites.

Meetings and presentations and other docs

Meetings

Docs and twikis

Contact

egroup: adc-atlas-containers-deployment@cern.ch


Major updates:

-- AlessandraForti - 2017-07-19

Responsible: AlessandraForti
Last reviewed by: Never reviewed
