WLCG Operations and Tools TEG - WG3 Software Management, Software Configuration and Deployment Management

Application Software Management and Configuration Management

The software stack of the LHC experiments is layered into several levels. At the bottom is the operating system (e.g. Scientific Linux), on top of it a set of common libraries used by the LHC experiments (e.g. ROOT, COOL, CORAL, Boost, mysql, Python, etc.), and the topmost layer consists of the experiment specific applications for data reconstruction, analysis and simulation. Essential needs of the experiments are

  • the possibility of a fast turnaround of package versions and their grid-wide deployment
  • decoupling of package versions from the operating system, to allow more recent versions than the natively provided ones
  • availability of the same package versions on different operating systems, e.g. on other development platforms (Mac)
  • the possibility to fix package versions, essential for reproducibility

The most widely used operating system, Scientific Linux (SL), is provided by teams at CERN and Fermilab. The common software layer for the LHC experiments is steered by PH/SFT through the "LCG Applications Area", where the set of common libraries and their versions is discussed and defined. The topmost layer, the experiment specific applications, is developed individually within the experiments. The software management and configuration of these layers is driven by several tools, some of which are used in common by multiple experiments.

... missing paragraph on grid middleware ...

CMT

CMT (http://www.cmtsite.org) is the main tool currently in use by ATLAS, LHCb and PH/SFT. CMT implements its own configuration language for package versioning, package building and the setup of the corresponding runtime environment.
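
To give an impression of the CMT language, a minimal requirements file could look like the sketch below; the package name, versions and the LCG_Interfaces offset are illustrative and not taken from a real configuration.

    package MyAnalysis
    version v1r0

    # pick up common packages at fixed versions (names and versions illustrative)
    use ROOT   v5.30.02   LCG_Interfaces
    use Boost  v1.44.0    LCG_Interfaces

    # build a library from the sources of this package
    library MyAnalysis *.cxx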

Usually 2 to 3 times a year the PH/SFT group provides a so-called "LCGCMT configuration", which denotes a baseline of ~120 common packages provided to the experiments. The packages within such a configuration comprise the "self developed" projects (ROOT, COOL, CORAL), "external packages" recompiled from source (e.g. gcc, Boost, Python, mysql, frontier) and grid middleware clients, so far provided by gLite. These configurations are augmented by bug-fix releases if needed (usually 5-6). Experiments using CMT take these LCGCMT configurations and build their specific applications on top. To allow an even faster turnaround of packages, LHCb has introduced an experiment specific "LHCbGrid" configuration, which can override the package versions and instructions provided by LCGCMT for grid middleware packages.
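
The mechanics of such configurations can be sketched as follows; the macro names and version numbers are written from memory and only illustrate the principle, they are not copied from a real LCGCMT or LHCbGrid release.

    # fragment in the style of an LCG configuration requirements file
    macro Boost_config_version   "1.44.0"
    macro Python_config_version  "2.6.5"
    macro gfal_config_version    "1.11.8"

    # an experiment specific configuration (e.g. LHCbGrid) can redefine such
    # a macro to pick up a newer middleware client without a new LCGCMT release
    macro gfal_config_version    "1.12.0"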

LCGCMT configurations provide a common baseline for the LHC experiments with immutable package versions. They decouple package versions from the operating system, usually allow a faster turnaround of new package versions, and permit the use of versions more recent than those provided by the underlying OS. Package versions are fixed down to the patch level without relying on OS package deployment, which can differ between sites; this also facilitates the reproduction of problems. "External packages" can be patched if needed without depending on the OS vendor. At the moment only a limited set of operating systems is available and certified; with more manpower more OS versions could be provided. A concern is the future of the grid middleware packages provided to PH/SFT after the move to EMI.

RPM

Spec files derived from templates are augmented with version numbers and used for package building. A set of spec files sharing a global tag denotes a self-contained set of compatible versions for the CMS software stack. The "cmsbuild" script subsequently uses one of these global tags to build the binary RPMs used for later deployment. The advantage of this solution is the use of a standard tool that has been available and known in the development community for a long time. Dependency management is also handled by rpm in a standard way.
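
As a pointer to what such spec files look like, the following heavily abbreviated sketch shows the structure; the package name, version and dependencies are invented and the real templates carry much more detail.

    # abbreviated, illustrative spec file (name, version, dependencies invented)
    Name:      cms-example-tool
    Version:   1.2.0
    Release:   1
    Summary:   Example package of the CMS software stack
    License:   GPL
    Requires:  root, boost

    %description
    Example package; the version number is filled in from a template.

    %build
    ./configure --prefix=%{_prefix} && make

    %install
    make install DESTDIR=%{buildroot}

    %files
    %{_bindir}/*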

SCRAM

SCRAM is used by CMS for building the experiment specific projects (CMSSW, CORAL, COOL, ...). The dependent packages are linked into the local area before the build process starts; SCRAM then takes care of compiling and linking the binary products.
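
For orientation, a typical interactive SCRAM session looks roughly as follows; the release name is a placeholder.

    # create a local developer area based on an existing release (name is a placeholder)
    scram project CMSSW CMSSW_4_2_8
    cd CMSSW_4_2_8/src

    # set up the runtime environment and build the checked out packages
    eval `scram runtime -sh`
    scram build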

GNU Make

CMake

Currently under investigation by PH/SFT and the experiments.

Yaim

Deployment Management

CVMFS

Cvmfs is a centrally deployed filesystem that is distributed to remote users via several levels of caches. The local cvmfs client only downloads the files needed to perform its work, usually a fraction of the deployed software area. For software deployment, the librarians of the different VOs install their software on a "release node", from which the files are replicated to a "Stratum 0" node; this is the root node of the software deployment. Several "Stratum 1" nodes are attached to this root node (currently operational at CERN, RAL and BNL, coming soon at Fermilab and Taiwan) and provide additional copies of the software tree. The final replication to the sites is done via squid caches, which the cvmfs client contacts to retrieve the necessary files and download them persistently into a locally mounted cache.
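
To indicate what is needed on the client side, a minimal cvmfs configuration could look like the following; the repository list, proxy URL and cache size are examples only.

    # /etc/cvmfs/default.local (values are examples)
    CVMFS_REPOSITORIES=atlas.cern.ch,lhcb.cern.ch
    CVMFS_HTTP_PROXY="http://squid.example.org:3128"
    CVMFS_QUOTA_LIMIT=10000        # size of the local cache in MB

    # verify that the configured repositories can be mounted
    cvmfs_config probe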

Currently cvmfs is deployed at ~80 sites and hosts 15 volumes (~2 TB) for the LHC experiments and other VOs. ATLAS and LHCb use the system in production on the grid. Cvmfs provides server-side monitoring and a Nagios probe that sites can use for local testing.

The advantage of this system is that software is deployed in a single place, which reduces the workload for software librarians and deployment people. In addition, any changes made on the root Stratum 0 node become visible with very little delay on all connected clients.

Tarballs

This approach provides individual tarballs for the experiment common software layers and the individual projects. The software is deployed via special SAM jobs with privileges to write into the sites' shared software area. It offers an easy way of deploying software both on grid sites and on individual user machines. The system is quite old but has proven effective enough for software deployment on grid sites.
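
The essence of what such a deployment job does can be sketched as follows; the variable name, URL and paths are placeholders, and the real SAM jobs are more elaborate.

    # rough sketch of a deployment job (variable, URL and paths are placeholders)
    cd "$VO_SW_DIR"                        # the site's shared software area
    mkdir -p project/X.Y.Z
    wget http://repository.example.org/project-X.Y.Z.tar.gz
    tar xzf project-X.Y.Z.tar.gz -C project/X.Y.Z
    rm project-X.Y.Z.tar.gz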

Packman

RPM / apt

A modified version of rpm / apt, adapted to allow execution in user space, is used for software deployment. This version provides its own rpm database, which is disconnected from the one of the operating system. (how is the deployment on the grid done?)
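
The underlying idea of a user-space rpm database can be sketched with standard rpm options as below; the paths and package name are placeholders and the actual tool packages this differently.

    # sketch: private rpm database in user space (paths and package name are placeholders)
    export MYAREA=$HOME/sw
    mkdir -p $MYAREA/var/lib/rpm

    # initialise the private database, then install a relocatable package into the area
    rpm --dbpath $MYAREA/var/lib/rpm --initdb
    rpm --dbpath $MYAREA/var/lib/rpm --prefix $MYAREA -ivh example-1.0-1.rpm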

Torrent

Additional Info

Contributors

Jakob Blomer, Marco Clemencic, Andreas Pfeiffer, Marco Cattaneo,

-- StefanRoiser - 09-Nov-2011
