-- JamieShiers - 07 Mar 2006

Status of Pre-Production System

Feedback from experiments on required services


For the pre-production services (PPS):

- LFC catalog required at all pre-production Tier1s

- FTS server at all pre-production Tier1s and at the pre-production Tier0

- FTS channels between all pre-production Tier1s. FTS channels between each pre-production Tier1 and all its associated pre-production Tier2s (in both directions). CERN should have FTS channels defined to all pre-production Tier1s.

- (There is an "SRM" column on the Excel sheet with the list of services: I guess this is required by default, right?)

[ for more details, please check ATLAS SC4 plans during Mumbai workshop ]
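The channel requirements above can be enumerated mechanically. The following is only a minimal sketch: the site names are hypothetical placeholders, and it assumes that "between all Tier1s" means a channel in each direction, which the request above does not state explicitly.

```python
from itertools import permutations


def required_channels(tier0, tier1s, tier2s_by_tier1):
    """Return the set of directed (source, destination) FTS channels."""
    channels = set()
    # Channels between all pre-production Tier1s
    # (assumption: one channel per direction).
    channels.update(permutations(tier1s, 2))
    # Channels to/from each Tier1 and its associated Tier2s.
    for tier1, tier2s in tier2s_by_tier1.items():
        for tier2 in tier2s:
            channels.add((tier1, tier2))
            channels.add((tier2, tier1))
    # The Tier0 (CERN) has channels defined to every Tier1.
    for tier1 in tier1s:
        channels.add((tier0, tier1))
    return channels


# Hypothetical site names, for illustration only.
example = required_channels(
    "CERN",
    ["T1_A", "T1_B", "T1_C"],
    {"T1_A": ["T2_X", "T2_Y"]},
)
for source, destination in sorted(example):
    print(f"{source} -> {destination}")
```

Such a list could be compared against the channels actually defined at each site to spot missing ones.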

For the production services (PS):

- Exactly the same components as for PPS

- BUT we do not need the new FTS Tier1 servers/channels for Production before June (SC4 start). We just need what we have today (FTS server at Tier0 with channels to all Tier1s). What we also have today - local LFC production catalogs at Tier0 and Tier1s and VO BOXes at Tier0 and Tier1s - is all we need up until SC4 start. (No new FTS requirements for the production service until SC4 start/June.)

> An important point on VO BOXes <

ATLAS requires VO BOXes at Tier0 and Tier1s only. BUT we do not require sites to deploy a separate VO BOX for production and for pre-production, as our code can easily coexist in the same box and is not CPU/disk intensive. So, for ATLAS, sites may declare the VO BOX PPS and VO BOX PS to be the same machine with the same endpoint. If sites or LCG decide to deploy two instances, that's fine of course, but there's no real gain in doing so.

* The catch here is updating gLite FTS client libraries - there may be conflicts between multiple versions!

* The two separate instances of our code can point to either PPS or PS.

Metrics for experiment testing of PPS


- Pre-production service will be used as soon as it is available and its usage won't go away when SC4 starts. There may be periods where the pre-production service is not extensively used, but the goal is from now on to always develop against the pre-production service.

- The first usage of the PPS will be for an intense Tier0 export test in March/April (one week). LCG and sites are free to propose the most convenient week. The goal is for all participating sites to achieve their MoU rates. This exercise would ideally run on the production service, but the goal is to exercise data management using new middleware (and we expect to have faster upgrade cycles in case of problems by using the PPS). We expect data to be stored on tape during this exercise, but it may be scratched (along with all catalog/disk entries) at a later date - to be discussed with sites. This will be very much an exercise on the Tier1 storage elements! More details on this exercise will be sent out soon, but again, please take a look at our Mumbai presentation for rates, etc.

- Before SC4 starts, the PPS will also be used for a distributed production exercise, to test the integration of data management and workload management.

Review of Experiment Plans and Site Setup

ATLAS WMS plans for 2006

ATLAS is currently running productions on the EGEE Grid using two job submission systems: the EDG Resource Broker and the Condor-G based system. Both of them are interfaced to the ATLAS production system (ProdSys) and share part of their code base.

During the last few months of 2005 and the beginning of 2006 we tested the functionality and performance of the new gLite WMS, and adapted our ProdSys executor, Lexor, to use it. The gLite WMS has new features with respect to the old RB that make it more attractive and usable: faster response time, an interface to the data management system, and the possibility of bulk submission. Although those tests were performed in a restricted environment, they showed that, already within that environment, the new WMS performs better than the old RB.

We therefore expect the new WMS to be available for large-scale testing in the context of the gLite 3.0 release, first on the pre-production service, then in the SC4 setup.

During 2006, in the context of SC4 but also for other distributed productions, we plan to submit jobs to the EGEE Grid using the gLite WMS and compare its global performance for production and analysis job submission with the Condor-G based system. Without such large-scale tests and comparisons, it will not be possible to take informed decisions on the best way to submit Grid jobs for ATLAS.

Therefore we intend to make intense use of both the gLite and the Condor-G based systems throughout 2006; we assume that their performance will confirm the present results and continue to evolve in line with the ATLAS requirements. Under this assumption, only these two systems will be used in the ATLAS productions on the LCG resources in the EGEE Grid in 2006.

Dario Barberis

Status of Background dTeam Transfers


  • Please submit reports beforehand so we address only points of clarification and / or outstanding issues.




Topic revision: r2 - 2006-03-07 - JamieShiers