
CRAB3 abstract for EGI CF 2012


CRAB3: A new era for end user data processing and production in CMS


The CMS Remote Analysis Builder (CRAB) is a tool which addresses the needs of the CMS community by allowing users to easily access the resources offered by the Grid. It aids users in configuring CMS applications for remote execution, discovering the location of input datasets, and submitting and monitoring jobs through the distributed Grid infrastructure. CRAB has progressed from a limited initial prototype nearly 5 years ago to a fully validated system heavily employed by the CMS collaboration. The CRAB team has an ambitious program planned for 2012: to release a new generation of CRAB that takes a step towards a SaaS architecture. Two key concepts of the new implementation are the centralization of services and the reduction of the sustainability cost. This work presents the joint effort of the CMS experiment and CERN IT-ES to realize this project, highlighting the impact on service maintenance and the first experiences with beta users.


Building on the experience gained with previous CRAB versions, the developers plan to release a new version of the tool that improves the sustainability of the service while solving known issues and bottlenecks. CRAB3 will be centrally deployed as an online service exposing a REST interface. Services offered by the server will be accessible through a lightweight client, which sends user requests to the server interface. The server is composed of a multi-tiered architecture in which each tier performs specific functions in the chain. Besides the REST interface, which is the entry point to the service, another central component is the WorkQueue. This tier provides a central queue for all user requests and manages the priorities among users and requests. The service is completed by distributed end tiers, called agents, which, depending on the available storage and computing resources, pull user requests from the central queue and interact with the underlying Grid layer. This multi-layer online service, which does the work behind the scenes, decouples the analysis activities from the infrastructure and allows users to focus on the analysis itself.

The multi-tiered architecture also allowed the developers to introduce new components to improve workflow result management, which was one of the major issues of the CMS analysis workflow. This is accomplished by using the local storage space of the resource as a temporary cache for the job output files and then moving them via an asynchronous transfer: once a job completes, its outputs are copied as third-party transfers through the gLite FTS servers. Finally, since the architecture described above is built on top of a commonly developed library (named WMCore), the long-term maintainability and sustainability of the tool are improved.
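The pull model described above, in which a central queue holds prioritized user requests and distributed agents acquire only as much work as their local resources allow, can be illustrated with a minimal sketch. All class and method names here are hypothetical, chosen for illustration only; they are not the actual CRAB3 or WMCore API.

```python
# Illustrative sketch of the central-queue / pull-based-agent model.
# Class and method names are hypothetical, not the real CRAB3/WMCore API.

class WorkQueue:
    """Central queue holding user requests, highest priority first."""

    def __init__(self):
        self._requests = []

    def add_request(self, user, task, priority=0):
        self._requests.append({"user": user, "task": task, "priority": priority})
        # Keep higher-priority requests at the front of the queue.
        self._requests.sort(key=lambda r: -r["priority"])

    def pull(self, capacity):
        """An agent pulls at most `capacity` requests from the front."""
        pulled, self._requests = self._requests[:capacity], self._requests[capacity:]
        return pulled


class Agent:
    """Distributed end tier: pulls work matching its free local resources."""

    def __init__(self, name, free_slots):
        self.name = name
        self.free_slots = free_slots

    def acquire_work(self, workqueue):
        work = workqueue.pull(self.free_slots)
        # A real agent would now submit these tasks to the Grid layer.
        return [w["task"] for w in work]


# Usage: the higher-priority request is served first when an agent
# with a single free slot pulls from the queue.
wq = WorkQueue()
wq.add_request("alice", "analysis-A", priority=1)
wq.add_request("bob", "analysis-B", priority=5)
agent = Agent("site-agent", free_slots=1)
print(agent.acquire_work(wq))  # ['analysis-B']
```

The key design point sketched here is that agents pull work rather than having a central service push it: the central queue never needs to know the state of every site, since each agent only takes what its local storage and computing resources can actually run.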


CMS currently observes more than 400 unique users submitting CRAB jobs per week, and close to 1000 individuals per month. The CMS Computing Technical Design Report (CTDR) estimated roughly 100k Grid submissions per day; during the second half of 2011, job submissions constantly exceeded this estimate by 40-50%. Finally, CRAB has been used extensively in the preparation of over 100 analysis papers published by CMS. For all these reasons the experiment aims to improve the reliability, usability and scalability of the analysis system, as well as to reduce the human effort needed for analysis operations. We believe the transition to CRAB3 is extremely valuable for the success of CMS Computing in reaching these objectives.


At the time of writing, the new version of CRAB is in the process of consolidating its basic functionalities. It is close to entering the commissioning phase, after which CMS will start the transition from CRAB2 to CRAB3. We present the status of the project and the experience gained during the integration period.

Track classification

Software services for users and communities



-- SpigaDaniele - 14-Nov-2011

Topic revision: r11 - 2011-11-28 - MattiaCinquilli