Enabling New Communities on the Grid

Introduction

This document, targeted at experienced users in a new VO and at technical support people, aims to give an overview of the current process for setting up a new community (either a VO or a non-Grid community) at CERN, and to provide feedback on possible improvements of the workflow, based on the exercise carried out to enable two real communities (ILC and EnviroGRIDs). It is the result of interviews with many of the people currently involved in the process. A check list of things to do is provided at the end. For the sake of completeness we also document the steps for creating a Grid VO and the steps followed by service managers to configure a new VO in the services they manage (CE, SE, WMS, UI, WN).

The document tries to cover both established communities (with resources) and resourceless ones. In the second case it is important that the new VO representative comes with the support of a member of the IT-ES group, who will coordinate the process with the various service managers and filter out badly formulated requests. The IT-ES member will also assure the service managers that the given non-established community is indeed supposed to be supported by CERN.

How to create a new VO?

(Information extracted from the CIC portal and reported here for convenience)

  • Make sure you comply with the mandatory instructions available here
  • Fill in the web form here
    • It is important in this step to also fill in another key document, the AUP (Acceptable Use Policy), duly completed in all its parts. This is the document describing the mandate of the community and the basic rules (security, morality and others) for using the Grid infrastructure, referring to the Grid Acceptable Usage Rules, the VO Security Policy and other relevant Grid Policies. The owning body must be specified as well; in the case of EnviroGRIDs, for example, it is the enviroGRIDS @ Black Sea Catchment collaboration. An example of an AUP is given below.
  • Wait for validation by the RAG
    • Please note that the RAG body consists of two persons. Some delay might be expected when neither of them is available (e.g. simultaneous vacations)
  • Set up your VOMS server to be able to use the EGEE/EGI infrastructure, and enter the EGEE/EGI grid as an 'active' VO
  • Maintain and keep up to date all information related to your VO on the VO ID cards of the CIC Operations Portal

Example of AUP:

This Acceptable Use Policy applies to all members of envirogrids.vo.eu-egee.org Virtual Organisation, hereafter referred to as the VO, with reference to use of the LCG/EGEE Grid infrastructure, hereafter referred to as the Grid.
The EnviroGRIDS @ Black Sea Catchment collaboration owns and gives authority to this policy.

Goal and description of the VO: 
The Black Sea Catchment is internationally known as a region of ecologically unsustainable development and inadequate resource management, which has led to severe environmental, social and economic problems. The EnviroGRIDS @ Black Sea Catchment project addresses these issues by bringing together several emerging information technologies that are revolutionizing the way we are able to observe our planet. The Global Earth Observation System of Systems (GEOSS) is building a data-driven view of our planet that feeds into models and scenarios to explore our past, present and future. EnviroGRIDS aims at building the capacity of scientists to assemble such a system in the Black Sea Catchment, the capacity of decision-makers to use it, and the capacity of the general public to understand the important environmental, social and economic issues at stake. EnviroGRIDS will particularly target the needs of the Black Sea Commission (BSC) and the International Commission for the Protection of the Danube River (ICPDR) in order to help bridge the gap between science and policy.

Members and Managers of the VO agree to be bound by the Grid Acceptable Usage Rules, VO Security Policy and other relevant Grid Policies, and to use the Grid only in the furtherance of the stated goal of the VO.

Setting up a VOMS server is a site-specific issue. For setting up a VOMS server at CERN, for example, the procedure is:

  • Submit a request to voms.support@cern.ch or directly to steve.traylen@cern.ch, who will check with LCG top management whether or not to approve the request
  • Identify a VO Manager
  • Set up the VOMS and VOMS-ADMIN interfaces and then the VOMRS service for the new VO. An example is here.
  • Grant the VOC the role of VO deputy (something the VO Manager can do via VOMRS). This step offloads the VO manager and also guarantees a backup.
  • Finally, for setting up voms clients, details are available here; just change any mention of voms114.cern.ch or voms113.cern.ch to voms.cern.ch.

The document VO_Registration.pdf describes in some detail the VO registration procedure as of the beginning of the EGEE project.

More information is available at the CIC portal.

A bootstrap twiki describing in some detail how to start using grid resources via Ganga can be found here.

Overview of the process to enable a VO at CERN

CERN has a well defined policy for computing & storage resource allocation, divided into 4 categories:

  1. LHC experiments: 85%
  2. non-LHC experiments: 5%
  3. Analysis: 5%
  4. R&D: 5%

In this section we describe the process that allows a new (non-LHC) VO to run at CERN. Usually the process is triggered by the spokesperson/responsible of the new community meeting the high-level IT management. In this first appointment the use case of the community is roughly outlined. This is a rather informal meeting about the principles, and only a few technical details are really sorted out. Once there is agreement, the process moves to another level where details are worked out through several interactions between the community - usually supported by Grid experts - and IT procurement/service managers.

From now on we will refer to the "VO representative" or the "technical person in the new community" as the person who knows in some detail the needs of the VO and how to translate them into grid service requirements. Please note that this figure might not even be a member of the collaboration but rather (especially for small communities without the needed know-how) the VO contact person assigned in CERN-IT for consulting and supporting the experiment in WLCG (an IT-ES member).

At this stage Bernd Panzer (resource manager) will decide on an estimate of the resources (CPU, disk, tapes, network, services, etc.) needed for the exposed use case, integrate that into the IT planning and provide architecture/data-flow consultancy for the new project. The use case is analyzed, needs are translated into figures, and then an attempt is made to match these needs with the resources allocated for the ongoing year. This is a relatively important step because the potential need to charge the new community depends on whether the resources fit. Once this negotiation phase ends, one or more Remedy tickets are dispatched to all the requested service managers (LSF, AFS, CASTOR), either internally (e.g. Bernd to LSF support) or directly (new VO to castor.support).

At this point in time the new community must have its own (2-character) linux group and all grid VO matters must have been properly addressed; please go to the check list to verify which aspects the support contact person has to (help the VO to) determine. As of today the advancement of the process is not easily traceable and priorities are not properly handled, the Remedy ticket system being rather opaque in that respect. Expected timelines are not explicitly set in the process and the CERN commitments are not clearly documented. The review process of resources is minimal, as there should not be any major conflict.

Technical steps to add a VO on CEs

We report here the steps required by the Computing Facilities service managers (IT-PES) to set up a new VO in the system, and what the support person for the VO (or the VOC) can pragmatically do to speed things up.

Pool accounts creation

Send a request to helpdesk@cern.ch to create pool accounts. The request consists of:

  • Asking to create a new service provider in the CRA system for the new VO. This represents an abstract entity owning all the service accounts the VO needs in order to run on the Grid; it is the legal entity responsible for the pool accounts, whether production (e.g. ilcprd001), user (e.g. ilc035) or software manager (ilcsgm) accounts. A CERN staff member of the collaboration must be associated to this service provider and becomes responsible for it; this staff member is the VOC, who must have been appointed previously. This arrangement prevents the situation where the physical person explicitly registered as responsible for the VO pool accounts leaves CERN, causing the closure of these accounts in the system and thus preventing the collaboration from running on the grid (this happened once in ATLAS). In the CRA system, the service provider for GRID accounts will be something like:
<VO> (space) GRID_USER

  • Asking to reserve UIDs (i.e. to allocate them) for your grid pool accounts. All accounts will belong to the same service provider and their GID will be the one registered for your collaboration. A list of accounts must be provided, keeping in mind the size of your collaboration (see the final check list and the sketch after this list). LHC VOs at CERN have between 300 and 1500 pool accounts, and the number should be proportional to the number of people registered in VOMS. Usually for a small collaboration this means: 50 normal pool accounts, 5 production pool accounts, 1 static sgm account and 5 pilot pool accounts. Helpdesk people are adequately instructed to set this up directly with the CRA support staff. Please bear in mind that these accounts are special in the sense that they:
    • have a random password which nobody knows;
    • have a local home directory, possibly with the exception of the software manager accounts (if any), which have an AFS home directory;
    • have login blocked and never expire (account expiration also caused problems for ATLAS in the past).
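For a small collaboration, the account list to attach to the helpdesk request can be produced with a short shell loop. A minimal sketch, assuming the ilc prefix and the typical numbers quoted above (adapt the prefix and the ranges to your VO):

for i in $(seq -w 1 50); do echo ilc0$i;     done   # normal pool accounts ilc001..ilc050
for i in $(seq 1 5);     do echo ilcprd00$i; done   # production pool accounts ilcprd001..ilcprd005
for i in $(seq 1 5);     do echo ilcplt00$i; done   # pilot pool accounts ilcplt001..ilcplt005
echo ilcsgm                                         # static software manager account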

Shared area

In order to configure the grid queues properly, IT-PES people need the shared area for the new VO and its mount point to be defined. For this reason, once the new VO has successfully negotiated with the CERN resource managers the amount of CPU and storage, a request to set up a new AFS project (with new AFS volumes) for the collaboration has to be issued by sending a mail to afs.support@cern.ch. Within the same request, the supporter must ask them to allow the new SGM (software manager) accounts, if any, to obtain AFS tokens from grid credentials (gssklog). So far the shared area for software installation was handled by setting up volumes accounted on the old IT-GD project. The mount point was:

/afs/cern.ch/project/gd/apps/<VO> 

and a list of people from this old project could quickly set up the area for you. Today new communities have to ask for a new AFS project from scratch. Please bear in mind that the AFS people are concerned about legal and security matters and need the name of a responsible person to manage and administer the volume that is going to be set up. Managing an AFS project is not technically difficult (exhaustive documentation on AFS management is available here); taking on the responsibility of operating it is rather a different matter, and it is a VO representative (VOC) with the right competences who has to take it on.
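Once the AFS project and volumes exist, a few standard AFS commands let the VOC inspect and manage the area. A minimal sketch, using the ILC shared-area mount point quoted later in this page and a hypothetical ilc:sgm PTS group:

fs listquota /afs/cern.ch/project/ilcgrid/sharedarea                        # check quota and usage of the volume
fs listacl /afs/cern.ch/project/ilcgrid/sharedarea                          # see who currently has access
fs setacl -dir /afs/cern.ch/project/ilcgrid/sharedarea -acl ilc:sgm rlidwk  # grant write access to the sgm PTS group
pts membership ilc:sgm                                                      # list the members of that PTS group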

LSC and VOMS certificate files

Your first test will fail! Indeed, whether the VOMS server hosting your VO is at CERN or elsewhere, all the grid nodes that you are going to use need to know the URI of the VOMS server in order to verify the signature of your voms proxies.

  • Grid services: if your very first attempt to submit a job does not work, you will have to explicitly ask the IT-PES people running the CEs and WNs to update the default ncm-yaim configuration so that it takes the new VOMS server address into account (LSC). The site-info.def that must be produced (used in turn by the YAIM configuration tool) should look like:
----------------------------------------------------------------------
VO_ILC_VOMSES="\
'ilc grid-voms.desy.de 15110 \
/C=DE/O=GermanGrid/OU=DESY/CN=host/grid-voms.desy.de ilc 24' \
"
VO_ILC_VOMS_CA_DN="\
'/C=DE/O=GermanGrid/CN=GridKa-CA' \
"
----------------------------------------------------------------------

This operation is done once for all centrally managed grid services which use this default configuration; that is the case for the CEs and the WNs. The result on a grid node will look like:

[root@lxbsq1234 ~]# grep ILC /etc/lcg-quattor-site-info.def
GRID_ILC_GROUP_ENABLE="ilc "
VO_ILC_DEFAULT_SE="srm-public.cern.ch"
VO_ILC_QUEUES="grid_ilc"
VO_ILC_STORAGE_DIR="/castor/cern.ch/grid/ilc"
VO_ILC_SW_DIR="/afs/cern.ch/project/ilcgrid/sharedarea"
VO_ILC_VOMSES="'ilc grid-voms.desy.de 15110 /C=DE/O=GermanGrid/OU=DESY/CN=host/grid-voms.desy.de ilc'"
VO_ILC_VOMS_CA_DN="'/C=DE/O=GermanGrid/CN=GridKa-CA'"
VO_ILC_VOMS_SERVERS="vomss://grid-voms.desy.de:8443/voms/ilc?/ilc/"
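On nodes configured this way, YAIM also writes the VOMS server DNs into LSC files under /etc/grid-security/vomsdir/<VO>/. A hedged sketch of what such a file should contain for the ilc example above (the path follows the usual naming convention; the first line is the host certificate DN of the VOMS server, the second the DN of the CA that signed it):

[root@lxbsq1234 ~]# cat /etc/grid-security/vomsdir/ilc/grid-voms.desy.de.lsc
/C=DE/O=GermanGrid/OU=DESY/CN=host/grid-voms.desy.de
/C=DE/O=GermanGrid/CN=GridKa-CA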

  • If some WMS nodes also need to support the new VO (you might or might not submit via the WMS services at CERN), they need the host certificate(s) of the VOMS server(s). You have to send a mail to wms.support@cern.ch and explicitly ask them to do it; the certificates can be downloaded from the VO view in the CIC Portal:

Select the VO and scroll down to the VOMS server section(s). The download link in the ilc example is available here. The certificate then has to be turned into an rpm, unless the VOMS server is already used by another supported VO, in which case the rpm should already be present.
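Before packaging it, a quick sanity check of the downloaded certificate can be done with openssl. A minimal sketch, where the file name is only an example of what the CIC Portal download might be saved as:

openssl x509 -in grid-voms.desy.de.pem -noout -subject -issuer -enddate   # print the DN, the issuer and the expiration date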

The rest of this section describes operations carried out internally by IT-PES and is rather outside the VOC's control.

Create home directories on lxbatch and on the CEs

Currently, the home directories are created by the installation of an RPM. This will be reviewed.

  • The fs CVS repository contains a package LCG-localuser, which contains a script that creates the spec files for these rpms. PES will then have to update the script, run it, and build the new version of the rpms with rpmbuild -bb specfile (for each of them).
  • upload the created rpms to the SWrepository
  • update CDB, and add the new rpms for the new VO
  • run spma on the batch nodes and the CEs

Allow job submission to new pool accounts

  • add access to pro_access_control_grid and pro_type_lcgce3_1_x86_64_slc4

/etc/group settings for tomcat on SLC4/SLC5

This step was only necessary when adding a new user community with their own GID on a CREAM CE; as of CREAM 1.6 it is no longer needed. On a CREAM CE, yaim manipulates /etc/group, and these settings get lost when regis runs. To work around this, a special version of CERN-CC-regis_client is used which supports an extra post-script. This post-script is contained in CERN-CC-cream-ce-hacks; it simply does a usermod to restore the settings. One needed to add the new group to the list of tomcat's secondary groups.
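For reference, a minimal sketch of the kind of command such a post-script runs to restore tomcat's secondary group membership (the group name ilc is only an example):

usermod -a -G ilc tomcat   # append ilc to tomcat's secondary groups without touching the existing ones
id tomcat                  # verify that the group appears again in the list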

Prepare LSF group

You probably want to group the new pool accounts into an LSF group, in order to assign them a share (or a subgroup fraction). For this, you need to update lsf_create_shares, which is the back end of LSFweb.
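Once the group and the share are in place, they can be inspected from any LSF client node. A sketch under the assumption that both the LSF user group and the queue are called grid_ilc, as in the configuration example above (the real names may differ):

bugroup grid_ilc      # show the members of the LSF user group
bqueues -l grid_ilc   # show the queue details, including the configured share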

Prepare yaim input files

In LCG-create-users_conf you can find the script create-YAIM-users_conf, which creates the users.conf and groups.conf input files for yaim, mainly relevant on the CEs and later also on the WNs. This is under review for SLC5. The groups.conf comes from CERN-CC-lcg-ce-hacks. It needs to be updated in CVS; then build the new rpm and deploy it on the CEs (the concrete steps are listed after the sketch below).
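For reference, a hedged sketch of what single entries in these yaim input files look like; the UIDs, GIDs and account names below are made-up illustrations, not the real CERN values.

users.conf (one line per account, format UID:LOGIN:GID[,GID...]:GROUP[,GROUP...]:VO:FLAG:):
20501:ilc001:2688:ilc:ilc::
20551:ilcsgm:2688:ilc:ilc:sgm:

groups.conf (one line per FQAN, format "FQAN":group:gid:flag:[VO]):
"/ilc/ROLE=lcgadmin":::sgm:
"/ilc/ROLE=production":::prd:
"/ilc"::::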

  • update the script and create a new version of the rpm
  • update lcg-info-dynamic-scheduler.conf to ensure that the generic scheduler plugin recognizes the new LSF group properly.
  • create a new version of the hacks rpm
  • upload both to the SWrep and install them on the CEs
  • run SPMA on the CEs
  • run the yaim script on the CEs
  • reconfigure them

Check list of things to enable a community

In this part we report the list of points that the responsible of a new VO (from now on: the VOC) needs to follow to enable his VO at CERN. This is something that has to be done by a technical person within the VO, possibly with the support of a contact person within the IT-ES group. Please note that if the community already has a clear computing model available, providing this information reduces to determining the details of its implementation. Some of this information should already have been analyzed in earlier phases, at VO registration time (see new VO registration), and should already be available in the existing VO ID Card.

  • Has my spokesperson met the IT management and received a formal approval?
  • Try to understand the structure of your organization.
    • How many users/accounts do you think this community will consist of? How many users are currently supposed to access CERN resources? Who is the contact person for security matters? Who is the contact for any communication to the VO? What are your VOMS server URI and its certificate? (These are used for building static gridmap files, for allowing voms clients to work properly, and for services to accept your users' credentials.) A VOMS URI describes the service completely, and an example of the information to provide could be:
      host=lcg-voms.cern.ch, https port:8443 voms port: 15003
    • Identify special roles. Try to think about the responsibilities that can be exercised via the grid and sort out a list of FQANs (Fully Qualified Attribute Names) that would appear in the groupmap file (used in turn by the dynamic mapping mechanisms); a sketch of such mapping files is given after this check list. As a simple rule of thumb: "sgm" has write permissions on the shared area; "production" is a role held by a restricted number of people with the capability to read/write the specific storage areas where data from production activities are stored; "pilot" indicates the capability to submit empty jobs that will then instantiate a real payload; the NULL or "user" roles are for normal users to run their private (analysis) activities on the grid. An example of the information that could be provided to the system managers is:
      • the FQAN /lhcb/Role=lcgadmin has to be mapped to lhcbsgm static accounts.
      • the FQAN /lhcb/Role=pilot has to be mapped to lhcbplt[01-10] pool accounts.
      • the FQANs /lhcb and /lhcb/Role=user have to be mapped to lhcb[001-050] pool accounts
  • Try to evaluate your computing needs for the calendar year.
    • How many events do you have to process? How long (HS06.sec) does each event take? How many jobs do you expect to run? How many concurrent jobs?
    • Do you need a special OS/architecture flavour to run your application? Do you need glexec (does your Computing Model envisage running generic pilot jobs)? VMs?
    • Do you need a dedicated LSF queue? Do you just need a guaranteed share in LSF? How long should the longest queue be to fit your longest jobs? Do you need other dedicated queues, with special priorities defined for them (e.g. to run your Nagios probes and grab a WN quickly)?
    • Do you need sgm accounts? How do you install your software? How big does the shared area have to be?
    • How much (virtual) memory do your jobs require?
    • Do you need scratch area on the WN? How much?
    • Any special requirement on the WN-SE connectivity (how large is the input/output of your jobs, times the number of envisaged jobs)?
    • Do you need a CREAM CE?
    • Do you envisage spikes in your activity, or is it rather continuous over the year?
    • By when exactly do you expect these resources to be in place?
  • Try to evaluate your storage needs for the calendar year.
    • How much storage do you need at CERN? How much disk? How much tape?
    • Do you need specific service classes to be configured (CustodialNearline, CustodialOnline, ReplicaOnline, scratch) for your data? Could you try to break down the allocation of resources per type of service class?
    • Do you need space tokens configured at the SRM level? How do you want them to be mapped to the service classes?
    • Could you try to envisage which connectivity to other centres is minimally required? And internally, to the WNs?
    • Do you need the gridftp/xroot/root/rfio protocols and their respective daemons set up? How many disk servers (i.e. how many parallel connections do you think will be instantiated at most)?
    • Do you need special permissions at the stager level (e.g. can everybody issue a stage-from-tape request? Only a subset? Only well identified DNs?)
    • By when exactly do you expect these resources to be in place?
  • Any other service required to be put in place at CERN?
    • Do you need dedicated boxes to run your services (VOBOXes)? Which ports are your services running on that need to be open?
    • Do you need a dedicated UI?
    • Do you need an area on an LFC server? Do you need FTS channels to be set up?
    • Do you have special requirements as far as security is concerned? Is your data sensitive (this applies to specific communities like biomed)?
  • Contact Bernd Panzer to translate these requirements into concrete figures and try to fit them into the yearly allocated resources at CERN.
  • Appoint a VOC in your VO. For more information on the VOC mandate just have a look at this twiki. The VOC will become the liaison of your VO with the CERN IT department as well as the responsible and contact person in the VO for all IT matters.
  • Contact Nicole Cremel to create your own linux group for the project (a list of groups is available here); this will trigger the creation of your groups and permissions in LSF, AFS, etc.
  • As VOC (or as the support person in IT-ES), submit a request to the helpdesk to create your grid pool accounts as documented here.
  • As VOC (or as the support person in IT-ES), submit a request to afs.support to trigger the setup of a shared area and an AFS project for your VO, as documented here.
  • Depending on the agreement with Bernd, contact (as VOC) both lsf.support and castor.support to request the setup and configuration of the services giving access to your resources. Here is an example of an RT ticket to set up CASTOR resources previously agreed with Bernd.
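As mentioned in the "special roles" item above, here is a hedged sketch of how such FQAN choices typically end up in the mapping files on a CE; the file contents below are illustrations based on the LHCb examples above, and the real files are written by the service managers.

voms-grid-mapfile (FQAN mapped to a static account or, with a leading dot, to a pool account prefix):
"/lhcb/Role=lcgadmin" lhcbsgm
"/lhcb/Role=pilot" .lhcbplt
"/lhcb/Role=user" .lhcb
"/lhcb" .lhcb

groupmapfile (FQAN mapped to the local group used for the mapping):
"/lhcb/Role=lcgadmin" lhcbsgm
"/lhcb/Role=pilot" lhcbplt
"/lhcb" lhcb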

Asking for more hardware (VOBOXes)

Once the VO is established, the VOC is also responsible for managing the VOBOXes. A VOBOX is a generic box with a thin layer of grid middleware (just for proxy-enabled login, gsissh) used to host experiment-specific services. More information about the VOBOX service is available here. In order to ask for extra hardware for the VOBOX service, the VOC/supporter is first invited to check the general documentation available on the VOBOX administrative pages. CERN provides a web interface to submit new hardware requests: if, for example, an extra VOBOX is required, the VOC must fill in the web form available at this link. This process follows an internal procedure documented here.
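Once a VOBOX is allocated, access goes through gsissh with a VOMS proxy rather than a password. A minimal sketch, where vobox-ilc.cern.ch is a made-up host name and 1975 is the port usually used for gsissh on WLCG VOBOXes (check the value quoted for your box):

voms-proxy-init --voms ilc        # create a VOMS proxy for your VO
gsissh -p 1975 vobox-ilc.cern.ch  # log in to the VOBOX using the proxy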

Verify that resources have been properly set for your VO

Referring to this page for a Ganga-based crash course on how to start using resources on the Grid, this section mainly aims to provide practical tips to check that the basic resources have effectively been allocated at CERN. If all the examples below work for your VO, it means that you are entitled to do basic operations on your computing and storage resources at CERN.

Verify the CE

  • The first step is to verify whether you got a share in LSF. For this, just check this page (AFS account required) and go to the "Group" tab, looking for your VO group.
  • Check that your VO is entitled to submit jobs through the Grid CEs at CERN. Please note that your VO must also be enabled on the default glite WMS and on the AFS UI at CERN.
    • Log on lxplus
    • Set up the Grid UI environment and create a valid VOMS proxy (replacing lhcb in the example with your VO). Please note that the VOMS server hosting your VO must be in the default configuration if everything has been properly set up.

[lxplus313] ~/scratch0/grid/lhcb $ source /afs/cern.ch/project/gd/LCG-share/sl5/etc/profile.d/grid-env.csh
[lxplus313] ~/scratch0/grid/lhcb $ voms-proxy-init --voms lhcb
Enter GRID pass phrase:
Your identity: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=santinel/CN=564059/CN=Roberto Santinelli
Creating temporary proxy ................................. Done
Contacting  voms.cern.ch:15003 [/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch] "lhcb" Done
Creating proxy ..................................... Done
Your proxy is valid until Thu Jul  1 23:15:50 2010

    • Create a simple JDL as in the example. This is generic and can be used by just copying and pasting.


[lxplus313] ~/scratch0/grid/lhcb $ cat cern.jdl 
Executable    = "/bin/hostname";
StdOutput     = "std.out";
StdError    ="std.err";
OutputSandbox = {"std.out","std.err"};
Requirements =  ( Regexp("cern.ch",other.GlueCEUniqueId) ) ;

    • Submit your first job targeted to CERN and check the status.

[lxplus313] ~/scratch0/grid/lhcb $ glite-wms-job-submit -a cern.jdl

Connecting to the service https://wms203.cern.ch:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms203.cern.ch:9000/5mLbESxs3lnPm5oDXjMVfA

==========================================================================

[lxplus313] ~/scratch0/grid/lhcb $ glite-wms-job-status https://wms203.cern.ch:9000/5mLbESxs3lnPm5oDXjMVfA


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms203.cern.ch:9000/5mLbESxs3lnPm5oDXjMVfA
Current Status:     Running 
Status Reason:      Job successfully submitted to Globus
Destination:        ce114.cern.ch:2119/jobmanager-lcglsf-grid_lhcb
Submitted:          Thu Jul  1 11:18:38 2010 CEST
*************************************************************

  • Check that the mappings are properly set by submitting the same JDL as in the example above using multiple FQANs (as in the example below). If all the selected roles work and you can retrieve the output of your jobs, containing the name of the batch node in the CERN farm they ran on, it means that your VO has been fully enabled on the CERN computing farm.
[lxplus313] ~/scratch0/grid/lhcb $ voms-proxy-init --voms lhcb:/lhcb/Role=production 
[lxplus313] ~/scratch0/grid/lhcb $ glite-wms-job-submit -a cern.jdl 
...
[lxplus313] ~/scratch0/grid/lhcb $ voms-proxy-init --voms lhcb:/lhcb/Role=pilot 
[lxplus313] ~/scratch0/grid/lhcb $ glite-wms-job-submit -a cern.jdl 
...
[lxplus313] ~/scratch0/grid/lhcb $ voms-proxy-init --voms lhcb:/lhcb/Role=user
[lxplus313] ~/scratch0/grid/lhcb $ glite-wms-job-submit -a cern.jdl 
...
[lxplus313] ~/scratch0/grid/lhcb $ voms-proxy-init --voms lhcb:/lhcb/Role=lcgadmin
[lxplus313] ~/scratch0/grid/lhcb $ glite-wms-job-submit -a cern.jdl 
...
...
...
  • Note: the above commands might not work. This could mean that the AFS UI is not yet fully configured with your new VO customization. In order to verify the services you should then use a few local configuration files that help in setting up a proxy and submitting a grid job.
    • Add your voms endpoint to the list of local vomses files under $HOME/.glite/vomses as shown in this example (please note that the content of this file must be provided by the voms managers once the server has been set up):
[lxplus223] ~/.glite/vomses $ pwd
/afs/cern.ch/user/s/santinel/.glite/vomses
[lxplus223] ~/.glite/vomses $ cat vo.envirogrids.vo.eu-egee.org-voms.cern.ch 
"envirogrids.vo.eu-egee.org" "voms.cern.ch" "15012" "/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch" "envirogrids.vo.eu-egee.org" 
This will allow you to issue a valid proxy for your VO:
[lxplus223] ~/.glite/vomses $ voms-proxy-init --voms envirogrids.vo.eu-egee.org
    • Submit a job specifying the WMS configuration file (which should point to the endpoint that has been set up for the newly created VO):
[lxplus223] ~/public/lucasz $ glite-wms-job-submit -a -c envirogrid_wms.conf envirogrid.jdl 
where the configuration file must contain the endpoint enabled for your VO:
$ cat envirogrid_wms.conf 
[
OutputStorage  =  "/tmp";
JdlDefaultAttributes =  [
    RetryCount  =  3;
    rank  = - other.GlueCEStateEstimatedResponseTime;
    PerusalFileEnable  =  false;
    AllowZippedISB  =  true;
#    requirements  =  other.GlueCEStateStatus == "TestbedB";
    requirements  =  other.GlueCEStateStatus == "Production";
    ShallowRetryCount  =  10;
    SignificantAttributes  =  {"Requirements", "Rank", "FuzzyRank"};
    MyProxyServer  =  "myproxy.cern.ch";
    ];
virtualorganisation  =  "envirogrids.vo.eu-egee.org";
ErrorStorage  =  "/tmp";
ListenerStorage  =  "/tmp";
WMProxyEndpoints  =  {"https://wms207.cern.ch:7443/glite_wms_wmproxy_server"};
]

Verify the SE

  • Log on lxplus. Ask the CASTOR service managers which stager host and pools have been allocated to your VO; you can then check and verify the space allocated to your pools. For example, for lhcb:

bash-3.2$ export STAGE_HOST=castorlhcb.cern.ch
bash-3.2$ export STAGE_SVCCLASS="*"
bash-3.2$ stager_qry -si | grep POOL 

  • Copy a file into your space tokens through the SRM endpoint given to you. You might not have a space token (in that case just remove the corresponding option from the command line below) nor a dedicated SRM access point (in that case you will access your storage through srm-public.cern.ch). We keep the example minimal: space tokens are requested, but we do not consider the case where an LFC is used by the community.
bash-3.2$ lcg-cr -v --vo <VO> --connect-timeout 20 --srm-timeout 20 --sendreceive-timeout 20 -D srmv2 -s <Space_token>  file:///etc/group -d srm://srm-<VO>.cern.ch//castor/cern.ch/grid/<VO>/test.$$

where <Space_token> is a string published in the BDII; for LHCb, for example, it looks like LHCb_USER or LHCb_FAILOVER.

  • Check that the file has been properly copied using grid tools (you could also use nsls) and copy it back.
bash-3.2$ lcg-ls -l srm://srm-<VO>.cern.ch/castor/cern.ch/grid/<VO>/test.$$
bash-3.2$ lcg-cp srm://srm-<VO>.cern.ch/castor/cern.ch/grid/<VO>/test.$$ /tmp/test.$$

Issues encountered and possible improvements

There is evidence that the process for enabling a new VO at CERN is currently not well defined and not well documented. Already at top management level there are too many degrees of freedom, which generates room for confusion and highlights the lack of formality in this mechanism based on gentlemen's agreements. The chair of a new community might just send a private mail to the IT head, or meet privately with the LCG project leader, or issue a request to IT via the CERN DG, or just ask for a meeting with one (or all) of the stakeholders. Presenting a new VO exclusively via a single CERN body (e.g. the Physics-Services meeting) would reduce the margin of uncertainty. A presentation of the use case and requirements would take place at this level. The formal request (to be further discussed and negotiated) should finally result in an electronic document: either an RT request or - better - a new ad-hoc AIS-based application (e.g. an EDH-flavoured document).

This document would be the initialization of the entire process and would allow tracking all steps and statuses, with all actors involved, at any time. Formalizing this process with such an electronic document, with electronic signatures of the various stakeholders, is important not just for this high-level step but for all subsequent, technical, steps. Indeed, the recent case of ILC revealed that the way things are currently done may lead to inconsistent situations. CASTOR support had received a request to set up something at the storage level via an RT request issued directly by the responsible of the community in situ, after its chairperson had run a meeting with the high-level management of IT and had met with Bernd Panzer. While ILC already had Quattor templates set up for SRM and CASTOR, the LSF computing facility people were completely unaware of any request to set up a group and a share on the computing resources, whereas in the end the Quattor setup should be shared as much as possible. With a single EDH-like document these inconsistencies are avoided by design: any request outside the formal path could simply be discarded by service managers, who would expect a clear document fully approved by their line management.

The structure of this document should also include a well standardized form, to be filled in accurately by the responsible of the new VO; it must provide all the requirements that service managers consider relevant in order to properly set up the services for the VO. Service managers often receive a formal request (via mail or via RT) to set up or configure some new VO-specific service, or asking for new hardware, but usually such a request arrives incomplete and many further interactions with the issuer are needed, wasting time and resources. Standardization in these areas would also be warmly welcomed by CASTOR and LSF support. In addition to the need to formalize and standardize the workflow for new hardware or new services, the need for some reference in terms of CERN commitments has come to attention. Typical examples: when do you want this in place? Who has to pay in case the target is not achieved? What are the consequences? The proposed structure of the electronic form should also offer the possibility of setting priorities, timelines and expectations; CERN IT managers would commit to matching them or redefining them together with the client. For example: who has to set up the yaim configuration files for the new services? What does the VO (or its support) have to set up further? In that respect the current internal ticketing system based on Remedy is fairly opaque. Newcomers might not be experts in what happens behind the IT scenes, and a reference point would help them evaluate the progress.

A final observation concerns the traceability of the requests. The process as a whole is quite complex, with many factors, actions and actors involved at many levels. Delays or stops might happen and would have to be reported and/or justified. It would help a lot to have a single entry point that allows (otherwise blind) requesters to follow the workflow.

-- RobertoSantinel - 07-Jun-2010

Topic attachments
  • VO_Registration.pdf (52.9 K, 2010-08-07)