Release Planning

We are now planning a timetable for LCG-2_7_0, expected before Christmas.

Would those involved in any of the components below please send a status update to Oliver.

The list is not sorted by priority, or any other metric.

The List

| Category | Item | Responsible | Priority (1-5) | Completion (%) |
| Bug fixes | Check for any critical bugs outstanding | Oliver | 5 | 0 |
| OSG | Components needed for OSG interoperation | Laurence | 5 | 0 |
| VO-BOX | Job submission capability via Condor | x | 1 | 0 |
| VO-BOX | The VO-BOX is published in the info system. We need a mechanism that allows the VOs on the box to add their own key-value pairs. | Laurence | 3 | 0 |
| VO-BOX | List of trusted domains / networks for iptables configuration | Maarten | 4 | 0 |
| Info system | Cleanup of the info providers to make full use of the new glue schema | Laurence | 0 | 0 |
| Info system | Update of info providers to publish versions for each service | Laurence | 3 | 0 |
| Info system | Add the GlueSchemaLocation for exp installed software | Laurence | 0 | 0 |
| Info system | Add a key-value pair to the info provider for services that declares the service as being part of production. This is already published for some services, but YAIM is not configuring it. | Laurence | 0 | 0 |
| Info system | VOMS server info provider | Laurence | 0 | 0 |
| Info system | Jeff Templon's ETT (the NIKHEF ETT RPMs) | Jeff | 5 | 0 |
| LFC/DPM | ReadOnly LFC - replication | Jean-Philippe | 4 | 0 |
| LFC/DPM | VOMS enabled DPM and LFC | Jean-Philippe | 2 | 0 |
| LFC/DPM | srm-copy | Jean-Philippe | 0 | 0 |
| LFC/DPM | Using the internal catalogue as a local file catalogue | Jean-Philippe | 0 | 0 |
| Backup | Backup mechanism for the mission critical DBs on the T2 and smaller centers. Mainly DPM, local LFCs, dCache internal DBs. Losing these is critical. Has to be simple, with an option to send the backup up the chain to the corresponding T1 center (data management tools??). Probably supplied in the first instance as a HOWTO. | x | 2 | 0 |
| Separation of state and processing | Services like the CE, RB, MyProxy ... maintain extensive state information that needs to be available to restart the service in case the node crashes. We have done this for the RB, now the other services should follow. | Andrey | 0 | 0 |
| Job monitoring tools (job mon and stdOut/Err mon) | Update, reflecting the input that we receive from the users (some feedback has already been given). | Laurence | 0 | 0 |
| Job monitoring tools (job mon and stdOut/Err mon) | Ensure these are off by default. | Laurence | 0 | 0 |
| RB | Currently there is no explicit link between the jobID and the VO in the Logging and Bookkeeping DB. This makes queries like "Show me the state of all jobs of my VO" a bit complex. Which queries would the experiments like to see? | David | 0 | 0 |
| Security | Make the fork job manager be properly authorized as previously discussed | Maarten | 0 | 0 |
| Security | Determine the threats posed by pool account recycling and what we can do about them, if anything | x | 0 | 0 |
| Security | Signed RPM distribution might also be nice. | Louis | 0 | 0 |
| BDII | The top level BDIIs should be published as services by the site BDII | Laurence | 0 | 0 |
| BDII | We need an info provider that can add a few values that reflect the load on the node | Laurence | 0 | 0 |
| VO management via YAIM | Problem: Currently we decide at CERN which VOs are there by default (and which are not). Possible solution: A web based tool that displays all VOs with a short description and a comment by the ROC managers of the site's region. The site then selects the VOs and assigns shares (in case the site uses a fair share scheduler). The tool then creates the VO dependent information for YAIM. A clear distinction between pilot VOs and others has to be made. | Oliver | 0 | 0 |
| VO management via YAIM | If the web tool is not ready, make sure info for as many VOs as possible is shipped with YAIM; ensure GEANT4 and UNOSAT are there. | Oliver | 0 | 0 |
| VO management via YAIM | Default VOs are to be the 4 LHC experiments + biomed + dteam + MIS (for monitoring) | Oliver | 0 | 0 |
| Monitoring | Remove gridftp monitor from the CE. | Laurence | 0 | 0 |
| RB Sandbox | Add a smart mechanism that limits the output sandbox size on the RB. We have a mechanism for the input sandbox already in place. The recently observed jobs with stderr files > 2GB can bring down any RB. The mechanism should work like this: (1) the limit has to be configurable; (2) sort all the files in reverse order by size; (3) transfer all the files that fit into the limit; (4) for the remaining files, transfer the first and last 100K and a note on the original size of the files. | Maarten | 0 | 0 |
| VDT/Globus | Upgrade to a more recent version of VDT and Globus. This should be synchronized with OSG and gLite. | Maarten | 0 | 0 |
| R-GMA | Inclusion of the latest R-GMA (gLite 1.5) | x | 0 | 0 |

Priority: 1 is low, 5 is high.
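
The output sandbox limit described in the RB Sandbox item could be sketched roughly as follows. This is only an illustration of the four listed steps, not the RB implementation: the names plan_output_sandbox, excerpt and SandboxDecision are hypothetical, "reverse order by size" is assumed to mean largest first, and the excerpts of oversized files are assumed not to count against the limit.

```python
from dataclasses import dataclass

EXCERPT_BYTES = 100 * 1024  # "first and last 100K" from the item description


@dataclass
class SandboxDecision:
    name: str
    size: int
    full: bool  # True: transfer the whole file; False: excerpt plus size note


def plan_output_sandbox(files, limit_bytes):
    """Decide per file whether it is transferred whole or as an excerpt.

    files: dict mapping file name to size in bytes.
    limit_bytes: the configurable sandbox limit.
    """
    remaining = limit_bytes
    decisions = []
    # Assumption: "reverse order by size" means largest file first.
    ordered = sorted(files.items(), key=lambda item: item[1], reverse=True)
    for name, size in ordered:
        if size <= remaining:
            remaining -= size
            decisions.append(SandboxDecision(name, size, True))
        else:
            decisions.append(SandboxDecision(name, size, False))
    return decisions


def excerpt(data, keep=EXCERPT_BYTES):
    """Return (head, tail, note) for a file that did not fit into the limit."""
    note = f"original size: {len(data)} bytes"
    if len(data) <= 2 * keep:
        # Head and tail would overlap, so keep the content once.
        return data, b"", note
    return data[:keep], data[-keep:], note
```

With a limit of 1000 bytes and files of 3000, 500 and 200 bytes, this sketch would send the two smaller files whole and replace the 3000-byte file by its head, tail and size note.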

The multi VO FTS service will be released independently using the gLite distribution.

Still on the wishlist, but not under consideration for LCG-2_7_0

  1. All the gLite components ready for the road in time. This depends on how well gLite 1.3(4)(5) do on the pre production service.
    • Goal: gLite WLM (RB, CE, and UI)
    • Modifications needed for interoperation (gLite WLM (broker and UI) + LCG2 CEs and WNs)
    • WLM (broker):
      • Information provider
      • Verification that broker info file is there
      • Jobwrapper scripts
      • Logging and Bookkeeping and monitoring to R-GMA
      • Tarballisation of gLite UI
      • State externalization (can be done later, needed only for building a resilient system)
  2. MyProxy server consolidation
    • This service is becoming more and more used and we need to find ways to manage the access.
    • A VO for services could be introduced, with roles to allow fine-grained control. There is currently a limit of 16 roles per VO. We should check how this can be changed. There is no mapping involved; the roles are just there to simplify the control configuration.
  3. MySQL node type
    • All services except the R-GMA mon box and the RB can use one DB located on one node.
  4. Add squid as a service for ATLAS and CMS on the T1s. Motivation has been provided by Rod Walker.
    • We in ATLAS have several use cases for sites having a caching web proxy. I would think other VOs may make use of it too.
      • Ad hoc user code/data distribution.
      • Conditions/geometry/calibration data, either as flat files or via FronTier.
      • Proxy helps with private network clusters.
    • The installation needed would be standard except for an increase in the maximum size of the cached objects. An environment variable LCG_HTTP_PROXY could be set; squid would be used only if this is copied to HTTP_PROXY, so it would have no impact on other users.
    • Configuration
      • max cached object of say 2GB
      • cache size of 50GB for one VO
      • default cache turnover policy
      • location, location, location. Maybe SE?
    • NOTE: The above is based on input from Rod Walker
  5. Turn on authentication for R-GMA
  6. Pilot jobs: A service on the CE is needed to allow pilot jobs on the WN to announce a change of user for the running job. LHCb and other VOs submit pilot jobs that pull user jobs from their own task queues. To allow the site to make the final decision on who can run, and to produce proper traces, the framework of the VO needs to contact the gatekeeper and request it to accept or reject the change of user.
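
The opt-in proxy behaviour suggested in item 4 (squid is used only when LCG_HTTP_PROXY is copied to HTTP_PROXY) could look roughly like this on the job side. This is a minimal sketch under assumptions: the function name enable_site_proxy and the example proxy URL are hypothetical, and only the two environment variable names come from the text above.

```python
def enable_site_proxy(environ, opt_in):
    """Return a copy of the environment, using the site squid only on opt-in.

    environ: mapping of environment variables (for example os.environ).
    opt_in: True if the job wants to use the site's caching proxy.
    """
    env = dict(environ)  # work on a copy so other users are not affected
    site_proxy = env.get("LCG_HTTP_PROXY")
    if opt_in and site_proxy:
        # Copying the site-provided value is the explicit opt-in step.
        env["HTTP_PROXY"] = site_proxy
    return env


# Hypothetical usage: a job that wants the cache opts in explicitly.
site_env = {"LCG_HTTP_PROXY": "http://squid.example.org:3128"}
job_env = enable_site_proxy(site_env, opt_in=True)
```

Jobs that never copy the variable keep their environment unchanged, which matches the "no impact on other users" goal stated in the item.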
Topic revision: r5 - 2007-02-14 - FlaviaDonno