-- JamieShiers - 22 Jan 2008

DB-related

  1. Consistent use of GOCDB/CIC portal for announcing DB service interventions - a further reminder has been sent.
  2. ATLAS streaming problems - was this logged / reported through to dashboards? Alarmed? - Follow-up in progress.
  3. Lack of metrics for February CCRC'08 run - to be discussed and agreed at Jan 28 CCRC / Jan 29 MB.
  4. ATLAS observed that the Oracle client on Linux changes the FPU rounding-precision setting, which affects other calculations (not related to Oracle data) in the same process. Even though the rounding differences are minor (64-bit rounding instead of the default 80-bit), in boundary cases this can lead to visible differences between runs with and without Oracle. An immediate workaround has been found (setting export ORA_FPU_PRECISION=EXTENDED) and has been implemented in the ATLAS offline configuration. We are now working with Oracle support, with highest priority, to get this FPU manipulation removed from the client.
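
The effect described in item 4 can be illustrated with a small sketch. It does not touch the real FPU control word or the Oracle client; it only emulates rounding to a given number of significand bits with exact fractions. It assumes that "64-bit" refers to IEEE double (53-bit significand) and "80-bit" to x87 extended precision (64-bit significand). The helper names `round_to_bits` and `residual` are made up for this illustration.

```python
from fractions import Fraction

def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round x to the nearest value with `bits` significant binary digits."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    mag = abs(x)
    # Find e with 2**e <= mag < 2**(e + 1), using exact integer arithmetic.
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    scale = Fraction(2) ** (bits - 1 - e)
    # round() on a Fraction rounds half to even, like the IEEE 754 default mode.
    return sign * Fraction(round(mag * scale)) / scale

def residual(bits: int) -> Fraction:
    """Compute (1 + 2**-60) - 1 with every result rounded to `bits` bits."""
    tiny = Fraction(1, 2 ** 60)
    total = round_to_bits(Fraction(1) + tiny, bits)
    return round_to_bits(total - 1, bits)

double_result = residual(53)    # 53-bit significand (64-bit double)
extended_result = residual(64)  # 64-bit significand (80-bit x87 extended)

# A selection cut of the form "value > 0" now depends on the precision mode.
print(double_result > 0)    # False
print(extended_result > 0)  # True
```

The same arithmetic gives a different answer to a boundary test depending only on the rounding precision, which is the kind of run-to-run difference reported above.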

FTS-related

  1. It would be desirable to allow the user to specify a space-token on SRM.prepareToGet – this makes Castor’s pool selection algorithm more efficient. Currently the user may only specify a token on SRM.prepareToPut (i.e. the destination) in the FTS job.
  2. SRM Copy between mixed SRM v1 / v2 sites does not seem to be reliable. FTS sets the PERMANENT flag in the SRM copy request, but this does not always seem to be propagated by the SRM running the copy; this requires some investigation between FTS and the SRM vendors (specifically dCache). The problem is fixed in dCache 1.8.0-12 (patch 1 + patch 2).
  3. NDGF have been working with DESY on gridFTP2 support for dCache – this substantially improves the performance when using FTS in URLCOPY mode. NDGF also have patches to the VDT gridFTP client (used by FTS) to allow it to use the gridFTP2 protocol. We should investigate the possibility of getting those patches into the main VDT release used by gLite.
  4. DPM/dCache SRMcopy does not work because of incompatible space-token formats (dCache accepts only an integer, the spec says string, DPM provides a UUID).
  5. FTS SRM space 'user description' bug. FTS looks up the space token for the user description on the source and then tries to use it on the destination (it should look it up on the destination, not the source). This affects only SRMCOPY channels in push mode. Patch 1671: in certification 07/02/08.
  6. Proxy corruption in FTS. This is a race condition when more than one FTS submit client tries to delegate a credential to the service for the same user; if the condition is triggered, a corrupted proxy stays in the FTS database for around 8 hours, leading to total service downtime for that user. More details and the suggested workarounds are described in ServiceIssuesFtsProxyCorruption.
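
The class of race described in item 6 can be sketched generically. This is not the FTS code and not the workaround documented in ServiceIssuesFtsProxyCorruption; it is a toy model, assuming a credential is stored in two steps and that concurrent writers for the same user must be serialised. `ProxyStore`, `delegate`, `is_consistent` and the example DN are all hypothetical.

```python
import threading

class ProxyStore:
    """Toy credential store keyed by user DN (hypothetical, not the FTS schema)."""

    def __init__(self):
        self._proxies = {}
        self._user_locks = {}
        self._registry_lock = threading.Lock()

    def _lock_for(self, user):
        # One lock per user, created on first use.
        with self._registry_lock:
            return self._user_locks.setdefault(user, threading.Lock())

    def delegate(self, user, client_id):
        # The record is written in two steps. Without the per-user lock,
        # two clients could interleave and leave a cert from one client
        # paired with a key from another, i.e. a corrupted proxy.
        with self._lock_for(user):
            self._proxies[user] = {"cert": f"cert-{client_id}"}
            self._proxies[user]["key"] = f"key-{client_id}"

    def is_consistent(self, user):
        record = self._proxies[user]
        return record["cert"].split("-", 1)[1] == record["key"].split("-", 1)[1]

def run_clients(store, user, n_clients):
    """Start n_clients concurrent delegations for the same user and wait."""
    threads = [
        threading.Thread(target=store.delegate, args=(user, i))
        for i in range(n_clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

USER = "/DC=ch/DC=example/CN=some user"  # made-up DN for illustration
store = ProxyStore()
run_clients(store, USER, 8)
print(store.is_consistent(USER))
```

With the per-user lock in place, the stored certificate and key always come from the same client, whichever delegation happens to finish last.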

gfal/lcg_utils

  1. Patch 1641 might not make it through cert/pps in time for wide-spread deployment. 23/01/08 - certified: will move to PPS tomorrow

SRM-related

  1. bulk srmLs and srmRm requests - performance and scalability concerns. ATLAS expects to delete 100k files per week and per T1 during data taking.
  2. problem of choosing the correct WAN/LAN pool while performing a BringOnline operation in dCache.
    Impact - reprocessing from tape will be affected
  3. PrepareToGet requests with/without tokens: follow-up on proposal from Flavia/Maarten
  4. Various problems with srmCopy with srm v2.2
    dCache developers recommend leaving FTS channels configured in srmcopy mode because of performance issues. The difference between urlcopy and srmcopy will disappear only once clients can fully exploit the features of GridFTP2.
  5. Sites have been asked to publish specific information about the size of spaces. (Or is this a site issue?)
  6. When we talked about read/write permissions for dCache pools, it was mentioned that each user can read from those pools and that if a user wants to copy a file to her own Tier-3 disk area dCache will realise that this cannot be done from this LAN pool and first copy to a WAN pool from where this is possible. But this could be very dangerous because this would allow any user to copy files from tape at any dCache T1. Is there a way to avoid this?
    A more general concern is that no ACLs are foreseen on spaces for dCache. A reserved space can be used by any user, provided that he/she can write in the path where the file is and that there are pools allocated to that space.
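
To put the deletion volume quoted in item 1 in perspective, the arithmetic below converts 100k files per week into an average rate and into a number of srmRm calls. It reads the figure as applying to each Tier-1 separately. The bulk size of 1000 files per request is purely an assumption for illustration, since no bulk size is given here, and `requests_needed` is a hypothetical helper.

```python
FILES_PER_WEEK = 100_000            # figure quoted by ATLAS, per Tier-1
SECONDS_PER_WEEK = 7 * 24 * 3600    # 604800 seconds

# Average sustained deletion rate if the load were spread evenly.
avg_rate = FILES_PER_WEEK / SECONDS_PER_WEEK   # files per second
files_per_day = FILES_PER_WEEK / 7

def requests_needed(total_files: int, files_per_request: int) -> int:
    """Number of srmRm calls needed, rounding up (ceiling division)."""
    return -(-total_files // files_per_request)

ASSUMED_BULK_SIZE = 1000  # assumption, not taken from the issue list

print(f"{avg_rate:.3f} files/s on average")
print(f"{files_per_day:.0f} files/day")
print(requests_needed(FILES_PER_WEEK, 1), "single-file requests per week")
print(requests_needed(FILES_PER_WEEK, ASSUMED_BULK_SIZE), "bulk requests per week")
```

The average rate is modest; the concern is more plausibly about bursts and the per-request cost, which is why the choice of bulk size matters.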

VOMS-related

  1. On-going service reliability concerns and medium-term (if that is the right term for immediately after the February run of CCRC'08) support concerns.
  2. Problem hitting VOMS replication to BNL: Remi has applied the patch to Axis-1.2 and is now waiting for confirmation from ATLAS that this actually fixes the problem (see comment 10 in bug https://savannah.cern.ch/bugs/?32473). A proper solution, moving to Axis-1.4, will not be available for the CCRC (all clients have to be re-tested). I agree with Michael that we should additionally test VOMS against standard clients. We will look into this when we certify the next VOMS patch.
Topic revision: r13 - 2008-02-20 - GavinMcCance
 