Week of 061009

Open Actions from last week: After the report from FNAL VOMRS, the ORACLE bug-fix upgrades can go ahead on Monday. They are scheduled to start at 10:00.

Chair: H.Renshall

Gmod: D.Bosio

Smod: T.Kleinwort


Log: Atlas fts transfers stopped at 02:00 on Sunday due to the mbranco myproxy expiry (fixed at about 12:00). CMS fts transfers stopped at the same time due to a RAC db server reboot (to be confirmed; CMS agents to be restarted).

New Actions: Check /tmp inodes on RBs. Roll out the Castor fix for stat of files larger than 2 GB. Clean up atlprod. Understand rgma failures on monb001 (new python or new glite 3.0.2?). Try to get two new global bdii servers, two new LCG CEs (in addition to putting ce107 in production) and one glite CE.
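The inode check on the RBs could be sketched as follows. This is only an illustration, not the procedure actually used: the function names and the 90% threshold are assumptions, and it relies on the standard `os.statvfs` call available on Unix systems.

```python
import os


def inode_usage(path):
    """Return (used, total, fraction_used) inodes for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_files          # total inodes on the filesystem
    used = total - st.f_ffree   # inodes currently in use
    # Some filesystems report 0 total inodes; avoid dividing by zero.
    fraction = used / total if total else 0.0
    return used, total, fraction


def inodes_low(path, threshold=0.9):
    """True if the fraction of used inodes is at or above threshold (assumed 90%)."""
    _, total, fraction = inode_usage(path)
    return total > 0 and fraction >= threshold


if __name__ == "__main__":
    used, total, fraction = inode_usage("/tmp")
    print("/tmp inodes: %d of %d used (%.1f%%)" % (used, total, 100 * fraction))
```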

Discussion: J.Eldik reported an srm bug found on Friday in which the wrong stat is used for files larger than 2 GB. This causes phedex to retry transfers of such files. Atlprod also filled up again on Friday; Atlas will provide a list of files to be deleted. M.Ernst reported that RAL migrated to Castor2 this weekend and was able to catch up rapidly on its cms backlog by running transfers at 80 MB/sec.



New Actions: ce107 to be set up. Copy experiment software tags from /opt/edg/var/info on ce105. Prepare 2 new bdii. Decide whether to merge global and experiment bdii with them. Implement cleaning of /tmp (also to work around insufficient inodes) on glite RBs.

Discussion: Atlas reconstruction jobs were failing over the weekend because too many jobs per node were filling /pool. Uli advised Atlas to restrict the number of jobs per node based on the /pool size; Atlas will also scratch staged-in raw data immediately.
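The advice to restrict job numbers based on /pool size amounts to a simple division. A minimal sketch follows, assuming a hypothetical per-job scratch footprint and a reserved safety margin; neither figure comes from the meeting.

```python
def max_jobs_per_node(pool_bytes, per_job_bytes, reserve_percent=10):
    """Number of jobs whose scratch data fits in /pool.

    pool_bytes      -- size of the /pool filesystem
    per_job_bytes   -- assumed peak scratch usage of one job (hypothetical)
    reserve_percent -- share of /pool kept free as a margin (assumed 10%)
    """
    if per_job_bytes <= 0:
        raise ValueError("per_job_bytes must be positive")
    # Integer arithmetic keeps the result exact for byte counts.
    usable = pool_bytes * (100 - reserve_percent) // 100
    return usable // per_job_bytes


GB = 10 ** 9
# Hypothetical figures: a 40 GB /pool and 5 GB of scratch per job.
print(max_jobs_per_node(40 * GB, 5 * GB))  # 36 GB usable // 5 GB = 7
```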



New Actions: ce107 firewall ports to be opened. Decision on how to deploy new bdii107 and 108 to be taken.

Discussion: The castor srm was changed at CERN, CNAF and ASGC to allow for files > 2 GB. CMS report that their relatively low FTS rates are an application issue of tuning the physics channels and that the rates will be rising.



New Actions:

Discussion: Di has provided a new rpm to clean up /tmp and /var/tmp on the glite RBs. A CPU-greedy process causing ce102 to keep disappearing from our bdii is being looked at. The new bdii107 and 108 are being prepared.
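The contents of the rpm are not described here. As an illustration only, an age-based cleanup of a directory such as /tmp might look like the sketch below; the function name, the 7-day age limit and the dry-run default are all assumptions and do not describe the rpm itself.

```python
import os
import time


def remove_old_files(root, max_age_days=7, now=None, dry_run=True):
    """Remove regular files and symlinks under root older than max_age_days.

    Returns the list of paths selected. With dry_run=True (the default)
    nothing is deleted, so the selection can be reviewed first.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    selected = []
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat looks at the link itself, not at its target.
                if os.lstat(path).st_mtime < cutoff:
                    if not dry_run:
                        os.remove(path)
                    selected.append(path)
            except OSError:
                # File vanished or is not removable; skip it.
                continue
    return selected
```

Running it first with `dry_run=True` lists what would go; deleting files frees their inodes as well as their disk space.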



New Actions: Firewall changes for bdii107 and bdii108 will be made on Monday, after which they can be put in production, one to exp-bdii and the other to lcg-bdii. Move atlas to rb103 on Monday morning.

Discussion: CMS will be starting glite bulk job submission on Monday. CERN bdii overloads are causing SAM tests to fail at other sites.

Topic revision: r5 - 2006-10-13 - HarryRenshall