Daily Log Archive 2007
These are the daily logs of TransferOperationsDailyLog. Older daily logs can be found at TransferOperationsDailyLogArchived.
20 December 2007
- ASCC: all castor2 services will be down from 01:30 AM (UTC) to 10:30 AM (UTC) on 2007-12-20
- SARA: emergency maintenance today, 20/12/07, affecting the CERN-SARA link
- 0 tickets opened, 0 moved, 0 solved
19 December 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 170 to 410 Mb/s, averaging around 280MB/s.
- The most active sites are FNAL and FZK
- Mostly traffic from CMS
18 December 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 0 to 400 Mb/s, averaging around 120MB/s.
- The most active sites are FZK and IN2P3
- Mostly traffic from CMS
17 December 2007
- 0 tickets opened, 1 moved, 1 solved
- Transfer ranging from 100 to 450 Mb/s, averaging around 280MB/s.
- The most active sites are FZK and IN2P3
- Mostly traffic from CMS and less from Alice
30417
- (2007.12.14) FTS transfer - SRM problem on site IN2P3-CC
SOLVED
vo: atlas
[313] SOURCE during PREPARATION phase: [REQUEST_TIMEOUT] failed to prepare source file in 180 seconds
Reason: This problem may be solved by the modifications in PoolManager.conf
16 December 2007
- Transfer ranging from 90 to 150 Mb/s, averaging around 115MB/s.
- The most active sites are ASCC and IN2P3
- Mostly traffic from CMS
15 December 2007
- Transfer ranging from 90 to 230 Mb/s, averaging around 150MB/s.
- The most active sites are FNAL and IN2P3
- Mostly traffic from CMS
14 December 2007
- 1 tickets opened, 1 moved, 1 solved
- Transfer ranging from 190 to 660 Mb/s, averaging around 350MB/s.
- The most active sites are FNAL and IN2P3
- Mostly traffic from CMS
30417
- (2007.12.14) FTS transfer - SRM problem on site IN2P3-CC (moved to 2007.12.17)
30340
- (2007.12.13) FTS transfer - SRM problem on site SARA-MATRIX
SOLVED
vo: atlas, cms
[306] failed to prepare Destination file in 180 seconds
13 December 2007
- 1 tickets opened, 0 moved, 0 solved
- Transfer ranging from 10 to 420 Mb/s, averaging around 160MB/s.
- The most active sites are FNAL and IN2P3
- Mostly traffic from CMS
30340
- (2007.12.13) FTS transfer - SRM problem on site SARA-MATRIX (moved to 2007.12.14)
12 December 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 130 to 420 Mb/s, averaging around 250MB/s.
- The most active sites are FNAL and INFN
- Mostly traffic from CMS
11 December 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 200 to 550 Mb/s, averaging around 310MB/s.
- The most active sites are FNAL, FZK and IN2P3
- Mostly traffic from CMS
10 December 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 200 to 450 Mb/s, averaging around 300MB/s.
- The most active sites are FNAL, ASCC and IN2P3
- Mostly traffic from CMS
9 December 2007
- Transfer ranging from 10 to 580 Mb/s, averaging around 380MB/s.
- The most active sites are ASCC, FNAL, IN2P3 and INFN-T1
- Mostly traffic from CMS
8 December 2007
- FZK: because of the application of the latest Oracle patch, the 3D ATLAS and LHCb databases at GridKa/FZK will be "at risk" on Saturday 08/12/2007 from 10:00 UTC to 14:00 UTC.
- Transfer ranging from 260 to 480 Mb/s, averaging around 380MB/s.
- The most active sites are ASCC, FNAL, IN2P3 and INFN-T1
- Mostly traffic from CMS
7 December 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 120 to 530 Mb/s, averaging around 270MB/s.
- The most active sites are ASCC and FNAL
- Mostly traffic from CMS
6 December 2007
- 1 tickets opened, 0 moved, 1 solved
- TRIUMF Planned/Scheduled 2hr Outage TRIUMF-LCG2 - FTS Thurs Dec 6 14:00 (local)
- Transfer ranging from 60 to 340 Mb/s, averaging around 150MB/s.
- The most active sites are ASCC and PIC
- Mostly traffic from CMS
30106
- (2007.12.06) FTS transfer - SRM problem on site IN2P3-CC
SOLVED
vo: cms
[90] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] call. Error is RequestFileStatus#-[] failed with error:[ at Wed Feb 21 12:18:44 CET 2007 state Failed : file not found : path [path] not found
------
[313] SOURCE during PREPARATION phase: [REQUEST_TIMEOUT] failed to prepare source file in 180 seconds
Reason: no files on SE, under investigation
5 December 2007
- 0 tickets opened, 1 moved, 1 solved
- Transfer ranging from 80 to 230 Mb/s, averaging around 140MB/s.
- The most active sites are INFN-T1 and ASCC
- Mostly traffic from CMS
30027
- (2007.12.04) FTS transfer - SRM problem on site FZK-LCG2
SOLVED
vo: alice, atlas, cms, lhcb
[313] SOURCE during PREPARATION phase: [REQUEST_TIMEOUT] failed to prepare source file in 180 seconds
Reason: The files were written in June, and at that time there was a bug in their tape connection script which very rarely caused data loss. They will now check whether any more files are affected; this failure can no longer occur for newly written files.
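The follow-up check FZK describes (finding any other files hit by the old tape-connection bug) could be scripted roughly as below. This is only an illustrative sketch, not the procedure FZK actually ran: it assumes a plain-text list of suspect pnfs paths and that the pnfs namespace is NFS-mounted on the node doing the check (as the "ls /pnfs/..." output quoted later in this log suggests).

    #!/usr/bin/env python
    # Illustrative sketch: report which paths from a suspect list are missing
    # from the (NFS-mounted) pnfs namespace. List file and mount are assumptions.
    import os
    import sys

    def missing_files(list_file):
        missing = []
        with open(list_file) as fh:
            for line in fh:
                path = line.strip()
                if path and not os.path.exists(path):
                    missing.append(path)
        return missing

    if __name__ == "__main__":
        for path in missing_files(sys.argv[1]):
            print("MISSING: %s" % path)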
4 December 2007
- 1 tickets opened, 0 moved, 0 solved
- Transfer ranging from 150 to 370 Mb/s, averaging around 270MB/s.
- The most active sites are INFN-T1, ASCC and BNL
- Mostly traffic from CMS
30027
- (2007.12.04) FTS transfer - SRM problem on site FZK-LCG2 (moved to 2007.12.05)
3 December 2007
- 0 tickets opened, 0 moved, 0 solved
- RAL-LCG2 will be unavailable from 9am until 5pm UTC; all Castor and dCache SRMs will be unavailable, and during this time dCache will be upgraded to version 1.8. On the preceding Friday afternoon (30th November), the batch queues will be closed and any work not expected to finish before the start of the downtime will not be started
- Transfer ranging from 150 to 520 Mb/s, averaging around 370MB/s.
- The most active sites are INFN-T1, FZK and PIC
- Mostly traffic from CMS
2 December 2007
- Transfer ranging from 150 to 300 Mb/s, averaging around 210MB/s.
- The most active sites are INFN and PIC
- Mostly traffic from CMS
1 December 2007
- Transfer ranging from 140 to 300 Mb/s, averaging around 220MB/s.
- The most active sites are FNAL and PIC
- Mostly traffic from CMS
30 November 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 140 to 240 Mb/s, averaging around 180MB/s.
- The most active sites are FNAL and PIC
- Mostly traffic from CMS
29 November 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 160 to 270 Mb/s, averaging around 200MB/s.
- The most active sites are FNAL and PIC
- Mostly traffic from CMS
28 November 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 170 to 420 Mb/s, averaging around 270MB/s.
- The most active sites are FNAL and PIC
- Mostly traffic from CMS
27 November 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 180 to 470 Mb/s, averaging around 280MB/s.
- The most active sites are FNAL and PIC
- Mostly traffic from CMS
26 November 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 100 to 250 Mb/s, averaging around 180MB/s.
- The most active sites are FZK and PIC
- Mostly traffic from CMS
25 November 2007
- Transfer ranging from 110 to 200 Mb/s, averaging around 150MB/s.
- The most active sites are RAL and ASCC
- Mostly traffic from CMS
24 November 2007
- Transfer ranging from 110 to 200 Mb/s, averaging around 150MB/s.
- The most active sites are FZK and IN2P3
- Mostly traffic from CMS
23 November 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 110 to 180 Mb/s, averaging around 140MB/s.
- The most active sites are FZK and IN2P3
- Mostly traffic from CMS
22 November 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 190 to 450 Mb/s, averaging around 300MB/s.
- The most active sites are FZK and IN2P3
- Mostly traffic from CMS
21 November 2007
- 0 tickets opened, 1 moved, 1 solved
- Transfer ranging from 70 to 210 Mb/s, averaging around 130MB/s.
- The most active sites are FZK and IN2P3
- Mostly traffic from CMS
29085
- (2007.11.15) FTS transfer - SRM problem on site PIC
SOLVED
vo: atlas, lhcb
[309] FINAL:DESTINATION: failed to contact on remote SRM [httpg://srm-disk.pic.es:8443/srm/managerv1]. Givin' up after 3 tries
20 November 2007
- 0 tickets opened, 1 moved, 0 solved
- RAL-LCG2 CMS CASTOR downtime on 2007-11-20 08:30-17:00
- Transfer ranging from 30 to 240 Mb/s, averaging around 130MB/s.
- The most active sites are SARA and FZK
- Mostly traffic from CMS
29085
- (2007.11.15) FTS transfer - SRM problem on site PIC (moved to 2007.11.21)
19 November 2007
- 0 tickets opened, 2 moved, 1 solved
- Reminder: RAL-LCG2 ATLAS CASTOR downtime on 2007-11-19 08:30-17:00
- Transfer ranging from 20 to 350 Mb/s, averaging around 200MB/s.
- The most active sites are FZK and FNAL
- Mostly traffic from CMS
29085
- (2007.11.15) FTS transfer - SRM problem on site PIC (moved to 2007.11.20)
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2
SOLVED
vo: cms, atlas
[315] SOURCE during PREPARATION phase: [CONNECTION] service timeout during [SrmGet];
Reason: this was due to the high load on the castor v1 stager; we disabled the migrator on the stager, which reduced the average load from 50 to less than 5. A simple lcg-cp from the castor v1 endpoint had no problem reading the file from the backend disk servers, and further testing with FTS transfer submissions via the ASGC FTS channel also showed no errors replicating data from castor v1 to the local DPM server.
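The kind of verification described here (a plain lcg-cp read from the castor v1 endpoint, timed against the 180-second FTS preparation limit) could be scripted roughly as below. This is a hedged sketch only: the VO, SURL and destination are placeholders, it simply wraps the lcg-cp client the site reports using, and it is not the exact command ASGC ran.

    #!/usr/bin/env python
    # Illustrative sketch: time a single lcg-cp read of one SURL and compare the
    # elapsed time against the 180 s FTS preparation limit. SURL is a placeholder.
    import subprocess
    import time

    def timed_copy(vo, surl, dest="file:///tmp/srm-read-test"):
        start = time.time()
        rc = subprocess.call(["lcg-cp", "-v", "--vo", vo, surl, dest])
        return rc, time.time() - start

    if __name__ == "__main__":
        rc, elapsed = timed_copy("cms", "srm://castor.example.org/castor/test/somefile")
        print("exit=%d elapsed=%.1fs (FTS limit: 180s)" % (rc, elapsed))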
18 November 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 50 to 360 Mb/s, averaging around 230MB/s.
- The most active sites are FZK, FNAL and ASCC
- Mostly traffic from CMS
17 November 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 260 to 680 Mb/s, averaging around 500MB/s.
- The most active sites are FZK, IN2P3, FNAL and ASCC
- Mostly traffic from CMS
16 November 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 440 to 620 Mb/s, averaging around 550MB/s.
- The most active sites are FZK, IN2P3, FNAL and PIC
- Mostly traffic from CMS
29085
- (2007.11.15) FTS transfer - SRM problem on site PIC
vo: atlas, lhcb
[309] FINAL:DESTINATION: failed to contact on remote SRM [httpg://srm-disk.pic.es:8443/srm/managerv1]. Givin' up after 3 tries
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2
vo: cms, atlas
[315] SOURCE during PREPARATION phase: [CONNECTION] service timeout during [SrmGet];
15 November 2007
- 1 tickets opened, 2 moved, 1 solved
- Transfer ranging from 380 to 620 Mb/s, averaging around 490MB/s.
- The most active sites are IN2P3, FNAL and ASCC
- Mostly traffic from CMS
29085
- (2007.11.15) FTS transfer - SRM problem on site PIC
vo: atlas, lhcb
[309] FINAL:DESTINATION: failed to contact on remote SRM [httpg://srm-disk.pic.es:8443/srm/managerv1]. Givin' up after 3 tries
28984
- (2007.11.13) Transfer problems between sites CERN-PROD and FZK-LCG2
SOLVED
vo: cms
[11] an end-of-file was reached (possibly the destination disk is full)
-----------
[12] 451 Local resource failure: malloc: Cannot allocate memory
Reason: Since the update to dCache 1.8 last week we have seen problems with the SRM/SpaceManager and with pnfs. On Tuesday afternoon SRM stopped working and a restart of SRM was needed. We are working on these problems together with the dCache developers.
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2
vo: cms, atlas
[315] SOURCE during PREPARATION phase: [CONNECTION] service timeout during [SrmGet];
14 November 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 300 to 540 Mb/s, averaging around 360MB/s.
- The most active sites are IN2P3, FNAL and ASCC
- Mostly traffic from CMS and less Atlas
28984
- (2007.11.13) Transfer problems between sites CERN-PROD and FZK-LCG2 (moved to 2007.11.15)
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.15)
13 November 2007
- 1 tickets opened, 1 moved, 0 solved
- Transfer ranging from 300 to 600 Mb/s, averaging around 450MB/s.
- The most active sites are IN2P3, FNAL and ASCC
- Mostly traffic from CMS and less Atlas
28984
- (2007.11.13) Transfer problems between sites CERN-PROD and FZK-LCG2 (moved to 2007.11.14)
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.14)
12 November 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 300 to 700 Mb/s, averaging around 450MB/s.
- The most active sites are IN2P3, FNAL and ASCC
- Mostly traffic from CMS and less Atlas
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.13)
11 November 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 270 to 330 Mb/s, averaging around 350MB/s.
- The most active sites are ASCC, IN2P3 and FNAL
- Mostly traffic from CMS and less Atlas
10 November 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 330 to 680 Mb/s, averaging around 470MB/s.
- The most active sites are ASCC and IN2P3
- Mostly traffic from CMS and less Atlas
9 November 2007
- 0 tickets opened, 2 moved, 1 solved
- Transfer ranging from 450 to 930 Mb/s, averaging around 550MB/s.
- The most active sites are ASCC, IN2P3, TRIUMF and SARA
- Mostly traffic from CMS and Atlas
28778
- (2007.11.08) Transfer problems between sites CERN-PROD and INFN-T1
SOLVED
vo: cms, atlas
[3] FINAL:TRANSFER: globus_l_ftp_control_read_cb: Error while searching for end of reply
Reason: Possibly related to this bug: https://savannah.cern.ch/bugs/index.php?29930
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.12)
8 November 2007
- 1 tickets opened, 1 moved, 0 solved
- Transfer ranging from 300 to 830 Mb/s, averaging around 550MB/s.
- The most active sites are TRIUMF, IN2P3 and SARA
- Mostly traffic from CMS and Atlas
28778
- (2007.11.08) Transfer problems between sites CERN-PROD and INFN-T1 (moved to 2007.11.08)
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.08)
7 November 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 190 to 720 Mb/s, averaging around 430MB/s.
- The most active sites are TRIUMF, IN2P3 and FZK
- Mostly traffic from CMS and Atlas
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.07)
6 November 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 300 to 730 Mb/s, averaging around 480MB/s.
- The most active sites are ASCC, FNAL and PIC
- Mostly traffic from CMS and Atlas
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.07)
5 November 2007
- 0 tickets opened, 1 moved, 0 solved
- There will be a scheduled maintenance from 18:00 until 19:30 on our network infrastructure; Within this time frame the sites SARA-MATRIX and SARA-LISA will not be available for approximately 15 minutes;
- Planned Outage for TRIUMF-LCG2 FTS - Mon Nov 5th 14:00;
- Transfer ranging from 500 to 1320 Mb/s, averaging around 860MB/s.
- The most active sites are BNL, FNAL, INFN-T1 and RAL
- Mostly traffic from CMS and Atlas
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.06)
4 November 2007
- Transfer ranging from 320 to 840 Mb/s, averaging around 600MB/s.
- The most active sites are BNL and RAL
- Mostly traffic from CMS and Atlas
3 November 2007
- Transfer ranging from 500 to 980 Mb/s, averaging around 750MB/s.
- The most active sites are BNL, IN2P3, TRIUMF and FNAL
- Mostly traffic from CMS and Atlas
2 November 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 600 to 1220 Mb/s, averaging around 800MB/s.
- The most active sites are BNL, IN2P3, TRIUMF and FNAL
- Mostly traffic from CMS and Atlas
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2
vo: cms, atlas
[315] SOURCE during PREPARATION phase: [CONNECTION] service timeout during [SrmGet];
1 November 2007
- 1 tickets opened, 0 moved, 0 solved
- Transfer ranging from 580 to 1000 Mb/s, averaging around 850MB/s.
- The most active sites are BNL, IN2P3, TRIUMF and FNAL
- Mostly traffic from CMS and Atlas
28546
- (2007.11.01) SRM problem on site TAIWAN-LCG2 (moved to 2007.11.02)
31 October 2007
- 0 tickets opened, 0 moved, 0 solved
- Network problem at GridKa; traffic to and from GridKa is affected for all VO's
- Transfer ranging from 680 to 1210 Mb/s, averaging around 950MB/s.
- The most active sites are BNL and FNAL
- Mostly traffic from CMS and less from Atlas
30 October 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 300 to 1100 Mb/s, averaging around 520MB/s.
- The most active sites are FNAL and INFN-T1
- Mostly traffic from CMS and less from Atlas
29 October 2007
- 0 tickets opened, 0 moved, 0 solved
- NDGF: in the week starting Monday the 29th, NDGF will upgrade the dCache installation to 1.8. All data to and from srm.ndgf.org will be unavailable.
- Transfer ranging from 20 to 550 Mb/s, averaging around 320MB/s.
- The most active sites are FNAL and IN2P3
- Mostly traffic from CMS
28 October 2007
- Transfer ranging from 150 to 350 Mb/s, averaging around 250MB/s.
- The most active sites are INFN-T1 and FNAL
- Mostly traffic from CMS
27 October 2007
- Transfer ranging from 30 to 420 Mb/s, averaging around 250MB/s.
- The most active sites are INFN-T1
- Mostly traffic from CMS less from LHCb
26 October 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 200 to 560 Mb/s, averaging around 400MB/s.
- The most active sites are FNAL, RAL and IN2P3
- Mostly traffic from CMS
25 October 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 300 to 550 Mb/s, averaging around 420MB/s.
- The most active sites are INFN and IN2P3
- Mostly traffic from CMS
24 October 2007
- 0 tickets opened, 0 moved, 0 solved
- SARA: due to a problem with the SAN which holds the dCache pnfs database, we were forced to power-cycle the storage system, resulting in unscheduled downtime of our dCache SE srm.grid.sara.nl. At the moment we are starting everything up again.
- Transfer ranging from 350 to 710 Mb/s, averaging around 520MB/s.
- The most active sites are ASGC, FNAL and INFN-T1
- Mostly traffic from CMS
23 October 2007
- 0 tickets opened, 0 moved, 0 solved
- Castor-RAL cms/atlas/lhcb downtime from 9.30AM to 1.00PM due to outstanding Oracle security patches and kernel upgrades
- The RAL-LCG2 FTS, lcgfts.gridpp.rl.ac.uk, will be unavailable between 08:30 and 11:30 UTC on October 23 for critical Oracle and kernel updates. The service will be drained of Active transfers from 07:30 UTC
- Due to a critical Oracle patch updates, the following services hosted by IN2P3-CC will be considered "At Risk" between 9:00 and 13:00 CEST (7:00-11:00 UTC): - Biomed central LFC : cclcglfcli02.in2p3.fr (alias lfc-biomed.in2p3.fr) - ATLAS local LFC : lfc-atlas.in2p3.fr - FTS servers : cclcgftsprod.in2p3.fr and cclcgftsprod01.in2p3.fr
- Transfer ranging from 150 to 570 Mb/s, averaging around 240MB/s.
- The most active sites are INFN-T1 and FNAL
- Mostly traffic from CMS
22 October 2007
- 0 tickets opened, 1 moved, 1 solved
- Transfer ranging from 200 to 1700 Mb/s, averaging around 800MB/s.
- The most active sites are INFN-T1, RAL and PIC
- Mostly traffic from Atlas and less from CMS
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
SOLVED
vo: cms
[356] DESTINATION during PREPARATION phase: [REQUEST_FAILURE] [SRM_FAILURE] [SrmPut] failed: SOAP-ENV:Client - initFileStatuses: cannot create path [path]Permission denied;
21 October 2007
- Transfer ranging from 800 to 1550 Mb/s, averaging around 1250 MB/s.
- The most active sites are RAL, FZK and BNL
- Mostly traffic from Atlas and less from CMS
20 October 2007
- Transfer ranging from 1000 to 1580 Mb/s, averaging around 1250 MB/s.
- The most active sites are IN2P3, BNL, FZK, INFN, RAL and TRIUMF
- Mostly traffic from Atlas and less from CMS
19 October 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 1210 to 1510 Mb/s, averaging around 1400 MB/s.
- The most active sites are IN2P3, BNL, FZK, INFN, RAL and TRIUMF
- Mostly traffic from Atlas and less from CMS
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
vo: cms
[356] DESTINATION during PREPARATION phase: [REQUEST_FAILURE] [SRM_FAILURE] [SrmPut] failed: SOAP-ENV:Client - initFileStatuses: cannot create path [path]Permission denied;
18 October 2007
- 0 tickets opened, 1 moved, 0 solved
- dCache problem @ TRIUMF-LCG2
- Transfer ranging from 850 to 1520 Mb/s, averaging around 1150 MB/s.
- The most active sites are IN2P3, BNL, FZK, INFN, RAL and TRIUMF
- Mostly traffic from Atlas and less from CMS
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.18)
17 October 2007
- 0 tickets opened, 1 moved, 0 solved
- INFN: an intervention on tape servers; access to tape via Castor will not be operational at CNAF-INFN on October 17 from 11:00 to 13:00 (Rome time), from 08:00 to 11:00 (UTC)
- Transfer ranging from 820 to 1120 Mb/s, averaging around 950 MB/s.
- The most active sites are IN2P3, BNL and INFN
- Mostly traffic from CMS and Atlas
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.18)
16 October 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 400 to 1600 Mb/s, averaging around 900 MB/s.
- The most active sites are FNAL, FZK, IN2P3, INFN-T1 and RAL
- Mostly traffic from CMS and Atlas
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.17)
15 October 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 120 to 710 Mb/s, averaging around 400 MB/s.
- The most active sites are INFN-T1, IN2P3, FNAL and FZK
- Mostly traffic from CMS and less from Atlas
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.16)
14 October 2007
- Transfer ranging from 120 to 1020 Mb/s, averaging around 540 MB/s.
- The most active sites are INFN-T1, RAL, IN2P3
- Mostly traffic from CMS and less from Atlas
13 October 2007
- Transfer ranging from 170 to 740 Mb/s, averaging around 350 MB/s.
- The most active sites are INFN-T1, RAL, IN2P3 and NDGF
- Mostly traffic from CMS and Atlas
12 October 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 400 to 800 Mb/s, averaging around 580 MB/s.
- The most active sites are ASGC, FNAL, INFN-T1, PIC and RAL
- Mostly traffic from CMS and less from Atlas
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.15)
11 October 2007
- 0 tickets opened, 2 moved, 1 solved
- TRIUMF FTS database outage for 1 hour on Thursday October 11 at 2pm (Vancouver Time) or 9pm (UTC).
- INFN an intervention on CASTOR stager database backend is needed in order to fix some performance issues. Therefore CASTOR will be down at CNAF-INFN on October 11, from 9 to 12:30.
- BNL 11.10.2007 15:00 - 11.10.2007 18:00 Affected databases - 3D Oracle cluster: orcl.bnl.gov ATLAS Conditions database.
- Transfer ranging from 220 to 840 Mb/s, averaging around 480 MB/s.
- The most active sites are RAL and INFN
- Mostly traffic from CMS and less from Atlas
27815
- (2007.10.10) FTS transfer - SRM problem on site NDGF-T1
SOLVED
vo: atlas
TRANSFER during TRANSFER phase: [GRIDFTP] the server sent an error response: 451 451 Non-null return code from [>hpc2n_umu_se_501@miffo_hpc2n_umu_seDomain:*@miffo_hpc2n_umu_seDomain:*@dCacheDomain:SrmSpaceManager@srm-srm2Domain:*@srm-srm2Domain:*@dCacheDomain] with error No space left on device
Reason: ATLAS stress tests regularly fill this srm2.2 test instance up. We can expect it to fill up again, but I'm working on a tmpreaper setup to remove all files older than a couple of hours so that the tests can continue. tmpreaper is now installed and the problem is hopefully fixed (a rough sketch of this kind of cleanup follows at the end of this entry).
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.12)
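The cleanup mentioned in ticket 27815 above amounts to periodically removing test files older than a couple of hours so the ATLAS stress tests can keep writing; NDGF implemented it with tmpreaper. A rough Python equivalent is sketched below purely as an illustration; the path and age cut are made-up placeholders, not NDGF's actual configuration.

    #!/usr/bin/env python
    # Illustrative sketch of a tmpreaper-like sweep: delete files under a test
    # area whose modification time is older than a given number of hours.
    import os
    import time

    def reap(top, max_age_hours=2):
        cutoff = time.time() - max_age_hours * 3600
        removed = 0
        for dirpath, dirnames, filenames in os.walk(top):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    if os.path.getmtime(path) < cutoff:
                        os.remove(path)
                        removed += 1
                except OSError:
                    pass  # file vanished or cannot be removed; skip it
        return removed

    if __name__ == "__main__":
        print("removed %d files" % reap("/pool/atlas-srm22-tests", max_age_hours=2))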
10 October 2007
- 1 tickets opened, 1 moved, 0 solved
- CERN-PROD The CASTORATLAS MSS at CERN will be upgraded to the latest Castor software version. The intervention will start at 09:00 CEST, and should last no more than 4 hours.
- Transfer ranging from 170 to 600 Mb/s, averaging around 300 MB/s.
- The most active sites are FNAL and INFN
- Mostly traffic from CMS
27815
- (2007.10.10) FTS transfer - SRM problem on site NDGF-T1 (moved to 2007.10.11)
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1 (moved to 2007.10.11)
9 October 2007
- 0 tickets opened, 1 moved, 0 solved
- INFN the INFN-ROMA1 site will be down tomorrow 9 Oct 2007, from 9:00 to 20:00, due to an urgent intervention on the air cooling system
- Transfer ranging from 100 to 810 Mb/s, averaging around 500 MB/s.
- The most active sites are FNAL and RAL
- Mostly traffic from CMS and less from Atlas
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.10)
8 October 2007
- 0 tickets opened, 1 moved, 0 solved
- BNL: USATLAS 3D Conditions Database transparent intervention from 14:00 to 17:00 UTC on Oct/08/2007
- RAL: RAL and Fermilab will commence routing SRM/SE traffic over the OPN on Monday 8th October at 14:00 UTC.
- Transfer ranging from 380 to 1080 Mb/s, averaging around 650 MB/s.
- The most active sites are FNAL and RAL
- Mostly traffic from CMS and less from Atlas
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.09)
7 October 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 300 to 740 Mb/s, averaging around 490 MB/s.
- The most active sites are FNAL, RAL, INFN and PIC
- Mostly traffic from CMS and less from Atlas
6 October 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 270 to 530 Mb/s, averaging around 360 MB/s.
- The most active sites are INFN-T1, FNAL, RAL and IN2P3
- Mostly traffic from CMS and less from Atlas
5 October 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 170 to 640 Mb/s, averaging around 370 MB/s.
- The most active sites are INFN-T1, FNAL, RAL and IN2P3
- Mostly traffic from CMS and less from Atlas
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
vo: cms
[356] DESTINATION during PREPARATION phase: [REQUEST_FAILURE] [SRM_FAILURE] [SrmPut] failed: SOAP-ENV:Client - initFileStatuses: cannot create path [path]Permission denied;
4 October 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 190 to 630 Mb/s, averaging around 370 MB/s.
- The most active sites are INFN-T1, FNAL, RAL and IN2P3
- Mostly traffic from CMS and less from Atlas
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.05)
3 October 2007
- 0 tickets opened, 3 moved, 2 solved
- Transfer ranging from 50 to 430 Mb/s, averaging around 210MB/s.
- The most active sites are FNAL, RAL and ASCC
- Mostly traffic from CMS
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.04)
27477
- (2007.10.02) Transfer problems between sites TAIWAN-LCG2 and CERN-PROD
SOLVED
vo: atlas, cms
[107] the server sent an error response: 451 451 rfio read failure: Connection closed by remote end
Reason: the transfers might have been affected by the massive SRM requests to the castor2 SRM yesterday, but CMS transfers ramped up to close to 1k starting from 19:00 UTC and should now be back to normal after fixing the FTS transfer error and also extending the timeout limit to 7200 s for some of the channels that were failing with long elapsed times on CMS LoadTest data files. So far, from CMS PhEDEx, we have 1.1 MB/s effective throughput.
27481
- (2007.10.02) Transfer problems between sites FZK-LCG2 and CERN-PROD
SOLVED
vo: atlas, cms
[90] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] call. Error is RequestFileStatus#-[] failed with error:[ [date] state Failed : file not found : path [path] not found
Reason: I will investigate the problem (together with the GridKa admins) and remove those replicas from the CMS bookkeeping system so that the transfer attempts against those files stop.
2 October 2007
- 3 tickets opened, 0 moved, 0 solved
- Unscheduled downtime for IN2P3-CC sites due to a power outage; the IN2P3-CC batch system will be off until tomorrow noon
- Transfer ranging from 90 to 280 Mb/s, averaging around 180MB/s.
- The most active sites are FNAL and INFN-T1
- Mostly traffic from CMS
27508
- (2007.10.02) "Permission denied" problem on site INFN-T1
(
moved to 2007.10.03)
27477
- (2007.10.02) Transfer problems between sites TAIWAN-LCG2 and CERN-PROD (moved to 2007.10.03)
27481
- (2007.10.02) Transfer problems between sites FZK-LCG2 and CERN-PROD (moved to 2007.10.03)
1 October 2007
- 0 tickets opened, 0 moved, 0 solved
- srm.ndgf.org unscheduled downtime
- Transfer ranging from 110 to 300 Mb/s, averaging around 170MB/s.
- The most active sites are FNAL, INFN-T1 and FZK
- Mostly traffic from CMS
30 September 2007
- Transfer ranging from 110 to 290 Mb/s, averaging around 160MB/s.
- The most active sites are ASCC, FNAL, FZK, IN2P3, INFN-T1 and PIC
- Mostly traffic from CMS
29 September 2007
- Transfer ranging from 70 to 290 Mb/s, averaging around 170MB/s.
- The most active sites are FNAL and INFN-T1
- Mostly traffic from CMS
28 September 2007
- 0 tickets opened, 1 moved, 1 solved
- Transfer ranging from 70 to 220 Mb/s, averaging around 160MB/s.
- The most active sites are ASCC, FNAL, FZK, LIPLisbon and PIC
- Mostly traffic from CMS and less from LHCb
27344
- (2007.09.27) Transfer problems between sites FZK-LCG2 and CERN-PROD
SOLVED
vo: atlas, cms
[8] the server sent an error response: 425 425 Can't open data connection. timed out() failed;
[90] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] call. RequestFileStatus#-2143772586 failed with error:[ at Thu Sep 27 11:52:32 CEST 2007 state Failed : file not found : path /pnfs/gridka.de/atlas/disk-only/dq2/misal1_mc12_V1/RDO/misal1_mc12_V1.005001.pythia_minbias.digit.RDO.v12000605_tid010684/RDO.010684._78667.pool.root.1 not found;
[304] empty file size returned;
Reason: You tried to read files which don't exist in our dCache:
ls: /pnfs/gridka.de/cms/Prod/store/PhEDEx_LoadTest07_FZK-TAPE3/LoadTest07_FZK_B1: No such file or directory
ls: /pnfs/gridka.de/cms/Prod/store/PhEDEx_LoadTest07_FZK-TAPE3/LoadTest07_FZK_08: No such file or directory
ls: /pnfs/gridka.de/atlas/disk-only/dq2/misal1_mc12_V1/RDO/misal1_mc12_V1.005001.pythia_minbias.digit.RDO.v12000605_tid010684/RDO.010684._78667.pool.root.1: No such file or directory
27 September 2007
- 1 tickets opened, 0 moved, 0 solved
- Network outage at SARA; after the network outage there still seem to be internal networking problems which affect at least our storage nodes. I'll switch off the FTS channels.
- Transfer ranging from 80 to 270 Mb/s, averaging around 150MB/s.
- The most active sites are FZK, ASCC and RAL
- Mostly traffic from CMS
27344
- (2007.09.27) Transfer problems between sites FZK-LCG2 and CERN-PROD (moved to 2007.09.28)
26 September 2007
- 0 tickets opened, 0 moved, 0 solved
- CERN-PROD (ALICE) On Wednesday, September 26 2007, the CASTORALICE MSS at CERN will be upgraded to the latest Castor software version. The intervention will start at 09:00 CEST, and should last no more than 4 hours. During the time of the intervention no new requests will be handled.
- Transfer ranging from 150 to 310 Mb/s, averaging around 170MB/s.
- The most active sites are FNAL, FZK, IN2P3 and PIC
- Mostly traffic from CMS and less from LHCb
25 September 2007
- 0 tickets opened, 0 moved, 0 solved
- RAL Due to the kernel upgrade of the LUGH & OGMA database machines, these databases should be considered "at risk" Between 9:00 and 11:00 BST Tuesday 25th Sept. Services at risk are: "lugh" (LCG 3D Tier-1 for LHCb), "ogma" (LCG 3D Tier-1 for ATLAS)
- RAL The RAL-LCG2 FTS service will be interrupted between 9am and 10am UK time Tuesday September 25th to enable a kernel update for the backend database machine. Transfers will begin draining out at 8am UK time.
- Transfer ranging from 130 to 230 Mb/s, averaging around 160MB/s.
- The most active sites are ASCC, PIC and FNAL
- Mostly traffic from CMS
24 September 2007
- 0 tickets opened, 1 moved, 1 solved
- IN2P3-CC FTS downtime for upgrade; between 06:00 UTC and 16:00 UTC, the upgrade to FTS 2.0 will be done;
- Transfer ranging from 70 to 270 Mb/s, averaging around 130MB/s.
- The most active sites are ASCC, PIC and FNAL
- Mostly traffic from CMS
27074
- (2007.09.20) SRM problem on site INFN-T1
SOLVED
vo: atlas, alice, cms, lhcb
[163] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2488 Unknown error [number] (errno=111, serrno=0)
23 September 2007
- ASGC: database shutdown from 00:00 to 06:00 UTC
- ASGC: scheduled power maintenance starting at 00:00 (UTC) and ending at 05:00 (UTC); all grid services will remain unreachable during the maintenance window
- Transfer ranging from 40 to 250 Mb/s, averaging around 90MB/s.
- The most active site is ASCC, PIC and FNAL
- Mostly traffic from CMS
22 September 2007
- Transfer ranging from 25 to 130 Mb/s, averaging around 80MB/s.
- The most active site is PIC and FNAL
- Mostly traffic from CMS
21 September 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 45 to 195 Mb/s, averaging around 90MB/s.
- The most active site is FZK, PIC and FNAL
- Mostly traffic from CMS
27074
- (2007.09.20) SRM problem on site INFN-T1
vo: atlas, alice, cms, lhcb
[163] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2488 Unknown error [number] (errno=111, serrno=0)
20 September 2007
- 1 tickets opened, 0 moved, 0 solved
- ASCC: because CMS CSA07 is coming soon and castor 2.1.4-5 has a lot of important bug fixes, we plan to upgrade our castor2 from 2.1.3-17 to 2.1.4-5. All castor2 services will be down from 01:00 AM (UTC) to 09:00 AM (UTC).
- RAL: On Thursday the 20th of September 2007 between 12.00AM and 1.00AM CEST, the configuration of the connection between CERN and RAL (the British Tier1 centre) will be modified.
- Transfer ranging from 20 to 90 Mb/s, averaging around 45MB/s.
- The most active site is FZK, PIC and FNAL
- Mostly traffic from CMS
27074
- (2007.09.20) SRM problem on site INFN-T1 (moved to 2007.09.21)
19 September 2007
- 0 tickets opened, 0 moved, 0 solved
- Scheduled downtime of CASTOR instance at INFN-T1 CNAF; Downtime will begin at: 07:00h; A new release of the Castor software will be installed, fixing several operations problems.
- Transfer ranging from 19 to 115 Mb/s, averaging around 45MB/s.
- The most active site is FZK, IN2P3 and FNAL
- Mostly traffic from CMS
18 September 2007
- 0 tickets opened, 0 moved, 0 solved
- Due to the migration of the WN OS, TW-FTT will have scheduled maintenance this Tuesday, starting from Sep 18 1AM (UTC); the site service reconfiguration is expected to be done by Sep 18 9AM.
- IN2P3: due to network maintenance at IN2P3, the CIC will probably be unreachable this afternoon from 12:00 until 13:00 (UTC).
- Transfer ranging from 20 to 110 Mb/s, averaging around 50MB/s.
- The most active site is PIC and FNAL
- Mostly traffic from CMS
17 September 2007
- 0 tickets opened, 0 moved, 0 solved
- TRIUMF There will be power work affecting the main machine room on 15th and 16th September. The CEs and part of the storage will be unavailable. The FTS, LFC, RAC will remain on. On Monday 17th we'll move these other grid services to the new machine room, and they will be unavailable from 09:00 - 14:00.
- Transfer ranging from 20 to 75 Mb/s, averaging around 35MB/s.
- The most active site is PIC and INFN
- Mostly traffic from Atlas and CMS
16 September 2007
- Transfer ranging from 10 to 25 Mb/s, averaging around 20MB/s.
- The most active site is PIC, FZK and INFN
- Mostly traffic from Atlas and CMS
15 September 2007
- TRIUMF maintenance 15-17th Sep. There will be power work affecting the main machine room on 15th and 16th September. The CEs and part of the storage will be unavailable. The FTS, LFC, RAC will remain on. On Monday 17th we'll move these other grid services to the new machine room, and they will be unavailable from 09:00 - 14:00.
- Transfer ranging from 10 to 120 Mb/s, averaging around 40MB/s.
- The most active site is PIC and INFN
- Mostly traffic from CMS and less from Atlas
14 September 2007
- 0 tickets opened, 0 moved, 0 solved
- BNL dCache backend tape storage system, HPSS, will be shutdown from Sep 09 to Sep 14 for upgrade. dCache will not be available during that period. We will shutdown dCache on 9am Sep 09 and bring it back on 6pm Sep 14. Sorry for the inconvenience caused. Please plan your services accordingly
- Transfer ranging from 80 to 190 Mb/s, averaging around 120MB/s.
- The most active site is FNAL and FZK
- Mostly traffic from CMS
13 September 2007
- 0 tickets opened, 0 moved, 0 solved
- BNL dCache backend tape storage system, HPSS, will be shutdown from Sep 09 to Sep 14 for upgrade. dCache will not be available during that period. We will shutdown dCache on 9am Sep 09 and bring it back on 6pm Sep 14. Sorry for the inconvenience caused. Please plan your services accordingly
- Transfer ranging from 70 to 270 Mb/s, averaging around 120MB/s.
- The most active site is ASCC and INFN
- Mostly traffic from CMS
12 September 2007
- 0 tickets opened, 1 moved, 1 solved
- BNL dCache backend tape storage system, HPSS, will be shutdown from Sep 09 to Sep 14 for upgrade. dCache will not be available during that period. We will shutdown dCache on 9am Sep 09 and bring it back on 6pm Sep 14. Sorry for the inconvenience caused. Please plan your services accordingly
- Transfer ranging from 40 to 180 Mb/s, averaging around 70MB/s.
- The most active site is FNAL, FZK and PIC
- Mostly traffic from CMS
26429
- (2007.08.30) SRM problem on site RAL-LCG2
SOLVED
vo: atlas, cms, lhcb
[306] failed to prepare Destination file in 180 seconds;
[21] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
11 September 2007
- 0 tickets opened, 1 moved, 0 solved
- IN2P3 The FTS server at IN2P3-CC is currently unavailable. They are working on fixing this as soon as possible.
- ASCC: due to the relocation of the ASGC IBM TS3500, we have to take the tape library offline from the castor tape pool, and the tape servers will be relocated into the new machine room within the same maintenance event. The service will be offline starting from 9:30am (CST) and is expected to be complete before 4pm (CST) next Tue. (9/11)
- BNL dCache backend tape storage system, HPSS, will be shutdown from Sep 09 to Sep 14 for upgrade. dCache will not be available during that period. We will shutdown dCache on 9am Sep 09 and bring it back on 6pm Sep 14. Sorry for the inconvenience caused. Please plan your services accordingly
- Transfer ranging from 45 to 160 Mb/s, averaging around 80MB/s.
- The most active site is FNAL, ASGC and DESY
- Mostly traffic from CMS
26429
- (2007.08.30) SRM problem on site RAL-LCG2 (moved to 2007.09.12)
10 September 2007
- ASCC: due to the relocation of the ASGC IBM TS3500, we have to take the tape library offline from the castor tape pool, and the tape servers will be relocated into the new machine room within the same maintenance event. The service will be offline starting from 9:30am (CST) and is expected to be complete before 4pm (CST) next Tue. (9/11)
- BNL dCache backend tape storage system, HPSS, will be shutdown from Sep 09 to Sep 14 for upgrade. dCache will not be available during that period. We will shutdown dCache on 9am Sep 09 and bring it back on 6pm Sep 14. Sorry for the inconvenience caused. Please plan your services accordingly
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 40 to 150 Mb/s, averaging around 80MB/s.
- The most active site is ASGC, INFN-T1 and FNAL
- Mostly traffic from CMS
26429
- (2007.08.30) SRM problem on site RAL-LCG2 (moved to 2007.09.11)
9 September 2007
- Transfer ranging from 50 to 280 Mb/s, averaging around 100MB/s.
- The most active site is FNAL, INFN-T1 and RAL
- Mostly traffic from CMS and less from Atlas and Biomed
8 September 2007
- Transfer ranging from 60 to 330 Mb/s, averaging around 180MB/s.
- The most active site is ASGC, INFN-T1, FZK and RAL
- Mostly traffic from Atlas and CMS and less from Biomed
7 September 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 100 to 420 Mb/s, averaging around 250MB/s.
- The most active site is ASGC, IN2P3, INFN-T1 and RAL
- Mostly traffic from Atlas and CMS
26429
- (2007.08.30) SRM problem on site RAL-LCG2 (moved to 2007.09.10)
6 September 2007
- 0 tickets opened, 2 moved, 1 solved
- Transfer ranging from 110 to 260 Mb/s, averaging around 140MB/s.
- The most active site is ASGC, FNAL, FZK and RAL
- Mostly traffic from Atlas and CMS and less from Biomed
26429
- (2007.08.30) SRM problem on site RAL-LCG2 (moved to 2007.09.07)
26379
- (2007.08.29) SRM problem on site INFN-T1
SOLVED
vo: atlas, cms, lhcb
[309] SOURCE during PREPARATION phase: [CONNECTION] failed to contact on remote SRM [srm]. Givin' up after 3 tries
5 September 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 60 to 230 Mb/s, averaging around 100MB/s.
- The most active site is ASGC, RAL and FNAL
- Mostly traffic from Atlas and CMS
26429
- (2007.08.30) SRM problem on site RAL-LCG2 (moved to 2007.09.06)
26379
- (2007.08.29) SRM problem on site INFN-T1 (moved to 2007.09.06)
4 September 2007
- 0 tickets opened, 2 moved, 0 solved
- INFN-T1 CNAF down. The expected duration of the downtime is one day, starting tomorrow morning, 4 September.
- IN2P3-LAPP: the SE will be offline for an update between 12:00 and 15:00 UTC on 4th September. This will affect access to the SE lapp-se01.in2p3.fr.
- Transfer ranging from 80 to 400 Mb/s, averaging around 150MB/s.
- The most active site is ASGC, BNL, PIC, FNAL and TRIUMF
- Mostly traffic from Atlas and less from CMS and Biomed
26429
- (2007.08.30) SRM problem on site RAL-LCG2 (moved to 2007.09.05)
26379
- (2007.08.29) SRM problem on site INFN-T1 (moved to 2007.09.05)
3 September 2007
- 0 tickets opened, 2 moved, 0 solved
- RAL-LCG2 CASTOR LHCb Instance Downtime on September 3 13:00-15:00 UTC
- Transfer ranging from 100 to 640 Mb/s, averaging around 340MB/s.
- The most active site is ASGC, BNL, FNAL, FZK
- Mostly traffic from Atlas and less from CMS and Biomed
26429
- (2007.08.30) SRM problem on site RAL-LCG2 (moved to 2007.09.04)
26379
- (2007.08.29) SRM problem on site INFN-T1 (moved to 2007.09.04)
2 September 2007
- 0 tickets opened, 3 moved, 1 solved
- Transfer ranging from 140 to 670 Mb/s, averaging around 370MB/s.
- The most active site is INFN, IN2P3 and BNL
- Mostly traffic from Atlas, CMS and Biomed
1 September 2007
- 0 tickets opened, 3 moved, 1 solved
- Transfer ranging from 160 to 700 Mb/s, averaging around 370MB/s.
- The most active site is BNL, FZK, PIC and IN2P3
- Mostly traffic from Atlas and CMS
31 August 2007
- 0 tickets opened, 3 moved, 1 solved
- The NDGF-T1 FTS service will be out of production due to an upgrade. The downtime will start 07:00 UTC (09:00 CET) on August 30, possibly stretching as far as 14:00 UTC (16:00 CET) on August 31
- Transfer ranging from 130 to 530 Mb/s, averaging around 280MB/s.
- The most active site is BNL, FNAL, FZK, NDGF and PIC
- Mostly traffic from Atlas and CMS
26430
- (2007.08.30) FTS transfer - Transfer problems between sites CERN-PROD and IN2P3-CC
SOLVED
vo: alice, atlas, cms, lhcb
[358] TRANSFER during TRANSFER phase: [GRIDFTP] an invalid value for url was used
Reason: Authentication configuration problem in "ccsrm.in2p3.fr" fixed.
26429
- (2007.08.30) SRM problem on site RAL-LCG2
vo: atlas, cms, lhcb
[306] failed to prepare Destination file in 180 seconds;
[21] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
26379
- (2007.08.29) SRM problem on site INFN-T1
vo: atlas, cms, lhcb
[309] SOURCE during PREPARATION phase: [CONNECTION] failed to contact on remote SRM [srm]. Givin' up after 3 tries
30 August 2007
- 2 tickets opened, 1 moved, 0 solved
- The NDGF-T1 FTS service will be out of production due to an upgrade. The downtime will start 07:00 UTC (09:00 CET) on August 30, possibly stretching as far as 14:00 UTC (16:00 CET) on August 31
- Transfer ranging from 160 to 550 Mb/s, averaging around 300MB/s.
- The most active site is INFN, FZK, IN2P3, ASCC and SARA
- Mostly traffic from Atlas and CMS
26430
- (2007.08.30) FTS transfer - Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.31)
26429
- (2007.08.30) SRM problem on site RAL-LCG2 (moved to 2007.08.31)
26379
- (2007.08.29) SRM problem on site INFN-T1 (moved to 2007.08.31)
29 August 2007
- 1 tickets opened, 2 moved, 2 solved
- Transfer ranging from 20 to 580 Mb/s, averaging around 250MB/s.
- The most active site is TRIUMF, FZK, IN2P3, NDGF and SARA
- Mostly traffic from Atlas and less CMS
26379
- (2007.08.29) SRM problem on site INFN-T1 (moved to 2007.08.30)
25687
- (2007.08.10) SRM problem on site INFN-T1
SOLVED
vo: atlas, cms, lhcb
[306] failed to prepare Destination file in 180 seconds;
--------
[321] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] destination file failed on the SRM with error [SRM_FAILURE];
--------
[347] DESTINATION during PREPARATION phase: [CONNECTION] service timeout during [SrmPut];
--------
[21] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0);
24857
- (2007.07.16) Transfer problems between sites CERN-PROD and IN2P3-CC
SOLVED
vo: cms, alice, atlas
[12] 451 Local resource failure: malloc: Cannot allocate memory
--------
[52] a system call failed (Connection refused)
28 August 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 40 to 210 Mb/s, averaging around 120MB/s.
- The most active site is TRIUMF, BNL and SARA
- Mostly traffic from CMS and Atlas
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.29)
24857
- (2007.07.16) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.29)
27 August 2007
- 0 tickets opened, 2 moved, 0 solved
- INFN-CNAF will be down from 2007-08-27 0900 UTC to 2007-08-28 1400 UTC.
- TRIUMF FTS will be unavailable on Monday 27th Aug 9-17PDT. This is for the 2.0 upgrade.
- Transfer ranging from 20 to 230 Mb/s, averaging around 90MB/s.
- The most active site is TRIUMF, RAL and IN2P3
- Mostly traffic from Atlas and less from CMS
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.28)
24857
- (2007.07.16) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.28)
26 August 2007
- CNAF will be down from 2007-08-27 0900 UTC to 2007-08-28 1400 UTC
- Transfer ranging from 20 to 90 Mb/s, averaging around 40MB/s.
- The most active site is ASGC, BNL, FZK and IN2P3
- Mostly traffic from CMS and Atlas
25 August 2007
- Transfer ranging from 30 to 200 Mb/s, averaging around 60MB/s.
- The most active site is ASGC, BNL and IN2P3
- Mostly traffic from CMS and Atlas
24 August 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 10 to 270 Mb/s, averaging around 70MB/s.
- The most active site is ASGC, BNL, CNAF, FZK, IN2P3 and PIC
- Mostly traffic from CMS and Atlas
25687
- (2007.08.10) SRM problem on site INFN-T1
vo: atlas, cms, lhcb
[306] failed to prepare Destination file in 180 seconds;
--------
[321] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] destination file failed on the SRM with error [SRM_FAILURE];
--------
[347] DESTINATION during PREPARATION phase: [CONNECTION] service timeout during [SrmPut];
--------
[21] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0);
24857
- (2007.07.16) Transfer problems between sites CERN-PROD and IN2P3-CC
vo: cms, alice, atlas
[12] 451 Local resource failure: malloc: Cannot allocate memory
--------
[52] a system call failed (Connection refused)
23 August 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 10 to 150 Mb/s, averaging around 30MB/s.
- The most active site is CNAF, ASGC and PIC
- Mostly traffic from CMS
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.24)
24857
- (2007.07.16) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.24)
22 August 2007
- 0 tickets opened, 3 moved, 1 solved
- Transfer ranging from 10 to 70 Mb/s, averaging around 40MB/s.
- The most active site is FZK, IN2P3 and PIC
- Mostly traffic from CMS
26102
- (2007.08.21) SRM problem on site IN2P3-CC
SOLVED
vo: cms
[23] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] RequestFileStatus#[id] failed with error:[ [DATE] state Failed : GetStorageInfoFailed : file exists, cannot write
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.23)
24857
- (2007.07.16) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.23)
21 August 2007
- 1 tickets opened, 2 moved, 0 solved
- SRM/dCache downtime at IN2P3. The SRM/dCache system will be down at IN2P3-CC for software upgrade on Tuesday 21st of August from 9:00 to 15:00 UTC
- Transfer ranging from 10 to 100 Mb/s, averaging around 40MB/s.
- The most active site is BNL, FNAL, IN2P3 and PIC
- Mostly traffic from CMS and Atlas
26102
- (2007.08.21) SRM problem on site IN2P3-CC (moved to 2007.08.22)
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.22)
24857
- (2007.07.16) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.22)
20 August 2007
- 0 tickets opened, 3 moved, 1 solved
- The dCache systems at RAL-LCG2 will be offline for an update between 9am and 1pm UTC.
- Transfer ranging from 20 to 270 Mb/s, averaging around 110MB/s.
- The most active site is PIC, IN2P3, FNAL and CNAF
- Mostly traffic from CMS
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.21)
25141
- (2007.07.25) SRM problem on site RAL-LCG
SOLVED
vo: atlas
[321] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] destination file failed on the SRM with error [SRM_FAILURE]
24857
- (2007.07.16) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.21)
19 August 2007
- Transfer ranging from 40 to 230 Mb/s, averaging around 90MB/s.
- The most active site is PIC, IN2P3 and FNAL
- Mostly traffic from CMS
18 August 2007
- Transfer ranging from 50 to 220 Mb/s, averaging around 90MB/s.
- The most active site is PIC and FNAL
- Mostly traffic from CMS and less from Atlas
17 August 2007
- 0 tickets opened, 3 moved, 0 solved
- Transfer ranging from 10 to 320 Mb/s, averaging around 140MB/s.
- The most active site is PIC, IN2P3 and BNL
- Mostly traffic from CMS and Atlas
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.20)
25141
- (2007.07.25) SRM problem on site RAL-LCG (moved to 2007.08.20)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.20)
16 August 2007
- 0 tickets opened, 3 moved, 0 solved
- BNL dCache downtime for upgrade on Aug. 16th: the US Atlas SRM/dCache system will be down at BNL for a software upgrade on Thursday 16th of August from 13:00 to 16:00 UTC
- Scheduled downtime of CASTORLHCB instance at CERN from 07:00h, Thursday 16th August
- Transfer ranging from 60 to 680 Mb/s, averaging around 300MB/s.
- The most active site is RAL, IN2P3 and FNAL
- Mostly traffic from CMS and less from Atlas
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.17)
25141
- (2007.07.25) SRM problem on site RAL-LCG (moved to 2007.08.17)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.17)
15 August 2007
- 0 tickets opened, 3 moved, 0 solved
- CERN-PROD: upgrade of the CASTORCMS instance. Downtime will begin at 07:00h (UTC) and end at 09:00h (UTC)
- Two maintenance interventions in the LHCOPN, the first between 10.00AM and 11.00AM CET, the second between 16:00 and 17:00 CET. The intervention will concern the routing to ASGC
- Transfer ranging from 20 to 520 Mb/s, averaging around 280MB/s.
- The most active site is BNL and FNAL
- Mostly traffic from CMS and less from Atlas
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.16)
25141
- (2007.07.25) SRM problem on site RAL-LCG (moved to 2007.08.16)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.16)
14 August 2007
- 0 tickets opened, 4 moved, 1 solved
- Scheduled interruption to the CERN to RAL OPN circuit for 10 minutes in the period 06:00 to 06:30 GMT Thursday 14th August. This is due to a router software upgrade and reboot. CERN-RAL and RAL-CERN Castor and dCache transfers will be affected.
- Short network interruptions at CERN between 06:00-06:30
- Transfer ranging from 100 to 520 Mb/s, averaging around 260MB/s.
- The most active site is RAL and FNAL
- Mostly traffic from CMS and less from Atlas
25687
- (2007.08.10) SRM problem on site INFN-T1 (moved to 2007.08.15)
25639
- (2007.08.09) SRM problem on site IN2P3-CC
SOLVED
vo: cms
[23] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] RequestFileStatus#[id] failed with error:[ [DATE] state Failed : GetStorageInfoFailed : file exists, cannot write
25141
- (2007.07.25) SRM problem on site RAL-LCG (moved to 2007.08.15)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.15)
13 August 2007
- 0 tickets opened, 4 moved, 0 solved
- Transfer ranging from 150 to 600 Mb/s, averaging around 290MB/s.
- The most active site is RAL and FNAL
- Mostly traffic from CMS and less from Atlas
25687
- (2007.08.10) SRM problem on site INFN-T1
vo: atlas, cms, lhcb
[306] failed to prepare Destination file in 180 seconds;
--------
[321] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] destination file failed on the SRM with error [SRM_FAILURE];
--------
[347] DESTINATION during PREPARATION phase: [CONNECTION] service timeout during [SrmPut];
--------
[21] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0);
25639
- (2007.08.09) SRM problem on site IN2P3-CC
vo: cms
[23] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] RequestFileStatus#[id] failed with error:[ [DATE] state Failed : GetStorageInfoFailed : file exists, cannot write
25141
- (2007.07.25) SRM problem on site RAL-LCG
vo: atlas
[321] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] destination file failed on the SRM with error [SRM_FAILURE]
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and IN2P3-CC
vo: cms, alice, atlas
[12] 451 Local resource failure: malloc: Cannot allocate memory
--------
[52] a system call failed (Connection refused)
12 August 2007
- Transfer ranging from 100 to 450 Mb/s, averaging around 280MB/s.
- The most active site is RAL, BNL, FNAL and PIC
- Mostly traffic from CMS and less from Atlas
11 August 2007
- Transfer ranging from 150 to 400 Mb/s, averaging around 280MB/s.
- The most active site is RAL, BNL, FNAL and PIC
- Mostly traffic from CMS and less from Atlas
10 August 2007
- 1 tickets opened, 3 moved, 0 solved
- Transfer ranging from 130 to 360 Mb/s, averaging around 240MB/s.
- The most active site is RAL, BNL, FZK, FNAL and PIC
- Mostly traffic from CMS and less from Atlas
25687
- (2007.08.10) SRM problem on site INFN-T1
vo: atlas, cms, lhcb
[306] failed to prepare Destination file in 180 seconds;
--------
[321] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] destination file failed on the SRM with error [SRM_FAILURE];
--------
[347] DESTINATION during PREPARATION phase: [CONNECTION] service timeout during [SrmPut];
--------
[21] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0);
25639
- (2007.08.09) SRM problem on site IN2P3-CC
vo: cms
[23] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] RequestFileStatus#[id] failed with error:[ [DATE] state Failed : GetStorageInfoFailed : file exists, cannot write
25141
- (2007.07.25) SRM problem on site RAL-LCG
vo: atlas
[321] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] destination file failed on the SRM with error [SRM_FAILURE]
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and IN2P3-CC
vo: cms, alice, atlas
[12] 451 Local resource failure: malloc: Cannot allocate memory
--------
[52] a system call failed (Connection refused)
9 August 2007
- 1 tickets opened, 2 moved, 0 solved
- Transfer ranging from 140 to 480 Mb/s, averaging around 260MB/s.
- The most active site is RAL, IN2P3, FZK, FNAL and PIC
- Mostly traffic from CMS
25639
- (2007.08.09) SRM problem on site IN2P3-CC (moved to 2007.08.10)
25141
- (2007.07.25) SRM problem on site RAL-LCG (moved to 2007.08.10)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.10)
8 August 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 120 to 210 Mb/s, averaging around 150MB/s.
- The most active site is RAL, IN2P3 and PIC
- Mostly traffic from CMS, less from LHCb
25141
- (2007.07.25) SRM problem on site RAL-LCG (moved to 2007.08.09)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and IN2P3-CC (moved to 2007.08.09)
7 August 2007
- 0 tickets opened, 3 moved, 1 solved
- Transfer ranging from 150 to 520 Mb/s, averaging around 330MB/s.
- The most active site is FNAL and IN2P3
- Mostly traffic from CMS
25251
- (2007.07.30) FTS transfer - SRM problem on site INFN-T1
SOLVED
vo: atlas, alice, lhcb
[349] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
-----------------
[306] FINAL:DESTINATION: failed to prepare Destination file in 180 seconds
-----------------
[321] FINAL:DESTINATION: destination file failed on the SRM with error [SRM_FAILURE]
Reason: They had some connection problems with the DB, so Castor wasn't working well.
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.08.08)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.08.08)
6 August 2007
- 0 tickets opened, 3 moved, 0 solved
- CERN-PROD: from 9:00 CEST (7:00 UTC) to 12:00 CEST (10:00 UTC), many lcg-RB and glite-WMS+LB nodes at CERN are going to be restarted for a kernel upgrade.
- CERN-PROD: There will be long waiting times, as writing data to Castor will imply storing data on disks first and tape staging will be delayed.
- Transfer ranging from 20 to 590 Mb/s, averaging around 100MB/s.
- The most active sites are FNAL, RAL and IN2P3
- Mostly traffic from CMS
25251
- (2007.07.30) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.08.07)
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.08.07)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.08.07)
5 August 2007
- Transfer ranging from 30 to 400 Mb/s, averaging around 270MB/s.
- The most active sites are RAL, PIC, IN2P3, CNAF and FNAL
- Mostly traffic from CMS
4 August 2007
- Transfer ranging from 120 to 450 Mb/s, averaging around 250MB/s.
- The most active sites are RAL, PIC, IN2P3 and FNAL
- Mostly traffic from CMS
3 August 2007
- 0 tickets opened, 3 moved, 0 solved
- Transfer ranging from 190 to 580 Mb/s, averaging around 300MB/s.
- The most active sites are RAL and FNAL
- Mostly traffic from CMS
25251
- (2007.07.30) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.08.06)
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.08.06)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.08.06)
2 August 2007
- 0 tickets opened, 4 moved, 1 solved
- RAL: The RAL-LCG2 FTS will be offline on the 2nd August from 9am to 11am UTC for kernel updates and Oracle patching
- PIC: The picatlas and piclhcb databases will be stopped tomorrow for 2 hours for maintenance.
- Transfer ranging from 40 to 480 Mb/s, averaging around 250MB/s.
- The most active sites are RAL and CNAF
- Mostly traffic from CMS
25353
- (2007.08.01) FTS transfer - SRM problem on site SARA-MATRIX
SOLVED
vo: atlas, alice, lhcb
[350] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] RequestFileStatus[id] failed with error:[ [date] state Failed : surl is not local : [sirl]
25251
- (2007.07.30) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.08.03)
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.08.03)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.08.03)
1 August 2007
- 1 tickets opened, 3 moved, 0 solved
- Transfer ranging from 80 to 480 Mb/s, averaging around 250MB/s.
- The most active sites are BNL, PIC and IN2P3
- Mostly traffic from Atlas and less from CMS
25353
- (2007.08.01) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.08.02)
25251
- (2007.07.30) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.08.02)
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.08.02)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.08.02)
31 July 2007
- 0 tickets opened, 3 moved, 0 solved
- Transfer ranging from 260 to 900 Mb/s, averaging around 500MB/s.
- The most active sites are FNAL, CNAF, BNL and PIC
- Mostly traffic from Atlas and less from CMS
25251
- (2007.07.30) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.08.01)
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.08.01)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.08.01)
30 July 2007
- 1 ticket opened, 3 moved, 1 solved
- Transfer ranging from 600 to 900 Mb/s, averaging around 780MB/s.
- The most active sites are FNAL, CNAF, IN2P3, FZK and TRIUMF
- Mostly traffic from Atlas and less from CMS
25251
- (2007.07.30) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.07.31)
25219
- (2007.07.27) FTS transfer - Transfer problems between sites CERN-PROD and SARA-MATRIX
SOLVED
vo: atlas, alice, lhcb
[14] the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
[12] the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
Reason: A few dCache pool nodes ran out of space on the /var file system again. This causes gridftp doors to fail. When a gridftp door is idling, it clogs up its log file by logging a message stating that it is idling several times per second. This way the gridftp door log files grow huge and eventually fill up the file system completely.
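Note: this failure mode (a log partition slowly filling up on the pool nodes) can be caught early with a simple disk-usage check. The sketch below is only illustrative, with assumed paths and thresholds; it is not the tooling SARA actually used.
# Illustrative monitoring sketch: warn when the log partition is nearly full
# and list the largest log files (e.g. a runaway gridftp door log).
# LOG_DIR and WARN_LEVEL are assumptions, not SARA's real configuration.
import os
LOG_DIR = "/var/log"   # assumed location of the door log files
WARN_LEVEL = 0.90      # warn above 90% filesystem usage
st = os.statvfs(LOG_DIR)
usage = 1.0 - (st.f_bavail * st.f_frsize) / float(st.f_blocks * st.f_frsize)
print("%s filesystem usage: %.0f%%" % (LOG_DIR, usage * 100))
if usage > WARN_LEVEL:
    sizes = []
    for root, _dirs, files in os.walk(LOG_DIR):
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                pass  # file vanished or unreadable; skip it
    print("WARNING: partition nearly full; largest log files:")
    for size, path in sorted(sizes, reverse=True)[:5]:
        print("  %8.1f MB  %s" % (size / 1e6, path))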
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.07.31)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.07.31)
29 July 2007
- Transfer ranging from 360 to 820 Mb/s, averaging around 600MB/s.
- The most active sites are TRIUMF, IN2P3 and PIC
- Mostly traffic from Atlas and less from CMS
28 July 2007
- Transfer ranging from 400 to 900 Mb/s, averaging around 500MB/s.
- The most active sites are FNAL and TRIUMF
- Mostly traffic from Atlas and less from CMS
27 July 2007
- 1 ticket opened, 3 moved, 1 solved
- Transfer ranging from 500 to 780 Mb/s, averaging around 700MB/s.
- The most active sites are CNAF, IN2P3 and TRIUMF
- Mostly traffic from Atlas and less from CMS
25219
- (2007.07.27) FTS transfer - Transfer problems between sites CERN-PROD and SARA-MATRIX
vo: atlas, alice, lhcb
[14] the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
[12] the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
25141
- (2007.07.25) SRM problem on site
RAL-LCG
vo: atlas
[321] DESTINATION during PREPARATION phase: [GENERAL_FAILURE] destination file failed on the SRM with error [SRM_FAILURE]
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC
vo: cms, alice, atlas
[12] 451 Local resource failure: malloc: Cannot allocate memory
--------
[52] a system call failed (Connection refused)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
SOLVED
vo: atlas
[332] DESTINATION during FINALIZATION phase: [GENERAL_FAILURE] failed to complete PrepareToPut request [id] on remote SRM [srm]: [SRM_FAILURE] ]. The PrepareToPut request has been successfully aborted
Reason: They think it was due to large files exposing an inconsistency between physical and logical space on their pools. This is being fixed.
26 July 2007
- 0 tickets opened, 3 moved, 0 solved
- Transfer ranging from 520 to 960 Mb/s, averaging around 800MB/s.
- The most active sites are FNAL, CNAF, IN2P3, FZK and TRIUMF
- Mostly traffic from Atlas and less from CMS
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.07.27)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.07.27)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.27)
25 July 2007
- 1 tickets opened, 2 moved, 0 solved
- Transfer ranging from 700 to 970 Mb/s, averaging around 800MB/s.
- The most active sites are IN2P3, CNAF and TRIUMF
- Mostly traffic from CMS and Atlas
25141
- (2007.07.25) SRM problem on site
RAL-LCG
(
moved to 2007.07.26)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.07.26)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.26)
24 July 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 400 to 1060 Mb/s, averaging around 150MB/s.
- The most active sites are CNAF, IN2P3, SARA, FNAL and PIC
- Mostly traffic from CMS and Atlas
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.07.25)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.25)
23 July 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 10 to 600 Mb/s, averaging around 150MB/s.
- The most active sites are CNAF, IN2P3 and SARA
- Mostly traffic from CMS and Atlas
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.07.24)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.24)
22 July 2007
- Transfer ranging from 50 to 490 Mb/s, averaging around 260MB/s.
- The most active sites are CNAF, FNAL and IN2P3
- Mostly traffic from CMS and Atlas
21 July 2007
- Transfer ranging from 50 to 430 Mb/s, averaging around 150MB/s.
- The most active sites are CNAF, IN2P3, PIC, FNAL and BNL
- Mostly traffic from CMS and Atlas
20 July 2007
- 0 tickets opened, 4 moved, 2 solved
- Transfer ranging from 40 to 380 Mb/s, averaging around 240MB/s.
- The most active sites are CNAF and IN2P3
- Mostly traffic from Atlas and less from CMS
24919
- (2007.07.19) SRM problem on site INFN-T1
SOLVED
vo: cms, alice, atlas, lhcb
[306] failed to prepare Destination file in 180 seconds
--------
[48] the server sent an error response: 553 553 [addres]: Timed out
Reason: two main reasons. CMS transfers failed because of a disk server with a strange, not very frequent problem; it has been put in draining until it is fixed, and files go successfully into the other CMS disk server. Atlas transfers also failed because of problems in two disk servers; this is fixed now.
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC
vo: cms, alice, atlas
[12] 451 Local resource failure: malloc: Cannot allocate memory
--------
[52] a system call failed (Connection refused)
24649
- (2007.07.13) Transfer problem on site INFN-T1
SOLVED
vo: cms
[314] TRANSFER during TRANSFER phase: [INVALID_SIZE]
Reason: /castor/cnaf.infn.it/grid/lcg/cms/tape/mc/2006/12/22/mc-physval-120-BBbar80to120-LowLumiPU/0008/2C343C8A-9CB9-DB11-8857-0030487219AB.root was corrupted; the size of the real file was smaller than the size in the name server. Probably it is due to a filesystem problem that occurred months ago.
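Note: this kind of corruption (physical file shorter than the size recorded in the name server) can be spotted by comparing on-disk sizes against a catalogue listing. A minimal sketch with a hypothetical catalogue entry is shown below; it is not the procedure CNAF actually ran.
# Consistency-check sketch (hypothetical catalogue entries): flag files whose
# on-disk size is smaller than the size recorded by the name server.
import os
catalogue = {  # path on disk -> size in bytes according to the name server
    "/tmp/example.root": 1234567,
}
for path, expected in catalogue.items():
    if not os.path.exists(path):
        print("MISSING   %s" % path)
    elif os.path.getsize(path) < expected:
        print("TRUNCATED %s: disk=%d B, catalogue=%d B" % (path, os.path.getsize(path), expected))
    else:
        print("OK        %s" % path)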
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
vo: atlas
[332] DESTINATION during FINALIZATION phase: [GENERAL_FAILURE] failed to complete PrepareToPut request [id] on remote SRM [srm]: [SRM_FAILURE] ]. The PrepareToPut request has been successfully aborted
19 July 2007
- 1 ticket opened, 3 moved, 0 solved
- Transfer ranging from 80 to 430 Mb/s, averaging around 250MB/s.
- The most active sites are CNAF, FNAL and PIC
- Mostly traffic from CMS and less from Atlas
24919
- (2007.07.19) SRM problem on site INFN-T1
(
moved to 2007.07.20)
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.07.20)
24649
- (2007.07.13) Transfer problem on site INFN-T1
(
moved to 2007.07.20)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.20)
18 July 2007
- 1 ticket opened, 2 moved, 0 solved
- Transfer ranging from 50 to 560 Mb/s, averaging around 200MB/s.
- The most active sites are SARA, BNL and FNAL
- Mostly traffic from CMS and Atlas
24857
- (2007.07.18) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.07.19)
24649
- (2007.07.13) Transfer problem on site INFN-T1
(
moved to 2007.07.19)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.19)
17 July 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 20 to 380 Mb/s, averaging around 130MB/s.
- The most active sites are SARA, FNAL, IN2P3 and JINR
- Mostly traffic from CMS and less from Atlas
24649
- (2007.07.13) Transfer problem on site INFN-T1
(
moved to 2007.07.17)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.17)
16 July 2007
- 0 tickets opened, 3 moved, 1 solved
- Transfer ranging from 230 to 510 Mb/s, averaging around 310MB/s.
- The most active sites are SARA and PIC
- Mostly traffic from CMS and Atlas
24649
- (2007.07.13) Transfer problem on site INFN-T1
(
moved to 2007.07.17)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.17)
24575
- (2007.07.11) Transfer problems between sites FZK-LCG2 and CERN-PROD
SOLVED
vo: atlas, alice, cms, lhcb
[8] the server sent an error response: 425 425 Can't open data connection. timed out() failed.
Reason: Conclusion is that it is caused by temporary problems on the CASTOR side
15 July 2007
- Transfer ranging from 140 to 790 Mb/s, averaging around 400MB/s.
- The most active sites are FNAL, SARA and CNAF
- Mostly traffic from CMS and Atlas
14 July 2007
- Transfer ranging from 180 to 830 Mb/s, averaging around 660MB/s.
- The most active sites are FNAL, SARA, PIC and BNL
- Mostly traffic from CMS and less from Atlas
13 July 2007
- 2 tickets opened, 1 moved, 0 solved
- Transfer ranging from 410 to 850 Mb/s, averaging around 610MB/s.
- The most active sites are FNAL, SARA, PIC and BNL
- Mostly traffic from CMS and Atlas
24649
- (2007.07.13) Transfer problem on site INFN-T1
(
moved to 2007.07.16)
24646
- (2007.07.13) SRM problem on site TRIUMF-LCG2
(
moved to 2007.07.16)
24575
- (2007.07.11) Transfer problems between sites FZK-LCG2 and CERN-PROD (
moved to 2007.07.16)
12 July 2007
- 0 tickets opened, 1 moved, 0 solved
- Transfer ranging from 280 to 840 Mb/s, averaging around 500MB/s.
- The most active site is FNAL
- Mostly traffic from CMS and Atlas
24575
- (2007.07.11) Transfer problems between sites FZK-LCG2 and CERN-PROD
(
moved to 2007.07.13)
11 July 2007
- 1 ticket opened, 0 moved, 0 solved
- Transfer ranging from 230 to 520 Mb/s, averaging around 370MB/s.
- The most active sites are FNAL and PIC
- Mostly traffic from CMS
24575
- (2007.07.11) Transfer problems between sites FZK-LCG2 and CERN-PROD (
moved to 2007.07.12)
10 July 2007
- 1 ticket opened, 0 moved, 1 solved
- Transfer ranging from 180 to 470 Mb/s, averaging around 340MB/s.
- The most active sites are FNAL, PIC and FZK
- Mostly traffic from CMS
24501
- (2007.07.10) Transfer problems between sites SARA-MATRIX and CERN-PROD
SOLVED
vo: atlas
[10] TRANSFER phase. Error [GRIDFTP]:the server sent an error response: 500 500 java.lang.reflect.InvocationTargetException: <stor>
------------
[12] the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
------------
[340] globus_l_ftp_control_send_cmd_cb: gss_init_sec_context failed
----------
[14] TRANSFER during TRANSFER phase: [GRIDFTP] the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
Reason: They have had problems with full file systems on their dCache pool nodes. This has been fixed.
9 July 2007
- 0 tickets opened, 0 moved, 0 solved
- Transfer ranging from 180 to 540 Mb/s, averaging around 320MB/s.
- The most active sites are FNAL, PIC and FZK
- Mostly traffic from CMS
8 July 2007
- Transfer ranging from 20 to 310 Mb/s, averaging around 150MB/s.
- The most active sites are FNAL and IN2P3
- Mostly traffic from CMS
7 July 2007
- Transfer ranging from 60 to 360 Mb/s, averaging around 110MB/s.
- The most active sites are FNAL, PIC, IN2P3 and IFCA
- Mostly traffic from CMS
6 July 2007
- 0 tickets opened, 1 moved, 1 solved
- Transfer ranging from 20 to 410 Mb/s, averaging around 280MB/s.
- The most active sites are FNAL, FZK, PIC
- Mostly traffic from CMS, less from LHCb
22376
- (2007.07.05) Transfer problems between sites CERN-PROD and
IN2P3-CC
SOLVED
vo: atlas, alice, cms, lhcb
[12] the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
--------------------------
[11] TRANSFER phase. Error [GRIDFTP]:an end-of-file was reached
Reason: The maintenance on the dCache pools is over now. There should not be any more timeouts on the pools, which usually lead to the 451 error for gridFTP.
5 July 2007
- 1 ticket opened, 0 moved, 0 solved
- Transfer ranging from 280 to 470 Mb/s, averaging around 380MB/s.
- The most active sites are FNAL, FZK, PIC and IN2P3
- Mostly traffic from CMS, less from LHCb
22376
- (2007.07.05) Transfer problems between sites CERN-PROD and
IN2P3-CC (
moved to 2007.07.06)
4 July 2007
- 0 tickets opened, 4 moved, 4 solved
- Transfer ranging from 0 to 460 Mb/s, averaging around 180MB/s.
- The most active sites are FNAL, PIC and FZK
- Mostly traffic from CMS
23904
- (2007.06.28) Transfer problems between sites SARA-MATRIX and CERN-PROD
SOLVED
vo: atlas
[12] the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
23860
- (2007.06.27) Transfer problems between sites BNL-LCG2 and CERN-PROD
SOLVED
vo: atlas
[8] the server sent an error response: 425 425 Can't open data connection. timed out() failed.
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
SOLVED
vo: atlas, cms, lhcb
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
------------------------
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://ralsrmb.rl.ac.uk:8443/srm/managerv1 ; id=797902119 call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
Reason: this implies the destination file already exists.
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1
SOLVED
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
Reason: the power cut at CNAF was responsible for this problem; all principal services (LFC, SRM, FTS, ...) were unavailable.
3 July 2007
- 0 tickets opened, 4 moved, 0 solved
- Transfer ranging from 20 to 440 Mb/s, averaging around 180MB/s.
- The most active sites are FNAL, PIC, IN2P3 and FZK
- Mostly traffic from CMS
23904
- (2007.06.28) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.07.04)
23860
- (2007.06.27) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.07.04)
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.07.04)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.07.04)
2 July 2007
- 0 tickets opened, 4 moved, 0 solved
- Transfer ranging from 10 to 300 Mb/s, averaging around 80MB/s.
- The most active sites are FNAL, IN2P3 and FZK
- Mostly traffic from CMS
23904
- (2007.06.28) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.07.03)
23860
- (2007.06.27) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.07.03)
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.07.03)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.07.03)
1 July 2007
- 0 tickets opened, 4 moved, 0 solved
30 Jun 2007
- 0 tickets opened, 4 moved, 0 solved
29 Jun 2007
- 0 tickets opened, 4 moved, 0 solved
23904
- (2007.06.28) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[12] the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
23860
- (2007.06.27) Transfer problems between sites BNL-LCG2 and CERN-PROD
vo: atlas
[8] the server sent an error response: 425 425 Can't open data connection. timed out() failed.
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
vo: atlas, cms, lhcb
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
------------------------
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://ralsrmb.rl.ac.uk:8443/srm/managerv1 ; id=797902119 call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
28 Jun 2007
- 1 ticket opened, 3 moved, 0 solved
- Transfer ranging from 200 to 640 Mb/s, averaging around 500MB/s.
- The most active sites are TRIUMF, IN2P3 and PIC
- Mostly traffic from Atlas and CMS
23904
- (2007.06.28) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.29)
23860
- (2007.06.27) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.06.29)
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.29)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.29)
27 Jun 2007
- 2 tickets opened, 2 moved, 1 solved
- CERN-GRIDKA and GRIDKA-CERN channels of the FTS server "prod-fts-ws.cern.ch" set to "Inactive" for maintenance
- Alice CASTORALICE-Upgrade on CERN-PROD
- SARA Experiencing problems with MSS. Will try to solve this permanently asap.
- Transfer ranging from 390 to 700 Mb/s, averaging around 500MB/s.
- The most active sites are SARA, TRIUMF and IN2P3
- Mostly traffic from Atlas, less from CMS
23860
- (2007.06.27) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.06.28)
23853
- (2007.06.27) FTS transfer - SRM problem on site NDGF-T1
SOLVED
vo: atlas
[337] FINAL:TRANSFER: the server sent an error response: 451 451 Non-null return code from [>PoolManager@dCacheDomain:*@dCacheDomain] with error Best pool <se01_titan_uio_no_1> too high : 2.0E8
Reason: We currently believe this was a transient error, possibly caused by network problems between the SRM host and the pool host. The situation seems OK now
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.28)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.28)
26 Jun 2007
- 0 tickets opened, 2 moved, 0 solved
- Transfer ranging from 230 to 920 Mb/s, averaging around 590MB/s.
- The most active sites are FZK, IN2P3 and SARA
- Mostly traffic from Atlas
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.27)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.27)
25 Jun 2007
- 0 tickets opened, 4 moved, 0 solved and 2 closed
- Transfer ranging from 20 to 880 Mb/s, averaging around 400MB/s.
- The most active sites are FZK, CNAF, TRIUMF and BNL
- Mostly traffic from Atlas
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.26)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.26)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
CLOSED
vo: atlas, alice
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
-------------------
[41] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 550 550 rfio write failure: No space left on device (error 28 on disksrv-1.cr.cnaf.infn.it)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD
CLOSED
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
24 Jun 2007
23 Jun 2007
22 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
vo: atlas, cms, lhcb
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
------------------------
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://ralsrmb.rl.ac.uk:8443/srm/managerv1 ; id=797902119 call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
-------------------
[41] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 550 550 rfio write failure: No space left on device (error 28 on disksrv-1.cr.cnaf.infn.it)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
21 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
- Transfer ranging from 100 to 470 Mb/s, averaging around 370MB/s.
- The most active sites are FNAL, CNAF and BNL
- Mostly traffic from CMS and LHCb
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.21)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.21)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.21)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.21)
20 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
- Transfer ranging from 0 to 460 Mb/s, averaging around 300MB/s.
- The most active sites are FNAL, CNAF and FZK
- Mostly traffic from CMS
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.21)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.21)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.21)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.21)
19 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
- Transfer ranging from 20 to 490 Mb/s, averaging around 330MB/s.
- The most active sites are FNAL, CNAF and IN2P3
- Mostly traffic from CMS
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.20)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.20)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.20)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.20)
18 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
- Transfer ranging from 10 to 320 Mb/s, averaging around 120MB/s.
- The most active sites are FNAL and IN2P3
- Mostly traffic from CMS
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.19)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.19)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.19)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.19)
17 Jun 2007
16 Jun 2007
15 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
- Transfer ranging from 50 to 430 Mb/s, averaging around 180MB/s.
- The most active sites are FZK and IN2P3
- Mostly traffic from CMS
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
vo: atlas, cms, lhcb
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
------------------------
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://ralsrmb.rl.ac.uk:8443/srm/managerv1 ; id=797902119 call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
-------------------
[41] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 550 550 rfio write failure: No space left on device (error 28 on disksrv-1.cr.cnaf.infn.it)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
14 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
- FZK: There is a problem with the database and currently the service is down. They are fixing it, but it may take another few hours.
- RAL: CASTOR seems to have reached a state of near meltdown overnight. They are currently trying to recover it, but there may be short interruptions to the service.
- Transfer ranging from 40 to 380 Mb/s, averaging around 160MB/s.
- The most active sites are RAL and FZK
- Mostly traffic from Atlas, CMS, LHCb
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.15)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.15)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.15)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.15)
13 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
- Transfer ranging from 170 to 580 Mb/s, averaging around 300MB/s.
- The most active sites are RAL and FZK
- Mostly traffic from CMS
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.14)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.14)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.14)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.14)
12 Jun 2007
- 1 ticket opened, 4 moved and 1 solved
- IN2P3: The following SE will be closed: cclcgseli02.in2p3.fr. The dCache SE ccsrm.in2p3.fr will work in a degraded mode.
- Transfer ranging from 400 to 1200 Mb/s, averaging around 700MB/s.
- The most active sites are RAL, BNL and FZK
- Mostly traffic from CMS and Atlas
23250
- (2007.06.12) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.13)
23071
- (2007.06.08) FTS transfer - dCache problem on site FZK-LCG2
SOLVED
vo: alice, atlas, cms, lhcb
[93] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://gridka-dcache.fzk.de:8443/srm/managerv1 ; id=-2135373137 call. Error is RequestFileStatus#-2135373136 failed with error:[ at Fri Jun 08 12:50:14 CEST 2007 state Failed : No Route to cell for packet {uoid=<1181299814007:25659576>;path=[>PinManager@local];msg=Tunnel cell >PinManager@local< not found at >dCacheDomain<}
Reason: The Pin Manager of dCache crashed and has been restarted
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.13)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1 (
moved to 2007.06.13)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.13)
11 Jun 2007
- 0 tickets opened, 4 moved and 0 solved
- SARA: On Monday, June 11th, the sara-r1 router will be in maintenance for hardware and software upgrades. This will take place between 18:00 and 19:30 CET. Although they do not expect problems, reduced reliability should be taken into account.
- Transfer ranging from 600 to 1100 Mb/s, averaging around 840MB/s.
- The most active sites are BNL, FZK and RAL
- Mostly traffic from Atlas
23071
- (2007.06.08) FTS transfer - dCache problem on site FZK-LCG2 (
moved to 2007.06.12)
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.12)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.12)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.12)
10 Jun 2007
9 Jun 2007
8 Jun 2007
- 1 ticket opened, 4 moved, 0 solved and 1 closed
- INFN: INFN-CNAF experienced a power cut last night. Network connection is still not fully operational. The T1 site, production and pre-production services may not be reachable.
- Transfer ranging from 480 to 870 Mb/s, averaging around 640MB/s.
- The most active sites are BNL, IN2P3, FZK and PIC
- Mostly traffic from Atlas
23071
- (2007.06.08) FTS transfer - dCache problem on site FZK-LCG2
vo: alice, atlas, cms, lhcb
[93] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://gridka-dcache.fzk.de:8443/srm/managerv1 ; id=-2135373137 call. Error is RequestFileStatus#-2135373136 failed with error:[ at Fri Jun 08 12:50:14 CEST 2007 state Failed : No Route to cell for packet {uoid=<1181299814007:25659576>;path=[>PinManager@local];msg=Tunnel cell >PinManager@local< not found at >dCacheDomain<}
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
-------------------
[41] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 550 550 rfio write failure: No space left on device (error 28 on disksrv-1.cr.cnaf.infn.it)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX
CLOSED
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
[27] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer; also failing to do 'advisoryDelete' on target.
7 Jun 2007
- 1 ticket opened, 4 moved, 1 solved
- Transfer ranging from 480 to 870 Mb/s, averaging around 680MB/s.
- The most active sites are BNL, IN2P3 and SARA
- Mostly traffic from Atlas
23010
- (2007.06.07) FTS transfer - "Could not open connection" on site INFN-T1 (
moved to 2007.06.08)
22607
- (2007.05.29) FTS transfer - SRM problem on site USCMS-FNAL-WC1
SOLVED
vo: cms
[224] FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Server.userException - java.rmi.RemoteException: SRM Authorization failed; nested exception is:
org.dcache.srm.SRMAuthorizationException: diskCacheV111.services.authorization.AuthorizationServiceException:
Exception thrown by diskCacheV111.services.authorization.KPWDAuthorizationPlugin: Cannot determine Username from SubjectDN /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=teodoro/CN=664016/CN=Douglas Teodoro
Reason: There was a mapping problem: new certificates were not being inserted because old ones still existed.
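Note: the error above is an authorization-mapping failure: because an old entry already existed, the renewed certificate's SubjectDN was never inserted, so lookups by the new DN fail. The toy sketch below, with made-up DNs, only illustrates that failure mode and is not dCache's actual KPWD plugin logic.
# Toy illustration (made-up DNs, not dCache code) of how a stale DN->username
# entry can block a renewed certificate: the new DN never gets inserted, so
# authorization by the new DN fails as in the SRM error above.
mapping = {"/DC=ch/DC=cern/CN=000000/CN=Some User": "someuser"}  # old certificate
def insert_if_user_absent(dn, username):
    # naive policy: skip the insert when the username is already mapped
    if username in mapping.values():
        return False
    mapping[dn] = username
    return True
new_dn = "/DC=ch/DC=cern/CN=111111/CN=Some User"  # renewed certificate, new DN
insert_if_user_absent(new_dn, "someuser")         # refused: old entry still there
if new_dn not in mapping:
    print("Cannot determine Username from SubjectDN %s" % new_dn)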
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.08)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.08)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.06.08)
6 Jun 2007
- 0 tickets opened, 5 moved 1 solved
- IN2P3: They have had an unexpected problem on their SRM server since 8 a.m.; it should be back online at noon.
- SARA: From 13:00 CET, LFC_oracle and FTS at SARA will be down for maintenance this afternoon in order to resolve the current problems with the Oracle database.
- Transfer ranging from 500 to 900 Mb/s, averaging around 610MB/s.
- The most active sites are BNL, SARA and IN2P3
- Mostly traffic from Atlas and CMS
22890
- (2007.06.05) FTS transfer - SRM problem on site
RAL-LCG2
SOLVED
vo: cms
[45] FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !
Reason: The RAL-LCG2 Castor srms ralsrm[a-f].rl.ac.uk are down while Castor is being reconfigured
22607
- (2007.05.29) FTS transfer - SRM problem on site USCMS-FNAL-WC1 (
moved to 2007.06.07)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.07)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.07)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.06.07)
5 Jun 2007
- 1 ticket opened, 4 moved, 0 solved
- RAL: Access to the dCache tape backend has been disabled due to a problem with the configuration of the Fibre Channel switches, which means that the system does not have access to its cache disks. To prevent problems, access to the system has been disabled. This will prevent restores for files stored under dcache-tape.gridpp.rl.ac.uk; writes will still work, as they will be written into the dCache cache disk and stored when the tape backend is available again.
- IN2P3: They are facing an HPSS problem at IN2P3-CC. As a consequence, their dCache SEs (ccsrm.in2p3.fr and ccsrm02.in2p3.fr) are running in degraded mode and the classic SE (cclcgseli02.in2p3.fr) is unavailable.
- Transfer ranging from 420 to 950 Mb/s, averaging around 690MB/s.
- The most active sites are BNL and FNAL
- Mostly traffic from Atlas and CMS
22890
- (2007.06.05) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.06.06)
22607
- (2007.05.29) FTS transfer - SRM problem on site USCMS-FNAL-WC1 (
moved to 2007.06.06)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.06)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.06)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.06.06)
4 Jun 2007
- 1 ticket opened, 4 moved, 1 solved
- NDGF: NDGF-T1 FTS had to be taken offline during the weekend due to database space problems. All the channels are set to inactive at the moment. We are looking into this ASAP
- FZK: Grid services at GridKa are currently unavailable because of network problems. We are working on it.
- Transfer ranging from 400 to 800 Mb/s, averaging around 600MB/s.
- The most active sites are BNL and FNAL
- Mostly traffic from Atlas and CMS
22607
- (2007.06.04) FTS transfer - SRM problem on site TAIWAN-LCG2
SOLVED
vo: atlas, cms
[35] FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: service timeout
Reason: The problem arises from the high CPU load of castorsc, so data subscriptions or copies fail with intermittent errors. This is a known limitation of CASTOR v1, and they are trying to migrate the datasets registered in DQ2 to CASTOR v2 as soon as possible.
22607
- (2007.05.29) FTS transfer - SRM problem on site USCMS-FNAL-WC1 (
moved to 2007.06.05)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1 (
moved to 2007.06.05)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.05)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.06.05)
1 Jun 2007
- 0 tickets opened, 5 moved 1 solved
- Transfer ranging from 300 to 1200 Mb/s, averaging around 700MB/s.
- The most active sites are ASCC, FZK and RAL
- Mostly traffic from CMS and Atlas
22611
- (2007.05.29) FTS transfer - SRM problem on site
RAL-LCG2
SOLVED
vo: atlas
[31] FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get;
[298] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://ralsrmf.rl.ac.uk:8443/srm/managerv1 ; id=862324178 call. Error is CastorStagerInterface.c:2162 Internal error (errno=98, serrno=0)%
Reason: CASTOR is marked as in downtime in the GOC DB.
22607
- (2007.05.29) FTS transfer - SRM problem on site USCMS-FNAL-WC1
vo: cms
[224] FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Server.userException - java.rmi.RemoteException: SRM Authorization failed; nested exception is:
org.dcache.srm.SRMAuthorizationException: diskCacheV111.services.authorization.AuthorizationServiceException:
Exception thrown by diskCacheV111.services.authorization.KPWDAuthorizationPlugin: Cannot determine Username from SubjectDN /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=teodoro/CN=664016/CN=Douglas Teodoro
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
-------------------
[41] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 550 550 rfio write failure: No space left on device (error 28 on disksrv-1.cr.cnaf.infn.it)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
[27] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer; also failing to do 'advisoryDelete' on target.
31 May 2007
- 0 tickets opened, 6 moved 1 solved
- NDGF: NDGF-T1 SRM service is running in degraded mode. Some of the files are not currently available and there might be interruptions to the transfers. We are working on fixing this ASAP. Estimated time to repair is later today.
- Transfer ranging from 150 to 640 Mb/s, averaging around 400MB/s.
- The most active sites are ASCC, FZK and NDGF
- Mostly traffic from CMS and Atlas
22630
- (2007.05.30) FTS transfer - SRM problem on site TAIWAN-LCG2
SOLVED
vo: atlas, cms
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
Reason: The problem arose from apt autoupdate, which had been turned on on the production FTS server and upgraded it to the latest release tag. Downgrading the transfer agent to the old production release (2.2.6) solved the problem temporarily; however, some star channels kept crashing, with limited information about the root cause.
Later, by analysing the RPM database and extracting the packages upgraded after May 25, they were able to solve the problem by reinstalling those old RPM packages. Simple data transfer tests now pass, with star or specific channels, so the ticket is being closed. To avoid falling into the same problem again, apt autoupdate has now been switched off.
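Note: the "analyse the RPM database" step above can be reproduced from the install times recorded by RPM. A minimal sketch follows, using the standard rpm query tags NAME/VERSION/RELEASE/INSTALLTIME; the May 25 cutoff comes from the ticket, everything else is illustrative.
# List packages whose recorded install time is after the cutoff date.
import subprocess, time
CUTOFF = time.mktime((2007, 5, 25, 0, 0, 0, 0, 0, -1))
out = subprocess.check_output(
    ["rpm", "-qa", "--queryformat", "%{NAME}-%{VERSION}-%{RELEASE}\t%{INSTALLTIME}\n"]
).decode()
for line in out.splitlines():
    pkg, _, ts = line.partition("\t")
    if ts.isdigit() and int(ts) > CUTOFF:
        print(time.strftime("%Y-%m-%d %H:%M", time.localtime(int(ts))), pkg)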
22611
- (2007.05.29) FTS transfer - SRM problem on site
RAL-LCG2 (
moved to 2007.06.01)
22607
- (2007.05.29) FTS transfer - SRM problem on site USCMS-FNAL-WC1 (
moved to 2007.06.01)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.06.01)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.06.01)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.06.01)
30 May 2007
- 1 ticket opened, 5 moved, 0 solved
- RAL: The RAL-LCG2 FTS will be unavailable between 09:00 and 13:00 GMT+1 on May 30 for Oracle and kernel updates. Active transfers will be drained from 08:00 GMT+1 before the service is taken offline.
- NDGF: Due to a software upgrade, the NDGF-T1 SRM service will be out of production from 08:00 UTC (10:00 CET) on Wednesday, May 30th to approx. 12:00 UTC.
- SARA: At 09:00 CET the Oracle server at SARA will be down for maintenance. As a result, the FTS on mu8.matrix.sara.nl and the LFC at mu11.matrix.sara.nl will not be available. We expect that this will not take long.
- Transfer ranging from 200 to 1300 Mb/s, averaging around 780MB/s.
- The most active sites are IN2P3, ASCC, CNAF and FNAL
- Mostly traffic from Atlas and CMS
22630
- (2007.05.30) FTS transfer - SRM problem on site TAIWAN-LCG2 (
moved to 2007.05.31)
22611
- (2007.05.29) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.05.31)
22607
- (2007.05.29) FTS transfer - SRM problem on site USCMS-FNAL-WC1 (
moved to 2007.05.31)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.31)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.31)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.31)
29 May 2007
- 2 tickets opened, 3 moved 0 solved
- IN2P3: The cclcgseli02.in2p3.fr SE will be unavailable and the SRM SE ccsrm.in2p3.fr will run in a degraded mode
- Transfer ranging from 380 to 1550 Mb/s, averaging around 850MB/s.
- The most active sites are IN2P3, ASCC and FZK
- Mostly traffic from Atlas and CMS
22611
- (2007.05.29) FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.05.30)
22607
- (2007.05.29) FTS transfer - SRM problem on site USCMS-FNAL-WC1 (
moved to 2007.05.30)
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.30)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.30)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.30)
25 May 2007
- 0 tickets opened, 3 moved 0 solved
- Transfer ranging from 640 to 920 Mb/s, averaging around 780MB/s.
- The most active sites are IN2P3, BNL and FZK
- Mostly traffic from Atlas and CMS
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
-------------------
[41] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 550 550 rfio write failure: No space left on device (error 28 on disksrv-1.cr.cnaf.infn.it)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
[27] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer; also failing to do 'advisoryDelete' on target.
24 May 2007
- 0 tickets opened, 3 moved 0 solved
- RAL: The RAL-LCG2 Castor instance is currently unavailable, staff are investigating
- SARA: Unfortunately they are still experiencing LFC problems. The LFC seems to crash every so many minutes. They are investigating this problem
- Transfer ranging from 800 to 1010 Mb/s, averaging around 840MB/s.
- The most active sites are IN2P3, BNL and FZK
- Mostly traffic from Atlas and CMS
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.25)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.25)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.25)
23 May 2007
- 0 tickets opened, 3 moved 0 solved
- Transfer ranging from 550 to 1000 Mb/s, averaging around 600MB/s.
- The most active sites are IN2P3, BNL and FZK
- Mostly traffic from Atlas and CMS
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.24)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.24)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.24)
22 May 2007
- 1 ticket opened, 2 moved, 0 solved
- Transfer ranging from 400 to 720 Mb/s, averaging around 600MB/s.
- The most active sites are BNL, IN2P3 and RAL
- Mostly traffic from Atlas and CMS
22276
- (2007.05.22) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.23)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.23)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.23)
21 May 2007
- 0 tickets opened, 4 moved 2 solved
- Transfer ranging from 380 to 840 Mb/s, averaging around 560MB/s.
- The most active sites are BNL, FNAL, FZK and IN2P3
- Mostly traffic from Atlas and CMS
21723
- (2007.05.10) FTS transfer - SRM problem on site TAIWAN-LCG2
SOLVED
vo: cms
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM;
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put;
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://srm.grid.sinica.edu.tw:8443/srm/managerv1 ;
id=833488267 call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.22)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.22)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
SOLVED
vo: atlas, alice
[30] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://sc.cr.cnaf.infn.it:8443/srm/managerv1 ; id=809567479 call,
no TURL retrieved for srm://sc.cr.cnaf.infn.it/castor/cnaf.infn.it/grid/lcg/atlas/datafiles/misal1_mc12/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797._01571.pool.root.1
Reason: These failures were due to an rmmaster melt-down problem in their CASTOR2 instance. This kind of failure is going to be solved with the new LSF plugin in June.
18 May 2007
- 0 tickets opened, 11 moved 1 solved and 7 closed
21945
- (2007.05.15) FTS transfer - SRM problem on site CERN-PROD
SOLVED
vo: atlas
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put;
Reason: The atlast0 stager is configured with ~4 transfer slots per diskserver, which is tuned for the export traffic. In addition, yesterday morning the diskservers were rebooted to get a new network driver into production (which will solve the dropped-packet problem). The request above therefore had to wait ~7 minutes before a diskserver could be selected for constructing the TURL. The client had timed out on the SRM 'put' in the meantime (after ~3 minutes) and cancelled the request with an advisoryDelete.
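Note: the numbers in that explanation already account for the failure: with only ~4 transfer slots per diskserver, a modest backlog pushes the wait for a free slot well past the ~3-minute SRM put timeout. A back-of-the-envelope sketch is shown below; only the slot count and client timeout are taken from the ticket, the queue depth and per-transfer duration are assumptions.
# Back-of-the-envelope check of the slot-starvation explanation above.
SLOTS = 4                # transfer slots per diskserver (from the ticket)
QUEUED = 20              # assumed requests queued ahead of this one
TRANSFER_SECONDS = 90    # assumed time each transfer occupies a slot
CLIENT_TIMEOUT = 3 * 60  # SRM 'put' timeout seen by the client (from the ticket)
wait = (QUEUED // SLOTS) * TRANSFER_SECONDS
print("estimated wait for a free slot: %.1f min" % (wait / 60.0))
print("client timeout:                 %.1f min" % (CLIENT_TIMEOUT / 60.0))
if wait > CLIENT_TIMEOUT:
    print("-> client times out first and cancels the request with advisoryDelete")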
21723
- (2007.05.10) FTS transfer - SRM problem on site TAIWAN-LCG2
vo: cms
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM;
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put;
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://srm.grid.sinica.edu.tw:8443/srm/managerv1 ;
id=833488267 call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
21685
- (2007.05.09) FTS transfer - "Operation now in progress" problem on site CERN-PROD
CLOSED
vo: atlas, cms, lhcb
[78] FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: SOAP-ENV:Client - Operation now in progress
21646
- (2007.05.08) Transfer problems between sites INFN-T1 and CERN-PROD
CLOSED
vo: atlas, cms, lhcb
[3] FINAL:TRANSFER: Transfer failed. ERROR globus_l_ftp_control_read_cb: Error while searching for end of reply
Reason: most probably it is because of a strange accumulation of castor-gridftp processes. They are working to avoid this in the future
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD
CLOSED
vo: atlas
[32] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://srm.cern.ch:8443/srm/managerv1 ; id=799212971 call. Error is specified file(s) does not exist
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD
CLOSED
vo: atlas, lhcb
[35] FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: service timeout
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
[27] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer; also failing to do 'advisoryDelete' on target.
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[30] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://sc.cr.cnaf.infn.it:8443/srm/managerv1 ; id=809567479 call,
no TURL retrieved for srm://sc.cr.cnaf.infn.it/castor/cnaf.infn.it/grid/lcg/atlas/datafiles/misal1_mc12/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797._01571.pool.root.1
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
CLOSED
vo: atlas
FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://srm-disk.pic.es:8443/srm/managerv1 ;
id=-2139468052 call. Error is RequestFileStatus#-2139468050 failed with error:[ file not found : can't get pnf sId (not a pnfsfile)]
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD
CLOSED
vo: *
*-CERN
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
CERN-*
[31] FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get
[285] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2162 Timed out (errno=0, serrno=0)
[34] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [srm] ; id=[id] call, no TURL retrieved for [addres]
16 May 2007
- 0 tickets opened, 11 moved 0 solved
- FZK: Some services provided by GridKa are currently unavailable because of a short power cut.
- SARA: Currently they are experiencing problems with the Oracle server. This causes problems with the LFC on mu11.matrix.sara.nl and the FTS.
- Transfer ranging from 530 to 800 Mb/s, averaging around 700MB/s.
- The most active sites are FNAL, BNL, IN2P3 and ASCC
- Mostly traffic from CMS and Atlas
21945
- (2007.05.15) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.18)
21723
- (2007.05.10) FTS transfer - SRM problem on site TAIWAN-LCG2 (
moved to 2007.05.18)
21685
- (2007.05.09) FTS transfer - "Operation now in progress" problem on site CERN-PROD (
moved to 2007.05.18)
21646
- (2007.05.08) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.18)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.18)
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.18)
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD (
moved to 2007.05.18)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.18)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.18)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.05.18)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.18)
15 May 2007
- 1 ticket opened, 10 moved, 0 solved
- FZK: Short intervention to add media and rearrange the slot arrangement of the tape libraries. Data access that triggers recalls from tape is stalled during this window. dCache writes are buffered on disk and staged out automatically after the intervention. Time: 10:00-14:00 CET / 08:00-12:00 UTC
- Transfer ranging from 340 to 800 Mb/s, averaging around 550MB/s.
- The most active sites are FNAL, IN2P3 and ASCC
- Mostly traffic from CMS and Atlas
21945
- (2007.05.15) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.16)
21723
- (2007.05.10) FTS transfer - SRM problem on site TAIWAN-LCG2 (
moved to 2007.05.16)
21685
- (2007.05.09) FTS transfer - "Operation now in progress" problem on site CERN-PROD (
moved to 2007.05.16)
21646
- (2007.05.08) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.16)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.16)
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.16)
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD (
moved to 2007.05.16)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.16)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.16)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC (
moved to 2007.05.16)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.16)
14 May 2007
- 0 tickets opened, 11 moved, 0 solved, 1 closed
- Transfer ranging from 180 to 830 Mb/s, averaging around 420MB/s.
- The most active sites are FNAL, IN2PCC and BNL
- Mostly traffic from CMS and Atlas
21723
- (2007.05.10) FTS transfer - SRM problem on site TAIWAN-LCG2 (
moved to 2007.05.15)
21685
- (2007.05.09) FTS transfer - "Operation now in progress" problem on site CERN-PROD (
moved to 2007.05.15)
21646
- (2007.05.08) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.15)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.15)
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.15)
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD (
moved to 2007.05.15)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.15)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.15)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD
CLOSED
vo: atlas
[14] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC (
moved to 2007.05.15)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.15)
11 May 2007
- 0 tickets opened, 11 moved and 0 solved
- Transfer ranging from 370 to 560 Mb/s, averaging around 450MB/s.
- The most active sites are CNAF, IN2PCC, BNL
- Mostly traffic from Atlas
21723
- (2007.05.10) FTS transfer - SRM problem on site TAIWAN-LCG2
vo: cms
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM;
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put;
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://srm.grid.sinica.edu.tw:8443/srm/managerv1 ;
id=833488267 call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
21685
- (2007.05.09) FTS transfer - "Operation now in progress" problem on site CERN-PROD
vo: atlas, cms, lhcb
[78] FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: SOAP-ENV:Client - Operation now in progress
21646
- (2007.05.08) Transfer problems between sites INFN-T1 and CERN-PROD
vo: atlas, cms, lhcb
[3] FINAL:TRANSFER: Transfer failed. ERROR globus_l_ftp_control_read_cb: Error while searching for end of reply
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD
vo: atlas
[32] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://srm.cern.ch:8443/srm/managerv1 ; id=799212971 call. Error is specified file(s) does not exist
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD
vo: atlas, lhcb
[35] FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: service timeout
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
[27] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer; also failing to do 'advisoryDelete' on target.
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[30] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://sc.cr.cnaf.infn.it:8443/srm/managerv1 ; id=809567479 call,
no TURL retrieved for srm://sc.cr.cnaf.infn.it/castor/cnaf.infn.it/grid/lcg/atlas/datafiles/misal1_mc12/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797._01571.pool.root.1
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD
vo: atlas
[14] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
vo: atlas
FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://srm-disk.pic.es:8443/srm/managerv1 ;
id=-2139468052 call. Error is RequestFileStatus#-2139468050 failed with error:[ file not found : can't get pnf sId (not a pnfsfile)]
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD
vo: *
*-CERN
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
CERN-*
[31] FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get
[285] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2162 Timed out (errno=0, serrno=0)
[34] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [srm] ; id=[id] call, no TURL retrieved for [addres]
10 May 2007
- 1 ticket opened, 10 moved and 0 solved
- RAL: Following a crash of one of the more important servers this morning, Castor is up but not fully usable at the moment due to problems with blockage of the LSF queue manager. They expect to be back to normal service late this morning.
- Transfer ranging from 450 to 730 Mb/s, averaging around 550MB/s.
- The most active sites are PIC, CNAF, IN2PCC, BNL and ASGC
- Mostly traffic from CMS and Atlas
21723
- (2007.05.10) FTS transfer - SRM problem on site TAIWAN-LCG2 (
moved to 2007.05.11)
21685
- (2007.05.09) FTS transfer - "Operation now in progress" problem on site CERN-PROD (
moved to 2007.05.11)
21646
- (2007.05.08) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.11)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.11)
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.11)
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD (
moved to 2007.05.11)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.11)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.11)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.05.11)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.05.11)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.11)
9 May 2007
- 1 ticket opened, 9 moved and 0 solved
- IN2P3: HSM will be stopped for maintenance from 9am to 11am CEST. Writing into SRM-tape will still be available with no restriction, while transfers/jobs that read files from tape will be suspended until the system is restarted.
- CERN-PROD: The service on the SRM endpoint srm.cern.ch is currently severely degraded. They will have to stop it, to investigate the situation.
- Transfer ranging from 370 to 980 Mb/s, averaging around 650MB/s.
- The most active sites are FNAL, CNAF and IN2PCC
- Mostly traffic from CMS and Atlas
21685
- (2007.05.09) FTS transfer - "Operation now in progress" problem on site CERN-PROD (
moved to 2007.05.10)
21646
- (2007.05.08) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.10)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.10)
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.10)
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD (
moved to 2007.05.10)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.10)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1 (
moved to 2007.05.10)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.05.10)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC (
moved to 2007.05.10)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.10)
8 May 2007
- 1 ticket opened, 9 moved, 0 solved and 1 closed
- Transfer ranging from 250 to 1050 Mb/s, averaging around 650MB/s.
- The most active sites are FNAL, CNAF and IN2PCC, with less activity from BNL
- Mostly traffic from CMS and Atlas
21646
- (2007.05.08) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.09)
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.09)
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.09)
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD (
moved to 2007.05.09)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.09)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.09)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.05.09)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.05.09)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD
CLOSED
vo: atlas, cms, lhcb
[16] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.09)
7 May 2007
- 2 tickets opened, 8 moved and 1 solved
- FZK: Due to urgent maintenance work on network routers on Monday, 2007-05-07, between 4:30 and 5:15 UTC all grid services provided by GridKa may be unavailable for short periods.
- Transfer ranging from 390 to 960 Mb/s, averaging around 700MB/s.
- The most active sites are FNAL, CNAF and IN2PCC
- Mostly traffic from CMS, less from Atlas
21592
- (2007.05.07) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.05.08)
21590
- (2007.05.07) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.08)
21467
- (2007.05.03) FTS transfer - SRM problem on site NDGF-T1
SOLVED
vo: atlas
[45] FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Could not open connection
[53] FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer
Reason: This was due to a planned network interruption
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD (
moved to 2007.05.08)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.08)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1 (
moved to 2007.05.08)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.05.08)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC (
moved to 2007.05.08)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.08)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.08)
4 May 2007
- 0 tickets opened, 8 moved and 0 solved
- Transfer ranging from 700 to 980 Mb/s, averaging around 840MB/s.
- The most active sites are BNL, IN2PCC and CNAF
- Mostly traffic from CMS and Atlas
21467
- (2007.05.03) FTS transfer - SRM problem on site NDGF-T1
vo: atlas
[45] FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Could not open connection
[53] FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD
vo: atlas, lhcb
[35] FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: service timeout
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
[27] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer; also failing to do 'advisoryDelete' on target.
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[30] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://sc.cr.cnaf.infn.it:8443/srm/managerv1 ; id=809567479 call,
no TURL retrieved for srm://sc.cr.cnaf.infn.it/castor/cnaf.infn.it/grid/lcg/atlas/datafiles/misal1_mc12/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797._01571.pool.root.1
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD
vo: atlas
[14] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
vo: atlas
FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://srm-disk.pic.es:8443/srm/managerv1 ;
id=-2139468052 call. Error is RequestFileStatus#-2139468050 failed with error:[ file not found : can't get pnf sId (not a pnfsfile)]
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD
vo: atlas, cms, lhcb
[16] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD
vo: *
*-CERN
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
CERN-*
[31] FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get
[285] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2162 Timed out (errno=0, serrno=0)
[34] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [srm] ; id=[id] call, no TURL retrieved for [addres]
3 May 2007
- 2 tickets opened, 8 moved and 2 solved
- NDGF: The NDGF-T1 storage will be unavailable due to a planned network outage 07:00-08:00 UTC
- Transfer ranging from 400 to 1000 Mb/s, averaging around 700MB/s.
- The most active sites are BNL, IN2PCC and FZK
- Mostly traffic from CMS and Atlas
21467
- (2007.05.03) FTS transfer - SRM problem on site NDGF-T1 (
moved to 2007.05.04)
21464
- (2007.05.03) FTS transfer - "service timeout" - SRM problem on site CERN-PROD (
moved to 2007.05.04)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.04)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1 (
moved to 2007.05.04)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.05.04)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC (
moved to 2007.05.04)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD
SOLVED
vo: cms
[39] FINAL:TRANSFER: Failed on SRM copy: RequestFileStatus#-2146190187 failed with error:
[ at Mon Apr 16 02:12:42 CDT 2007 state Failed : GetStorageInfoFailed :
file exists, cannot write ] %2007-04-16 07:13:00,820 [DEBUG] - 28240816 - Entered SrmUtil::destructor.
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD
SOLVED
vo: atlas, cms, lhcb
[32] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ; id=[id] call. Error is specified file(s) does not exist
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.04)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.04)
2 May 2007
- 0 tickets opened, 8 moved and 0 solved
- Transfer ranging from 300 to 810 Mb/s, averaging around 500MB/s.
- The most active sites are CNAF, FNAL, BNL and ASGC
- Mostly traffic from CMS and Atlas
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.03)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.03)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.05.03)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.05.03)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.05.03)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.03)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.03)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.03)
30 April 2007
- 0 tickets opened, 8 moved and 0 solved
- RAL: Castor@RAL will be down from Monday to Wednesday while the Castor software is upgraded from version 2.1.0-3 to 2.1.2-9. All v1 SRMs will be upgraded at the same time to 2.2.12-1 (still v1).
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.05.02)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.05.02)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.05.02)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.05.02)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.05.02)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.05.02)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.05.02)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.05.02)
27 April 2007
- 0 tickets opened, 9 moved and 1 solved
- Transfer ranging from 380 to 980 Mb/s, averaging around 450MB/s.
- The most active sites are FZK, IN2P3, BNL and CNAF
- Mostly traffic from CMS and Atlas
21254
- (2007.04.26) FTS transfer - SRM problem on site NDGF-T1
SOLVED
vo: atlas
[17] FINAL:TRANSFER: Gotten zero-length file! The FTS transfer log of an example failing transfer is attached to the ticket. FTS operations support
Reason: This error occurs when the CERN FTS checks the size of the source file in CASTOR, before the actual transfer is started. This has nothing to do with NDGF-T1
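According to the explanation above, the failure is raised when the CERN FTS checks the size of the source file in CASTOR before starting the transfer and finds it to be zero. Expressed as a sketch (purely illustrative, not the actual FTS code; the stat_source callback and the example SURL are invented names), the pre-transfer check amounts to:

# precheck_source.py -- illustrative pre-transfer check that refuses to copy
# a zero-length source file, mirroring the "Gotten zero-length file!" failure.
# stat_source() is a stand-in for whatever the transfer service uses to query
# the source SRM/CASTOR for file metadata.
class ZeroLengthSource(Exception):
    pass

def precheck_source(surl, stat_source):
    size = stat_source(surl)           # bytes reported by the source storage
    if size == 0:
        raise ZeroLengthSource(f'Gotten zero-length file! {surl}')
    return size

if __name__ == '__main__':
    # Hypothetical SURL and size lookup, for demonstration only.
    fake_sizes = {'srm://example.invalid/some/file': 0}
    try:
        precheck_source('srm://example.invalid/some/file', fake_sizes.__getitem__)
    except ZeroLengthSource as err:
        print('transfer aborted:', err)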
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX
vo: atlas, lhcb
[26] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
[27] FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer; also failing to do 'advisoryDelete' on target.
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
vo: atlas, alice
[30] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://sc.cr.cnaf.infn.it:8443/srm/managerv1 ; id=809567479 call,
no TURL retrieved for srm://sc.cr.cnaf.infn.it/castor/cnaf.infn.it/grid/lcg/atlas/datafiles/misal1_mc12/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797/
misal1_mc12.007422.singlepart_singlepi_et10.digit.RDO.v12003103_tid007797._01571.pool.root.1
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD
vo: atlas
[14] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
vo: atlas
FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://srm-disk.pic.es:8443/srm/managerv1 ;
id=-2139468052 call. Error is RequestFileStatus#-2139468050 failed with error:[ file not found : can't get pnf sId (not a pnfsfile)]
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD
vo: cms
[39] FINAL:TRANSFER: Failed on SRM copy: RequestFileStatus#-2146190187 failed with error:
[ at Mon Apr 16 02:12:42 CDT 2007 state Failed : GetStorageInfoFailed :
file exists, cannot write ] %2007-04-16 07:13:00,820 [DEBUG] - 28240816 - Entered SrmUtil::destructor.
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD
vo: atlas, cms, lhcb
[32] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ; id=[id] call. Error is specified file(s) does not exist
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD
vo: atlas, cms, lhcb
[16] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD
vo: *
*-CERN
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
CERN-*
[31] FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get
[285] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2162 Timed out (errno=0, serrno=0)
[34] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [srm] ; id=[id] call, no TURL retrieved for [addres]
26 April 2007
- 1 ticket opened, 8 moved and 0 solved
- Transfer ranging from 600 to 900 Mb/s, averaging around 700MB/s.
- The most active sites are FNAL, IN2P3, BNL and CNAF
- Mostly traffic from CMS and Atlas
21254
- (2007.04.26) FTS transfer - SRM problem on site NDGF-T1
(
moved to 2007.04.27)
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.04.27)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.27)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.04.27)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.27)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.27)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.27)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.04.27)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.27)
25 April 2007
- 1 ticket opened, 7 moved and 0 solved
- Transfer ranging from 110 to 990 Mb/s, averaging around 450MB/s.
- The most active sites are BNL, FNAL, IN2P3 and ASGC
- Mostly traffic from CMS and Alice
21191
- (2007.04.25) FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.04.26)
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.26)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.04.26)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.26)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.26)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.26)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.04.26)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.26)
24 April 2007
- 1 ticket opened, 6 moved and 0 solved
- Transfer ranging from 200 to 460 Mb/s, averaging around 300MB/s.
- The most active sites are BNL, IN2P3 and FZK
- Mostly traffic from Atlas, less from Alice
21112
- (2007.04.24) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.25)
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.04.25)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.25)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.25)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.25)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.04.25)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.25)
23 April 2007
- 0 tickets opened, 6 moved and 0 solved
- IN2P3: IN2P3-CC LFC will migrate from MySQL to Oracle. There will be an interruption of service from 9:00 to 12:00 CEST
- Transfer ranging from 50 to 130 Mb/s, averaging around 60MB/s.
- The most active sites are BNL and CNAF
- Mostly traffic from Atlas and Alice
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.04.24)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.24)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.24)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.24)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.04.24)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.24)
20 April 2007
- 0 tickets opened, 7 moved and 1 solved
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.04.21)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.21)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.21)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.21)
20702
- (2007.04.12) FTS transfer - SRM problem on site INFN-T1
SOLVED
vo: atlas, cms, lhcb
[30] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] ; id=[id] call, no TURL retrieved for [addres]
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [srm] ; id=[id] call. Error is CastorStagerInterface.c:2457 Device or resource busy (errno=0, serrno=0)
Reason: They have hardware problems on some of the CMS disks, which increases the failure rate. They are working to fix it
20702
- (2007.04.12) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.21)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.21)
19 April 2007
- 0 tickets opened, 7 moved and 0 solved
- CERN-PROD: Intervention on the CASTOR CMS instance.
- RAL: It has become essential for them to perform maintenance on the CASTOR database. Service off-line to perform essential db maintenance. Downtime for Castor has been extended until Monday morning.
- Transfer ranging from 70 to 440 Mb/s, averaging around 200MB/s.
- The most active sites are IN2PCC, CNAF, BNL and FZK
- Mostly traffic from Atlas and Alice
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.04.20)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.20)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.20)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.20)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.04.20)
20702
- (2007.04.12) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.20)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.20)
18 April 2007
- 0 tickets opened, 7 moved and 0 solved
- NDGF: One NDGF Tier-1 site has serious problems with data pools. This reduces their ability to deliver data. The Atlas project is known to be affected. They expect more specific info by 1300hrs.
- Transfer ranging from 180 to 290 Mb/s, averaging around 200MB/s.
- The most active sites are IN2PCC and CNAF
- Mostly traffic from Atlas and Alice
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.04.19)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.19)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.19)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.19)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.04.19)
20702
- (2007.04.12) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.19)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.19)
17 April 2007
- 2 tickets opened, 6 moved and 1 solved
- IN2P3: In order to migrate IN2P3-CC LFC from MySQL to Oracle, there will be a short interruption of the service from 10:00 to 11:00 (lfc-alice.in2p3.fr, lfc-atlas.in2p3.fr, lfc-dteam.in2p3.fr, lfc-lhcb.in2p3.fr)
- SARA: The HSM at SARA will be down for a brief period of time for maintenance purposes. Data on tape will not be accessible during this period.
- Transfer ranging from 350 to 820 Mb/s, averaging around 470MB/s.
- The most active sites are IN2PCC, FNAL, ASCC and FZK
- Mostly traffic from CMS, less from Atlas and Alice
20873
- (2007.04.17) Transfer problems between sites BNL-LCG2 and CERN-PROD (
moved to 2007.04.18)
20872
- (2007.04.17) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.18)
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.18)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.18)
20702
- (2007.04.12) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.18)
20618
- (2007.04.10) Transfer problems between sites SARA-MATRIX and CERN-PROD
SOLVED
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
Reason: In March one of their pool nodes had a broken RAID controller. As a result the file system broke down completely and they lost some LHCb disk-only data. The files users are trying to access are among the lost files
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.18)
16 April 2007
- 2 tickets opened, 4 moved and 0 solved
- FZK: Due to urgent maintenance work on network routers on Monday, 2007-04-16, between 6:00 and 7:00 UTC all grid services provided by GridKa may be unavailable for short periods.
- NDGF: NDGF-T1 operations is currently affected by a power outage that hit large parts of Copenhagen this morning. Several SRM pools are off-line and the files on them are unavailable. These may come back later today, but this is not fully certain. You may experience disturbances in other services; they are not yet sure of the full impact of the problem.
- Transfer ranging from 270 to 670 Mb/s, averaging around 520MB/s.
- The most active sites are FNAL, IN2PCC, ASCC and PIC
- Mostly traffic from CMS, less from Atlas
20817
- (2007.04.16) FTS transfer - Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (
moved to 2007.04.17)
20816
- (2007.04.16) FTS transfer - "file does not exist" problem on site CERN-PROD (
moved to 2007.04.17)
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD (
moved to 2007.04.17)
20702
- (2007.04.12) FTS transfer - SRM problem on site INFN-T1 (
moved to 2007.04.17)
20618
- (2007.04.10) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.04.17)
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.17)
13 April 2007
- 1 ticket opened, 3 moved and 0 solved
- Transfer ranging from 230 to 620 Mb/s, averaging around 500MB/s.
- The most active sites are IN2PCC, ASCC and BNL
- Mostly traffic from CMS and Atlas
20756
- (2007.04.13) Transfer problems between sites INFN-T1 and CERN-PROD
vo: atlas, cms, lhcb
[16] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection
20702
- (2007.04.12) FTS transfer - SRM problem on site INFN-T1
vo: atlas, cms, lhcb
[30] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] ; id=[id] call, no TURL retrieved for [addres]
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [srm] ; id=[id] call. Error is CastorStagerInterface.c:2457 Device or resource busy (errno=0, serrno=0)
20618
- (2007.04.10) Transfer problems between sites SARA-MATRIX and CERN-PROD
vo: atlas
[8] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
20790
- (2007.04.13) FTS transfer - SRM problem on site CERN-PROD
vo: *
*-CERN
[21] FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2507 Device or resource busy (errno=0, serrno=0)
[19] FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
CERN-*
[31] FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get
[285] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ; id=[id] call. Error is CastorStagerInterface.c:2162 Timed out (errno=0, serrno=0)
[34] FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [srm] ; id=[id] call, no TURL retrieved for [addres]
12 April 2007
- 3 tickets opened, 3 moved and 3 solved
- FNAL: Site will be down until tonight
- Transfer ranging from 300 to 720 Mb/s, averaging around 500MB/s.
- The most active sites are FNAL, IN2PCC, ASCC and BNL
- Mostly traffic from CMS and Atlas
20702
- (2007.04.12) FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.13)
20698
- (2007.04.12) FTS transfer - SRM problem on site TAIWAN-LCG2
SOLVED
vo: atlas,cms
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put.
[12] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response:
451 451 Local resource failure: malloc: Cannot allocate memory.
Reason: The root cause arises from the round-robin DNS solution adopted for grid services: they forgot to change the id of the profile, which caused an inconsistency among the backend disk servers added to srm.grid.sinica.edu.tw during the last week
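Since the reported root cause was an inconsistency among the disk servers behind a round-robin DNS alias, one quick sanity check is simply to list every address the alias resolves to and compare it with the expected backend set. The sketch below is illustrative only (it is not the procedure the site used); the alias and port are taken from the ticket text above:

# list_rr_backends.py -- resolve every address behind a round-robin alias,
# so inconsistent or unexpected backend hosts can be spotted at a glance.
import socket

def backends(alias, port):
    seen = set()
    for family, _, _, _, sockaddr in socket.getaddrinfo(alias, port, proto=socket.IPPROTO_TCP):
        seen.add(sockaddr[0])
    return sorted(seen)

if __name__ == '__main__':
    for address in backends('srm.grid.sinica.edu.tw', 8443):
        try:
            name = socket.gethostbyaddr(address)[0]
        except socket.herror:
            name = 'no reverse record'
        print(address, name)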
20695
- (2007.04.12) FTS transfer - SRM problem on site USCMS-FNAL-WC1
SOLVED
vo: cms
[39] FINAL:TRANSFER: Failed on SRM copy: RequestFileStatus#-2146327719 failed with error:[ at Thu Apr 12 02:51:39 CDT 2007 state Failed : GetStorageInfoFailed : file exists, cannot write]
Reason: Site is down until tonight
20650
- (2007.04.11) Transfer problems between sites TAIWAN-LCG2 and CERN-PROD
SOLVED
vo: cms
[60] FINAL:TRANSFER: Transfer failed. ERROR globus_ftp_control_connect: globus_libc_gethostbyname_r failed
[16] FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection
Reason: The root cause arises from the round-robin DNS solution adopted for grid services: they forgot to change the id of the profile, which caused an inconsistency among the backend disk servers added to srm.grid.sinica.edu.tw during the last week
20618
- (2007.04.10) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.04.13)
19819
- (2007.03.19) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.13)
11 April 2007
- 1 ticket opened, 4 moved and 1 solved (1 expired)
- Transfer ranging from 500 to 950 Mb/s, averaging around 600MB/s.
- The most active sites are IN2PCC, FNAL, FZK, RAL and BNL
- Mostly traffic from CMS and Atlas
20650
- (2007.04.11) Transfer problems between sites TAIWAN-LCG2 and CERN-PROD (
moved to 2007.04.12)
20633
- (2007.04.10) FTS transfer - SRM problem on site DESY-HH
SOLVED
vo: cms
[25] FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put; also failing to do 'advisoryDelete' on target.
Reason: The SE was cleaned up. The file has now arrived
20618
- (2007.04.10) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.04.12)
19819
- (2007.03.19) FTS transfer - SRM problem on site CERN-PROD (
moved to 2007.04.12)
19762
- (2007.03.16) FTS transfer - SRM problem on site BNL-LCG2
EXPIRED
vo: atlas
CERN-BNL__2007-03-15-1506_4JXMba:FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target
BNL-CERN__2007-03-15-1535_Vt17TR:FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://dcsrm.usatlas.bnl.gov:8443/srm/managerv1 ; id=-2146369314 call. Error is
RequestFileStatus#-2146369313 failed with error:[ at Thu Mar 15 11:35:24 EDT 2007 state Failed : No Route to cell for packet {uoid=<1173972924515:14572>;path=[>PinManager@local];msg=Tunnel cell >PinManager@local< not found at >dCacheDomain<}
10 April 2007
- 2 tickets opened, 4 moved and 2 solved
- INFN: The INFN-NAPOLI-ATLAS site was shut down yesterday, April 9 2007, due to an unexpected outage of the cooling system. Two days of downtime are foreseen to repair the outage
- NDGF: One of the pools in the NDGF-T1 dCache system has gone offline unexpectedly, making certain files unreachable. Write operations should be unaffected. They are working on bringing the pool back; it will probably be up again later today
- Transfer ranging from 150 to 800 Mb/s, averaging around 450MB/s.
- The most active sites are IN2PCC, FNAL, FZK and RAL
- Mostly traffic from CMS and Atlas
20633
- (2007.04.10) FTS transfer - SRM problem on site DESY-HH
(
moved to 2007.04.11)
20618
- (2007.04.10) Transfer problems between sites SARA-MATRIX and CERN-PROD (
moved to 2007.04.11)
20401
- (2007.04.03) Transfer problems between sites
IN2P3-CC and CERN-PROD
SOLVED
vo: atlas, cms, alice
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
----------
FINAL:TRANSFER: Transfer failed. ERROR an end-of-file was reached
---------
FINAL:ABORTED: Operation was aborted (the gridFTP transfer timed out)
Reason: All disk space was full, problem fixed
20110
- FTS transfer - SRM problem on site INFN-T1
SOLVED
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://castorsrm.cr.cnaf.infn.it:8443/srm/managerv1 ;
id=807404970 call. Error is CastorStagerInterface.c:2457 User unknown (errno=0, serrno=0)
Reason: On the 26th of March we had a CASTOR rmmaster meltdown. It affected many transfers; sorry for the inconvenience
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.04.11)
19762
- FTS transfer - SRM problem on site BNL-LCG2
(
moved to 2007.04.11)
5 April 2007
- 1 ticket opened, 5 moved and 2 solved
- CERN-PROD: At 15:00 CEST / 13:00 UTC the CERN castorpublic service will be stopped to address a stager database problem affecting the rollback segments. This interruption should last no more than 15 minutes.
- IN2P3-CC: HSM system is down for an unknown duration. SRM/dCache (ccsrm.in2p3.fr) is still available but: 1) writing to TAPE spaces is possible until the dCache buffer is full, then all PUT requests will fail; 2) reading from TAPE spaces will fail if files are no longer on cache disk; 3) all operations on DISK spaces will work normally. They recommend stopping multi-VO test transfers to TAPE if possible, since their buffers are already full.
20536
- (2007.04.05) FTS transfer - SRM problem on site
IN2P3-CC
SOLVED
vo: atlas, cms, alice
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception: Pool manager error: Best pool <pool-disk-sc3-10> too high : 2.0E8
Reason: They sustained a very high transfer rate all night long and the GridFTP doors became saturated by dead transfers around 5 AM. This is the reason why transfers hung. They restarted all the GridFTP doors and did what was necessary to prevent this situation from happening again.
20401
- (2007.04.03) Transfer problems between sites
IN2P3-CC and CERN-PROD
vo: atlas, cms, alice
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
----------
FINAL:TRANSFER: Transfer failed. ERROR an end-of-file was reached
---------
FINAL:ABORTED: Operation was aborted (the gridFTP transfer timed out)
20157
- FTS transfer - SRM problem on site
RAL-LCG2
SOLVED
vo: atlas, cms
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception:
Pool manager error: No write pool available for <atlas:atlas@osm>
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://ralsrma.rl.ac.uk:8443/srm/managerv1 ;
id=622527563 call. Error is CastorStagerInterface.c:2457 Device or resource busy (errno=0, serrno=0)
Reason: Yes, the first error (Pool Manager error) will occur every time an Atlas user attempts to transfer a file to the SE dcache.gridpp.rl.ac.uk, as they have filled the space available to them on this SE. We can't remove support for Atlas from the SE as this would stop them reading the files
20110
- FTS transfer - SRM problem on site INFN-T1
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://castorsrm.cr.cnaf.infn.it:8443/srm/managerv1 ;
id=807404970 call. Error is CastorStagerInterface.c:2457 User unknown (errno=0, serrno=0)
19819
- FTS transfer - SRM problem on site CERN-PROD
vo: *
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://srm-durable-atlas.cern.ch:8443/srm/managerv1 ; id=689505695 call. Error is CastorStagerInterface.c:2457
19762
- FTS transfer - SRM problem on site BNL-LCG2
vo: atlas
CERN-BNL__2007-03-15-1506_4JXMba:FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target
BNL-CERN__2007-03-15-1535_Vt17TR:FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://dcsrm.usatlas.bnl.gov:8443/srm/managerv1 ; id=-2146369314 call. Error is
RequestFileStatus#-2146369313 failed with error:[ at Thu Mar 15 11:35:24 EDT 2007 state Failed : No Route to cell for packet {uoid=<1173972924515:14572>;path=[>PinManager@local];msg=Tunnel cell >PinManager@local< not found at >dCacheDomain<}
4 April 2007
- 0 tickets opened, 6 moved and 1 solved
- RAL: Tape robot unavailable. The control software for the SL8500 tape library has lost contact with the robot. Until they contact an engineer to find the best way to re-establish the connection, there will be no access to tape.
- Transfer ranging from 580 to 1160 Mb/s, averaging around 800MB/s.
- The most active sites are FNAL, FZK, BNL and IN2PCC
- Mostly traffic from CMS, Atlas and Alice
20404
- (2007.04.03) FTS transfer - SRM problem on site PIC
SOLVED
vo: atlas
FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Could not open connection
Reason: As indicated in previous EGEE broadcasts and specified on the GOCDB, we are in scheduled downtime starting on Monday the 2nd at 9:00 up to Wednesday (today) at 18:00. This is due to a power intervention at the PIC site that forced a shutdown of the machines
20401
- (2007.04.03) Transfer problems between sites
IN2P3-CC and CERN-PROD (
moved to 2007.04.05)
20157
- FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.04.05)
20110
- FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.05)
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.04.05)
19762
- FTS transfer - SRM problem on site BNL-LCG2
(
moved to 2007.04.05)
3 April 2007
- 2 tickets opened, 5 moved and 1 solved
- RAL: Due to emergency engineering work that needs to be carried out on the SL8500 tape robot, there will be no access to the tape drives from 09:00-12:00 (UK time) on the 3rd.
- NDGF: Some of the dCache pools will undergo a short maintenance at 14:00-14:15 CEST; during that time some data will be inaccessible. The core services will be unavailable for a short timespan between 16:00 and 16:45 CEST today due to a switch upgrade.
- Transfer ranging from 410 to 1190 Mb/s, averaging around 750MB/s.
- The most active sites are FNAL, CNAF, FZK, BNL and IN2PCC
- Mostly traffic from CMS, Atlas and Alice
20404
- (2007.04.03) FTS transfer - SRM problem on site PIC
(
moved to 2007.04.04)
20401
- (2007.04.03) Transfer problems between sites
IN2P3-CC and CERN-PROD (
moved to 2007.04.04)
20157
- FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.04.04)
20110
- FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.04)
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.04.04)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
SOLVED
vo: cms
FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ;
id=[id] call. Error is RequestFileStatus#[id] failed with error:[ [date] state Failed : file not found : path [path] not found]
Reason: The files did not exist at FNAL previously -- they are now present and srmcp succeeds
19762
- FTS transfer - SRM problem on site BNL-LCG2
(
moved to 2007.04.04)
2 April 2007
- 0 tickets opened, 5 moved and 0 solved
- PIC: Will be down from monday the 2nd April at 9:00 until wednesday the 4th April at 18:00. Due to the yearly electrical maintenance, all of the equipment has to be switched off during this period.
- NDGF: dCache will be updated 09:00 UTC and the data access shall be unavailable during the break. The duration of the break is estimated to be 1 to 10 minutes.
- ASGC: Short intervention for the LFC server upgrade; the maintenance is planned to complete in 15 min. DDM activity based on the LFC catalogue might be affected. Maintenance starts at 09:30 and stops at 09:45
- CERN: (06:00-08:30 UTC / 08:00-10:30 CEST) Two interventions will take place at CERN:
1) CERN Castor nameserver to migrate to new hardware.
All CERN castor instances will be unavailable during this time. lxbatch will be paused during this time.
2) Three network switches hosting grid services will be replaced. Each switch downtime is expected to last thirty minutes.
This will be done on a rolling basis such that grid services: WMS, RB, CE, SRM, BDII, VOMS, FTS T0<->T1, Experiment Front End Services will all suffer service degradations.
The FTS T0<->T2 will experience downtime for thirty minutes since it is hosted only on one switch.
- Transfer ranging from 0 to 610 Mb/s, averaging around 300MB/s.
- The most active sites are RAL, FNAL, PIC and FZK
- Mostly traffic from CMS, Atlas and Alice
20157
- FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.04.03)
20110
- FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.03)
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.04.03)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
(
moved to 2007.04.03)
19762
- FTS transfer - SRM problem on site BNL-LCG2
(
moved to 2007.04.03)
30 March 2007
- 1 ticket opened, 6 moved and 2 solved
- IN2P3: GRIF machines at LLR will be shut down for software upgrade and SE re-deployment. Scheduled downtime from 30 March 9:00 until 19:00. The services affected: polgrid1.in2p3.fr (CE); polgrid2.in2p3.fr (SE); polgrid4.in2p3.fr (SE).
- Transfer ranging from 100 to 800 Mb/s, averaging around 400MB/s.
- The most active sites are IN2PCC, FNAL, ASCC, PIC and CNAF
- Mostly traffic from CMS, Atlas and Alice
- TRIUMF: Status was changed to 'inactive' on CERN-TRIUMF because of disk cleaning. It will be reactivated when the situation improves.
20296
- Transfer problems between sites TAIWAN-LCG2 and CERN-PROD
SOLVED
vo: atlas, cms
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory.
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
20157
- FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.04.02)
20110
- FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.04.02)
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2
SOLVED
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put; also failing to do 'advisoryDelete' on target
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception: Pool manager error: Best pool <dpool04_1> too high : 1.4E8
FINAL:TRANSFER: Getting filesize at destination failed twice!the server sent an error response: 553 553 Permission denied, reason: CacheException(rc=10006;msg=Pnfs request timed out)
Reason: The disk is full
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.04.02)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
(
moved to 2007.04.02)
19762
- FTS transfer - SRM problem on site BNL-LCG2
(
moved to 2007.04.02)
29 March 2007
- 0 tickets opened, 6 moved and 0 solved
- RAL: Castor system will be unavailable between 9.00 and 14.00 UTC+1 (BST)
- FNAL & BNL:The two USLHCNET New York - Chicago circuits will be migrated from the Force10s to the new Ciena CIs. This migration is scheduled for Thursday,March 29, starting from 14:00 GMT/15:00 UTC for 8 hours. The circuits will be migrated one at a time, and the migration shouldn't affect the CERN-FNAL or CERN-BNL traffic at any time. Each individual circuit shouldn't be down for more than 20 minutes during the migration period, and at any time one of the two circuits will be available
- Transfer ranging from 120 to 800 Mb/s, averaging around 300MB/s.
- The most active sites are IN2PCC, CNAF, ASCC and PIC
- Mostly traffic from CMS, Atlas and Alice
20157
- FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.03.30)
20110
- FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.03.30)
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2
(
moved to 2007.03.30)
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.03.30)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
(
moved to 2007.03.30)
19762
- FTS transfer - SRM problem on site BNL-LCG2
(
moved to 2007.03.30)
28 March 2007
- 1 ticket opened, 8 moved and 3 solved
- INFN: pps-fts.cnaf.infn.it and lfc-sc.cr.cnaf.infn.it will be down for the full day on Wed 28 March 2007 to install the new Oracle cluster backend.
- NDGF, SARA, PIC: Between 9.00AM and 10.00AM CET, the configuration of CERN's two LHCOPN routers will be modified in order to implement a new routing functionality. The intervention should be transparent and there will be no service interruption, except in case of unexpected problems.
- Transfer ranging from 390 to 760 Mb/s, averaging around 500MB/s.
- The most active sites are IN2PCC, CNAF, RAL and PIC
- Mostly traffic from CMS, Atlas and Alice
20217
- FTS transfer - SRM problem on site
IN2P3-CC
SOLVED
vo: atlas
FINAL:TRANSFER: Transfer failed. ERROR an end-of-file was reached
Reason: This was due to 2 pools full + all GFTP servers stuck. Fixed now
20160
- FTS transfer - SRM problem on site SARA-MATRIX
SOLVED
vo: atlas
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception:
Pool manager error: Best pool <bee11_1> too high : 2.0E8
Reason: At the moment the ATLAS disk-only pools are completely full. That is what is indicated with this error message. We expect to get more storage next week and we will put this online with the greatest priority
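The "Best pool <...> too high" messages seen in tickets such as 20160 mean that even the cheapest candidate write pool exceeded the pool manager's allowed cost, which is exactly what full disk-only pools produce. As a rough analogy only (the cost formula and threshold below are invented for illustration and are not dCache's actual algorithm), cost-based pool selection with a ceiling can be pictured like this:

# pick_write_pool.py -- toy analogy for cost-based pool selection with a ceiling.
# The cost formula and threshold are invented for illustration; dCache's real
# PoolManager cost module is considerably more elaborate.
def space_cost(free_bytes, total_bytes):
    # Cost grows as the pool fills; a nearly full pool gets a very high cost.
    free_fraction = free_bytes / total_bytes
    return 1.0 / max(free_fraction, 1e-9)

def pick_write_pool(pools, max_cost):
    best_name, best_cost = None, float('inf')
    for name, free_bytes, total_bytes in pools:
        cost = space_cost(free_bytes, total_bytes)
        if cost < best_cost:
            best_name, best_cost = name, cost
    if best_cost > max_cost:
        raise RuntimeError(f'Best pool <{best_name}> too high : {best_cost:.1E}')
    return best_name

if __name__ == '__main__':
    # Two nearly full pools: even the best one is over the ceiling, so the
    # request is rejected the same way the tickets above report.
    pools = [('bee10_1', 5e9, 2e12), ('bee11_1', 2e9, 2e12)]
    try:
        print('selected', pick_write_pool(pools, max_cost=100.0))
    except RuntimeError as err:
        print('rejected:', err)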
20157
- FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.03.29)
20110
- FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.03.29)
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2
(
moved to 2007.03.29)
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.03.29)
19816
- FTS transfer - SRM problem on site FZK-LCG2
SOLVED
vo: atlas, lhcb
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 500 500 java.lang.reflect.InvocationTargetException: <stor>
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
CERN-GRIDKA__2007-03-19-0343_nRWaLl FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection.
CERN-GRIDKA__2007-03-19-0305_hJtB13 FINAL:ABORTED: Operation was aborted (the gridFTP transfer timed out).%
Reason: Most of the mentioned problems seem to be related to gridftp problems. The gridftp doors have many connections in state CLOSE_WAIT and they have to restart these doors every now and then. The problems are known and the dCache developers are working on it
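A simple way to watch for the door symptom described above (many control connections stuck in CLOSE_WAIT) is to count such sockets directly on the door host. The sketch below assumes a Linux host, reads /proc/net/tcp, and uses the conventional GridFTP control port 2811; it is only an illustration, not the site's monitoring:

# count_close_wait.py -- count TCP connections in CLOSE_WAIT on the GridFTP
# control port by reading /proc/net/tcp (Linux only). Port 2811 is the
# conventional GridFTP control port; adjust if the doors listen elsewhere.
CLOSE_WAIT = '08'   # TCP state code used in /proc/net/tcp
GRIDFTP_PORT = 2811

def close_wait_count(port=GRIDFTP_PORT, table='/proc/net/tcp'):
    count = 0
    with open(table) as fh:
        next(fh)  # skip the header line
        for line in fh:
            fields = line.split()
            local_port = int(fields[1].split(':')[1], 16)
            if local_port == port and fields[3] == CLOSE_WAIT:
                count += 1
    return count

if __name__ == '__main__':
    print('CLOSE_WAIT connections on port', GRIDFTP_PORT, ':', close_wait_count())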
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
(
moved to 2007.03.29)
19762
- FTS transfer - SRM problem on site BNL-LCG2
(
moved to 2007.03.29)
27 March 2007
- 2 tickets opened, 7 moved and 1 solved
- RAL: RAL-LCG2 CE lcgce01.gridpp.rl.ac.uk will be down for maintenance from 11am until 12 noon. The LFC at RAL (lfc.gridpp.rl.ac.uk) is currently off line, as an urgent maintenance procedure was needed (12:30 UTC).
- BNL: dCache SRM service will be shut down for 30 minutes. This procedure will be done sometime between 3pm and 5pm. The server that supplies this service will be moved to a different rack
- TRIUMF: Will be down 11:00-12:00 PDT for an urgent dCache upgrade to patch level 1.7.0-33 (fixes issues with potential file corruption). Affected service: SRM endpoint at srm.triumf.ca
- Transfer ranging from 380 to 820 Mb/s, averaging around 500MB/s.
- The most active sites are IN2PCC, CNAF, FNAL and PIC
- Mostly traffic from CMS
20160
- FTS transfer - SRM problem on site SARA-MATRIX (
moved to 2007.03.28)
20157
- FTS transfer - SRM problem on site
RAL-LCG2
(
moved to 2007.03.28)
20110
- FTS transfer - SRM problem on site INFN-T1
(
moved to 2007.03.28)
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2
(
moved to 2007.03.28)
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.03.28)
19816
- FTS transfer - SRM problem on site FZK-LCG2
(
moved to 2007.03.28)
19815
- FTS transfer - SRM problem on site INFN-T1
SOLVED
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Server - CGSI-gSOAP:
Could not find mapping for: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=mlassnig/CN=663551/CN=Mario Lassnig ;
also failing to do 'advisoryDelete' on target.
Reason: It was caused by the fact that there were too many SRM processes on the SRM servers. Fixed after an SRM restart.
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
(
moved to 2007.03.28)
19762
- FTS transfer - SRM problem on site BNL-LCG2
(
moved to 2007.03.28)
26 March 2007
- 2 tickets opened, 9 moved and 3 solved
- IN2P3: Scheduled downtime from March 26th at 8h00 to March 27th at 8h00. They are replacing the storage toolkit area hardware to increase capacity.
- IN2P3: The SRM/dCache system will be shutdown at 2PM CET today until 5PM. This is an urgent intervention to apply an important fix.
- SARA: From 9:30 to 10:30 there will be a scheduled downtime to upgrade dcache 1.7.0 to patch level 33.
- Transfer ranging from 50 to 1050 Mb/s, averaging around 600MB/s.
- The most active sites are FNAL, IN2PCC, CNAF and PIC
- Mostly traffic from CMS
20110
- FTS transfer - SRM problem on site INFN-T1(
moved to 2007.03.27)
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2
(
moved to 2007.03.27)
19819
- FTS transfer - SRM problem on site CERN-PROD
(
moved to 2007.03.23)
19816
- FTS transfer - SRM problem on site FZK-LCG2 (moved to 2007.03.27)
19815
- FTS transfer - SRM problem on site INFN-T1 (moved to 2007.03.27)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.27)
19762
- FTS transfer - SRM problem on site BNL-LCG2 (moved to 2007.03.27)
20096
- Transfer problems between sites DESY-HH and CERN-PROD
SOLVED
vo: cms
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 451 451 Local resource failure: malloc: Cannot allocate memory
FINAL:ABORTED: Operation was aborted (the gridFTP transfer timed out)
Reason: All write pools went full over the weekend. Some files that are no longer needed have been removed and space became available. Additional hardware with disk space is available and will be put into operation soon.
19818
- FTS transfer - SRM problem on site SARA-MATRIX
SOLVED
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception: Pool manager error: Best pool <bee10_1> too high : 2.0E8
Reason: We have seen a lot of problems with dcache last week. We think we have found the cause for this. The last error is due to the fact that the ATLAS disk-only disk pools are full at the moment.
19705
- Transfer problems between sites RAL-LCG2 and CERN-PROD
SOLVED
vo: atlas
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception:
Pool manager error: No write pool available for <atlas:atlas@osm>
Reason: Problem at the RAL end of the channel
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
EXPIRED
vo: cms
FNAL-CERN__2007-03-06-0330_UKQgUf FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get%
FNAL-CERN__2007-03-06-0322_6V86zM FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !%
23 March 2007
- 1 ticket opened, 9 moved and 1 solved
- Transfer ranging from 700 to 1300 Mb/s, averaging around 1050MB/s.
- The most active sites are FNAL, IN2PCC, FZK
- Mostly traffic from Atlas and CMS
20022
- Transfer problems between sites IN2P3-CC and CERN-PROD
SOLVED
vo: atlas, cms, lhcb
FINAL:TRANSFER: Getting filesize failed. the server sent an error response: 530 530 Authorization Service failed: diskCacheV111.services.authorization.AuthorizationServiceException: authRequestID 1821291682 recevied exception null
Reason: (IN2P3) They had a wrong dCache version on one of their gridftp servers. (The gridftp server has been reinstalled properly.)
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put; also failing to do 'advisoryDelete' on target
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception: Pool manager error: Best pool <dpool04_1> too high : 1.4E8
FINAL:TRANSFER: Getting filesize at destination failed twice!the server sent an error response: 553 553 Permission denied, reason: CacheException(rc=10006;msg=Pnfs request timed out)
19819
- FTS transfer - SRM problem on site CERN-PROD
vo: *
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://srm-durable-atlas.cern.ch:8443/srm/managerv1 ; id=689505695 call. Error is CastorStagerInterface.c:2457
19818
- FTS transfer - SRM problem on site SARA-MATRIX
SOLVED
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception: Pool manager error: Best pool <bee10_1> too high : 2.0E8
Reason: We have seen a lot of problems with dcache last week. We think we have found the cause for this. The last error is due to the fact that the ATLAS disk-only disk pools are full at the moment.
19816
- FTS transfer - SRM problem on site FZK-LCG2
vo: atlas, lhcb
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 500 500 java.lang.reflect.InvocationTargetException: <stor>
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
CERN-GRIDKA__2007-03-19-0343_nRWaLl FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection.
CERN-GRIDKA__2007-03-19-0305_hJtB13 FINAL:ABORTED: Operation was aborted (the gridFTP transfer timed out).%
19815
- FTS transfer - SRM problem on site INFN-T1
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Server - CGSI-gSOAP:
Could not find mapping for: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=mlassnig/CN=663551/CN=Mario Lassnig ;
also failing to do 'advisoryDelete' on target.
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
vo: cms
FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres] ;
id=[id] call. Error is RequestFileStatus#[id] failed with error:[ [date] state Failed : file not found : path [path] not found]
19762
- FTS transfer - SRM problem on site BNL-LCG2
vo: atlas
CERN-BNL__2007-03-15-1506_4JXMba:FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target
BNL-CERN__2007-03-15-1535_Vt17TR:FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://dcsrm.usatlas.bnl.gov:8443/srm/managerv1 ; id=-2146369314 call. Error is
RequestFileStatus#-2146369313 failed with error:[ at Thu Mar 15 11:35:24 EDT 2007 state Failed : No Route to cell for packet {uoid=<1173972924515:14572>;path=[>PinManager@local];msg=Tunnel cell >PinManager@local< not found at >dCacheDomain<}
19705
- Transfer problems between sites RAL-LCG2 and CERN-PROD
vo: atlas
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception:
Pool manager error: No write pool available for <atlas:atlas@osm>
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
vo: cms
FNAL-CERN__2007-03-06-0330_UKQgUf FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get%
FNAL-CERN__2007-03-06-0322_6V86zM FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !%
22 March 2007
- 1 ticket opened, 10 moved and 2 solved
- TRIUMF: Will be unavailable 07:00 to 15:00 PDT. Several changes will occur during this downtime.
- FZK: A rearrangement of the database storage of the dCache environment needs to be addressed immediately, i.e. today 8:00-10:00 UTC
- SARA: dcache restart 10:00-11:00 UTC.
- Transfer ranging from 20 to 1050 Mb/s, averaging around 500MB/s.
- The most active sites are FNAL, IN2PCC, FZK, BNL, CNAF, NDGF, TRIUMF and ASGG
- Mostly traffic from Atlas and CMS
- GridView problems: no transfers since 10 a.m.?
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2 (moved to 2007.03.23)
19819
- FTS transfer - SRM problem on site CERN-PROD (moved to 2007.03.23)
19818
- FTS transfer - SRM problem on site SARA-MATRIX (moved to 2007.03.23)
19816
- FTS transfer - SRM problem on site FZK-LCG2 (moved to 2007.03.23)
19815
- FTS transfer - SRM problem on site INFN-T1 (moved to 2007.03.23)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.23)
19762
- FTS transfer - SRM problem on site BNL-LCG2 (moved to 2007.03.23)
19705
- Transfer problems between sites RAL-LCG2 and CERN-PROD (moved to 2007.03.23)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.23)
19973
- FTS transfer - SRM problem on site NDGF-T1
SOLVED
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres]; id=[id] call. Error is RequestFileStatus#[id] failed with error:[ path does not exist and user has no permissions to create it] ; also failing to do |advisoryDelete| on target.%
Reason: Endpoint directories had been created with improper permissions. Now corrected.
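A check along the following lines could have caught the bad endpoint-directory permissions before transfers started failing. It is only a sketch: the pnfs-style mount path and the expected VO group are assumptions, not the NDGF layout.

import grp
import os
import stat

# Illustrative sketch: verify that an SRM endpoint directory exists and is
# writable by the VO's group. Path and group name are placeholders.
def check_vo_dir(path, expected_group="atlas"):
    """Print group ownership and group-write permission of an endpoint directory."""
    st = os.stat(path)
    group = grp.getgrgid(st.st_gid).gr_name
    group_writable = bool(st.st_mode & stat.S_IWGRP)
    print("%s: group=%s group_writable=%s" % (path, group, group_writable))
    if group != expected_group or not group_writable:
        print("  -> permissions look wrong for VO writes")

check_vo_dir("/pnfs/example.org/data/atlas")  # placeholder path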
19863
- FTS transfer - SRM problem on site TAIWAN-LCG2
SOLVED
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
Reason: this arises from the massive data cleaning of the CMS CSA06 dataset, which causes severe load on the stager server. We are currently migrating the CMS dataset to the new CASTOR v2 fabric, and ATLAS will follow. So far the timeouts arise from the limitations of the old CASTOR v1, which is why we are hurrying to migrate the management system from CASTOR 1 to CASTOR 2.
21 March 2007
- 0 tickets opened, 10 moved and 0 solved
- Transfer ranging from 20 to 500 Mb/s, averaging around 120MB/s.
- The most active sites are BNL, CNAF, IN2PCC, TRIUMF, SARA and PIC
- Mostly traffic from Atlas
19863
- FTS transfer - SRM problem on site TAIWAN-LCG2 (moved to 2007.03.22)
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2 (moved to 2007.03.22)
19819
- FTS transfer - SRM problem on site CERN-PROD (moved to 2007.03.22)
19818
- FTS transfer - SRM problem on site SARA-MATRIX (moved to 2007.03.22)
19816
- FTS transfer - SRM problem on site FZK-LCG2 (moved to 2007.03.22)
19815
- FTS transfer - SRM problem on site INFN-T1 (moved to 2007.03.22)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.22)
19762
- FTS transfer - SRM problem on site BNL-LCG2 (moved to 2007.03.22)
19705
- Transfer problems between sites RAL-LCG2 and CERN-PROD (moved to 2007.03.22)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.22)
20 March 2007
- 2 tickets opened, 8 moved and 0 solved
- RAL & BNL: between 8:00 and 8:30 AM CET the connections from CERN to RAL and BNL will be interrupted for 5 minutes to allow the replacement of a module in one of CERN's LHCOPN routers
- Transfer ranging from 430 to 800 Mb/s, averaging around 500MB/s.
- The most active sites are CNAF, FNAL, RAL, BNL, ASGG and FZK
- Mostly traffic from CMS and Atlas
19863
- FTS transfer - SRM problem on site TAIWAN-LCG2 (moved to 2007.03.21)
19861
- FTS transfer - SRM problem on site TRIUMF-LCG2 (moved to 2007.03.21)
19819
- FTS transfer - SRM problem on site CERN-PROD (moved to 2007.03.21)
19818
- FTS transfer - SRM problem on site SARA-MATRIX (moved to 2007.03.21)
19816
- FTS transfer - SRM problem on site FZK-LCG2 (moved to 2007.03.21)
19815
- FTS transfer - SRM problem on site INFN-T1 (moved to 2007.03.21)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.21)
19762
- FTS transfer - SRM problem on site BNL-LCG2 (moved to 2007.03.21)
19705
- Transfer problems between sites RAL-LCG2 and CERN-PROD (moved to 2007.03.21)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.21)
19 March 2007
- 4 tickets opened, 4 moved and 0 solved
- CNAF: lfcserver.cnaf.infn.it temporarily down because of a long upgrade procedure on the DB.
- Transfer ranging from 170 to 920 Mb/s, averaging around 530MB/s.
- The most active sites are CNAF, BNL, RAL, FNAL, IN2PCC and FZK
- Mostly traffic from CMS and Atlas
19819
- FTS transfer - SRM problem on site CERN-PROD (moved to 2007.03.20)
19818
- FTS transfer - SRM problem on site SARA-MATRIX (moved to 2007.03.20)
19816
- FTS transfer - SRM problem on site FZK-LCG2 (moved to 2007.03.20)
19815
- FTS transfer - SRM problem on site INFN-T1 (moved to 2007.03.20)
19814
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.20)
19762
- FTS transfer - SRM problem on site BNL-LCG2 (moved to 2007.03.20)
19705
- Transfer problems between sites RAL-LCG2 and CERN-PROD (moved to 2007.03.20)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 2007.03.20)
16 March 2007
- 2 tickets opened, 4 moved and 1 solved
- PIC: Yesterday evening the PostgreSQL DB hosting the PNFS of the dCache system at PIC broke down. This affected the SRM-disk service, which had been failing for the whole night. They are now recovering the DB. Until this is finished the SRM-disk service will be down.
- Transfer ranging from 380 to 900 Mb/s, averaging around 630MB/s.
- The most active sites are CNAF, BNL, ASGS, RAL, FNAL, IN2PCC and FZK
- Mostly traffic from CMS and Atlas
19410
- FTS transfer - SRM problem on site IN2P3-CC
SOLVED
vo: atlas
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port:
java.lang.Exception: Pool manager error: Best pool <pool-disk-sc3-11> too high : 2.0E8;
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 530 530 User Authorization record failed to be retrieved: diskCacheV111.services.authorization.AuthorizationServiceException:
Exception thrown by diskCacheV111.services.authorization.KPWDAuthorizationPlugin: java.io.FileNotFoundException: /opt/d-cache/etc/dcache.kpwd (No such file or directory)
Reason: 1. Best pool <pool-disk-sc3-11> too high: means the pools are full; we are currently cleaning them.
2. AuthorizationServiceException: there was a configuration error on 2 of our GFTP servers. This is fixed now.
19762
- FTS transfer - SRM problem on site BNL-LCG2
vo: atlas
CERN-BNL__2007-03-15-1506_4JXMba:FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target
BNL-CERN__2007-03-15-1535_Vt17TR:FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on httpg://dcsrm.usatlas.bnl.gov:8443/srm/managerv1 ; id=-2146369314 call. Error is
RequestFileStatus#-2146369313 failed with error:[ at Thu Mar 15 11:35:24 EDT 2007 state Failed : No Route to cell for packet {uoid=<1173972924515:14572>;path=[>PinManager@local];msg=Tunnel cell >PinManager@local< not found at >dCacheDomain<}
19706
- Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD
vo: cms
FINAL:TRANSFER: Getting filesize failed. the server sent an error response: 535 535 Authentication failed: GSSException: Failure unspecified at GSS-API level [Caused by: Unknown CA]
19705
- Transfer problems between sites RAL-LCG2 and CERN-PROD
vo: atlas
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception:
Pool manager error: No write pool available for <atlas:atlas@osm>
19313
- FTS transfer - SRM problem on site CERN-PROD
vo: *
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
Failed SRM put on [address]; Error is CastorStagerInterface.c:2457 Device or resource busy
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] call. Error is CastorStagerInterface.c:2438 BAD ERROR NUMBER: 0 (errno=0, serrno=1015)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
vo: cms
FNAL-CERN__2007-03-06-0330_UKQgUf FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get%
FNAL-CERN__2007-03-06-0322_6V86zM FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !%
15 March 2007
- 3 tickets opened, 2 moved and 1 solved
- SARA-MATRIX: Had some problems with the Oracle database and dCache (the Oracle LFC had been shut down and the entire service restarted; srm.grid.sara.nl will be in maintenance until 11:00 CET)
- IN2P3-CC: site will be unreachable from 8:00 to 14:00, due to network operations
- Transfer ranging from 200 to 1025 Mb/s, averaging around 630MB/s.
- The most active sites are CNAF, FZK, BNL, RAL, IN2P3 and ASGS
- Mostly traffic from CMS and Atlas
19706
- Transfer problems between sites USCMS-FNAL-WC1 and CERN-PROD (moved to 16.03.2007)
19705
- Transfer problems between sites RAL-LCG2 and CERN-PROD (moved to 16.03.2007)
19313
- FTS transfer - SRM problem on site CERN-PROD (moved to 16.03.2007)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 16.03.2007)
19674
- FTS transfer - SRM problem on site RAL-LCG2
SOLVED
vo: atlas
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://ralsrma.rl.ac.uk:8443/srm/managerv1 ;
id=873661926 call, no TURL retrieved for srm://ralsrma.rl.ac.uk/castor/ads.rl.ac.uk/test/grid/hep/disk0tape1/dteam/mph45/test/2007/03/14/LQbOB1jkjf/03:01:36
FINAL:TRANSFER: Destination and source file sizes don't match!!
14 March 2007
- 2 tickets opened, 3 moved and 3 solved
- SARA-MATRIX: was experiencing problems with SRM around 11 a.m.
- Transfer ranging from 180 to 1150 Mb/s, averaging around 580MB/s.
- The most active sites are CNAF, FNAL, BNL, ASGS and IN2PCC
- Mostly traffic from CMS and Atlas
19313
- FTS transfer - SRM problem on site CERN-PROD (moved to 15.03.2007)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 15.03.2007)
19674
- FTS transfer - SRM problem on site RAL-LCG2
SOLVED
vo: atlas, lhcb
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://dcache.gridpp.rl.ac.uk:8443/srm/managerv1 ;
id=-2147214665 call. Error is RequestFileStatus#-2147214664 failed with error:
[ at Wed Mar 14 06:26:11 GMT 2007 state Failed : can not prepare to put : org.dcache.srm.scheduler.FatalJobFailure: transfer protocols not supported]; also failing to do 'advisoryDelete' on target
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://ralsrma.rl.ac.uk:8443/srm/managerv1 ;
id=873661926 call, no TURL retrieved for srm://ralsrma.rl.ac.uk/castor/ads.rl.ac.uk/test/grid/hep/disk0tape1/dteam/mph45/test/2007/03/14/LQbOB1jkjf/03:01:36
FINAL:TRANSFER: Destination and source file sizes don't match!!
Reason: the "transfer protocols not supported" error is due to gridFTP doors falling over. They have been restarted
19314
- FTS transfer - SRM problem on site INFN-T1
SOLVED
vo: atlas
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
(added at 06.03.2007)
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://castorsrm.cr.cnaf.infn.it:8443/srm/managerv1 ;
id=833422608 call. Error is CastorStagerInterface.c:728 sURL_to_path(srm://castorsrm.cr.cnaf.infn.it) failed: Success (errno=0, serrno=0)
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://castorsrm.cr.cnaf.infn.it:8443/srm/managerv1 ;
id=831325375 call. Error is CastorStagerInterface.c:2457 Device or resource busy (errno=0, serrno=0)
Reason: this was due to a problem with the rmmaster daemon of CASTOR2. Unfortunately it will happen again in the future, until a known CASTOR limitation is removed by the developers.
19672
- FTS transfer - SRM problem on site NDGF-T1
SOLVED
vo: atlas
CERN-NDGF__2007-03-14-0616_swhtw1 FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.%
CERN-NDGF__2007-03-14-0637_uNstD4 FINAL:SRM_DEST: Failed on SRM put: Failed To Put SURL. Error in srm__put: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.%
CERN-NDGF__2007-03-14-0313_N3IKiR FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 530 530 Authorization Service failed: diskCacheV111.services.authorization.AuthorizationServiceException: authRequestID 1699305154 Message to gPlazma timed out for authentification of /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Adrian Taga
Reason: Temporary problems with the SRM interface - should be OK now
13 March 2007
Daily Report:
- 0 tickets opened, 3 moved and 0 solved
- Transfer ranging from 50 to 1200 Mb/s, averaging around 630MB/s.
- The most active sites are CNAF, BNL, FZK, FNAL, ASGS and RAL
- Mostly traffic from CMS and Atlas
- There was a broken 10G module on one of the two LCG routers connecting to the Tier1s. Fixed from 7:30 a.m.
19313
- FTS transfer - SRM problem on site CERN-PROD (moved to 14.03.2007)
19314
- FTS transfer - SRM problem on site INFN-T1 (moved to 14.03.2007)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 14.03.2007)
12 March 2007
Daily Report:
- 0 tickets opened, 3 moved and 0 solved
- Transfer ranging from 16 to 50 Mb/s, averaging around 25MB/s.
- The most active sites are RAL, IN2P3
- Mostly traffic from CMS
- Connection problems on the T1 sites.
19313
- FTS transfer - SRM problem on site CERN-PROD (moved to 13.03.2007)
19314
- FTS transfer - SRM problem on site INFN-T1 (moved to 13.03.2007)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 13.03.2007)
9 March 2007
Daily Report:
- 0 tickets opened, 3 moved and 0 solved
- Transfer ranging from 170 to 420 Mb/s, averaging around 260MB/s per day.
- The most active sites are FNAL, CNAF, RAL.
- Mostly traffic from Atlas and CMS
19313
- FTS transfer - SRM problem on site CERN-PROD
vo: *
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
Failed SRM put on [address]; Error is CastorStagerInterface.c:2457 Device or resource busy
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] call. Error is CastorStagerInterface.c:2438 BAD ERROR NUMBER: 0 (errno=0, serrno=1015)
19314
- FTS transfer - SRM problem on site INFN-T1
vo: atlas
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
(added at 06.03.2007)
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://castorsrm.cr.cnaf.infn.it:8443/srm/managerv1 ;
id=833422608 call. Error is CastorStagerInterface.c:728 sURL_to_path(srm://castorsrm.cr.cnaf.infn.it) failed: Success (errno=0, serrno=0)
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://castorsrm.cr.cnaf.infn.it:8443/srm/managerv1 ;
id=831325375 call. Error is CastorStagerInterface.c:2457 Device or resource busy (errno=0, serrno=0)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1
vo: cms
FNAL-CERN__2007-03-06-0330_UKQgUf FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get%
FNAL-CERN__2007-03-06-0322_6V86zM FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !%
8 March 2007
Daily Report:
- 1 ticket opened, 6 moved and 4 solved
- Transfer ranging from 40 to 550 Mb/s, averaging around 150MB/s per day.
- The most active sites are FNAL, CNAF, IN2PCC.
- Mostly traffic from Atlas and CMS.
19313
- FTS transfer - SRM problem on site CERN-PROD (moved to 9.03.2007)
19314
- FTS transfer - SRM problem on site INFN-T1 (moved to 9.03.2007)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 9.03.2007)
19353
- FTS transfer - SRM problem on site FZK-LCG2
SOLVED
vo: atlas, cms, lhcb
CERN-GRIDKA__2007-03-06-0437_y3fRTr FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
CERN-GRIDKA__2007-03-06-0642_ZatGja FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.
GRIDKA-CERN__2007-03-06-0501_6rjUCX FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !
GRIDKA-CERN__2007-03-06-0522_ZLPked FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success
Reason: At the moment they have a very unstable SRM. SRM is dying after some hours and they have to restart it. dCache support knows about the problem and is working on it.
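Given that this SRM "dies after some hours" and has to be restarted, a trivial external probe of the SRM port would at least shorten the time to detection. A minimal sketch; the host name is a placeholder and port 8443 (the usual httpg/SRM port) is an assumption.

import socket

# Illustrative sketch: probe the SRM port so a dead SRM is noticed quickly.
SRM_HOST = "srm.example-site.de"  # placeholder, not the real endpoint
SRM_PORT = 8443                   # assumed standard SRM/httpg port

def srm_port_open(host, port, timeout=10):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout)
        sock.close()
        return True
    except (socket.error, socket.timeout):
        return False

if not srm_port_open(SRM_HOST, SRM_PORT):
    print("SRM endpoint %s:%d is not answering - a restart may be needed" % (SRM_HOST, SRM_PORT))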
19088
- FTS transfer - problem on BNL SRM
SOLVED
site:BNL-LCG2
vo:atlas
Reason: BNL has been in downtime since 23rd February; BNL-LCG2 will be in downtime until 16th March 2007
19415
- FTS transfer - Transfer and SRM problems on site PIC
SOLVED
vo: atlas
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response:
425 425 Cannot open port: java.lang.Exception: Pool manager error: Best pool <dc002_2> too high : 2.0E8
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.
Reason: Currently ATLAS is running out of disk space. This is a known error in dCache when space is not available. We have just received new disks and hope these will be added to the pools soon. The problem is already known in ATLAS; transfers to PIC will keep failing until more disk is deployed, and ATLAS is aware of this situation.
19402
- FTS transfer - Transfer problems between sites CERN-PROD and RAL-LCG2
SOLVED
vo: atlas, lhcb
FINAL:ABORTED: Operation was aborted (the gridFTP transfer timed out)
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 530 530 Authorization Service failed: diskCacheV111.services.authorization.AuthorizationServiceException: authRequestID 1725349954 recevied exception null
Reason: This was a problem with dCache at RAL: the gPlazma authentication service was located on the same host as a gridftp door, and their automatic monitoring stops the dCache services if it detects an out-of-memory error - stopping the authentication service. The authentication service has now been moved to a different host where there is less chance of the out-of-memory error occurring.
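The failure mode described above (a host-wide monitor stopping every dCache service, gPlazma included, when one gridftp door hits an OutOfMemoryError) suggests a more targeted watchdog. The sketch below is purely illustrative: the log path and init script name are assumptions, not RAL's actual setup.

import subprocess

# Illustrative sketch: restart only the gridftp door whose log shows a Java
# OutOfMemoryError, instead of stopping all dCache services on the host.
# Log path and init script are placeholders, not a real configuration.
DOOR_LOG = "/var/log/dcache/gridftp-door.log"                 # assumed path
RESTART_CMD = ["/etc/init.d/dcache-gridftp-door", "restart"]  # assumed script

def door_hit_oom(logfile=DOOR_LOG):
    """Return True if the door log contains a Java OutOfMemoryError."""
    try:
        with open(logfile) as fh:
            return any("java.lang.OutOfMemoryError" in line for line in fh)
    except IOError:
        return False

if door_hit_oom():
    # restart just the door; gPlazma and the rest of dCache stay up
    subprocess.call(RESTART_CMD)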
7 March 2007
Daily Report:
- 3 tickets opened, 6 moved and 3 solved
- Transfer ranging from 50 to 680 Mb/s, averaging around 400MB/s per day.
- The most active sites are FNAL, CNAF, IN2PCC, FZK, PIC.
- Mostly traffic from Atlas and CMS.
19415
- FTS transfer - Transfer and SRM problems on site PIC (moved to 8.03.2007)
19402
- FTS transfer - Transfer problems between sites CERN-PROD and RAL-LCG2 (moved to 8.03.2007)
19088
- FTS transfer - problem on BNL SRM
IN PROGRESS (moved to 8.03.2007)
19314
- FTS transfer - SRM problem on site INFN-T1 (moved to 8.03.2007)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved to 8.03.2007)
19353
- FTS transfer - SRM problem on site FZK-LCG2 (moved to 8.03.2007)
19410
- FTS transfer - SRM problem on site IN2P3-CC
SOLVED
vo: atlas, cms, alice
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DEST: Failed on SRM put: Failed To Put SURL. Error in srm__put:
SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 553 553 /pnfs/in2p3.fr/data/atlas/tape/sc4/multi_vo_tests/None/sc4tier0/03/07/T0.A.run007504.RAW._lumi0006._0001._sfo05: Cannot create file: CacheException(rc=10006;msg=Pnfs request timed out)
Failed on SRM get: SRM getRequestStatus timed out on get
Reason: These messages ("pnfs timeout") come from our metadata server, which is often overloaded. We plan to upgrade the hardware on 20th March; until this date we will do our best to avoid such errors. We are aware of this problem and are currently trying to understand what happens. I will update this ticket as soon as we have some news.
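To quantify how often the overloaded PNFS server actually hits clients, one could count the "Pnfs request timed out" occurrences per hour in the door or billing logs. A rough sketch only; the log path and the timestamp layout are assumptions.

from collections import Counter

# Illustrative sketch: count "Pnfs request timed out" errors per hour in a
# dCache log. The log path and the position of the timestamp are assumptions.
LOGFILE = "/var/log/dcache/door.log"  # placeholder path

def timeouts_per_hour(logfile=LOGFILE):
    counts = Counter()
    with open(logfile) as fh:
        for line in fh:
            if "Pnfs request timed out" in line:
                # assume the line starts with "MM/DD HH:MM:SS"; keep date + hour
                counts[line[:8]] += 1
    return counts

for hour, n in sorted(timeouts_per_hour().items()):
    print("%s -> %d timeouts" % (hour, n))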
19306
- FTS transfer - SRM problem on site PIC
SOLVED
vo:atlas, lhcb, cms
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on Error is RequestFileStatus-2146981398 failed with error:
[ date] state Failed : GetStorageInfoFailed : file exists, cannot write
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping:
SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.
(added at 06.03.2007)
FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:
Client - CGSI-gSOAP: Error reading token data: Connection reset by peer
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:
Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target
Reason: The srm-disk.pic.es service had problems during the last weekend, from 2nd March at around 13h till 5th March at around 11h. The origin of the problem was that the certificate of one of the dcache pool nodes had expired, and the procedure to take it out of the pool group failed
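Since this PIC outage traced back to an expired certificate on one pool node, a routine expiry check would flag such cases early. A minimal sketch, assuming the conventional grid certificate location and a 14-day warning window.

import subprocess

# Illustrative sketch: warn when a grid host certificate is close to expiry.
# The certificate path and warning window are assumptions.
CERT = "/etc/grid-security/hostcert.pem"  # conventional location, assumed
WARN_DAYS = 14

def still_valid_for(days, cert=CERT):
    """True if the certificate does not expire within the given number of days."""
    # `openssl x509 -checkend N` exits 0 only if the cert is still valid in N seconds
    rc = subprocess.call(
        ["openssl", "x509", "-checkend", str(days * 86400), "-noout", "-in", cert]
    )
    return rc == 0

if not still_valid_for(WARN_DAYS):
    print("host certificate expires within %d days - renew it" % WARN_DAYS)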
19308
- FTS transfer - SRM problem on site TAIWAN-LCG2
SOLVED
vo:atlas
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://castorsc.grid.sinica.edu.tw:8443/srm/managerv1 ;
id=842663298 call. Error is CastorStagerInterface.c:2578 Device or resource busy (errno=0, serrno=16)
6 March 2007
Daily Report:
- 2 tickets opened, 5 moved and 1 solved
- Transfer ranging from 170 to 620 Mb/s, averaging around 380MB/s per day.
- The most active sites are FNAL, FZK, IN2PCC.
- Mostly traffic from CMS, Atlas and DTeam.
19088
- FTS transfer - problem on BNL SRM
IN PROGRESS (moved on 07.03.2007)
19314
- FTS transfer - SRM problem on site INFN-T1 (moved on 07.03.2007)
19358
- FTS transfer - SRM problem on site USCMS-FNAL-WC1 (moved on 07.03.2007)
19306
- FTS transfer - SRM problem on site PIC (moved on 07.03.2007)
19308
- FTS transfer - SRM problem on site TAIWAN-LCG2 (moved on 07.03.2007)
19353
- FTS transfer - SRM problem on site FZK-LCG2
UNSOLVED (moved on 07.03.2007)
19313
- FTS transfer - SRM problem on site CERN-PROD
SOLVED
vo: *
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] ; id=828835243 call, no TURL retrieved for [addres]
Reason: The Castoratlas MSS backend to the srm-durable-at.cern.ch endpoint was degraded yesterday. This explains SRM failures
5 March 2007
Daily Report:
- 6 tickets opened, 1 moved from last week and 2 solved
- Transfer ranging from 70 to 450 Mb/s, averaging around 200MB/s per day.
- The most active sites are FNAL, CNAF, IN2PCC.
- Mostly traffic from Atlas, CMS and DTeam.
19088 - FTS transfer - problem on BNL SRM
IN PROGRESS (moved on 06.03.2007)
19314
- FTS transfer - SRM problem on site INFN-T1 (moved on 06.03.2007)
19313
- FTS transfer - SRM problem on site CERN-PROD (moved to 06.03.2007)
19308
- FTS transfer - SRM problem on site TAIWAN-LCG2 (moved to 06.03.2007)
19306
- FTS transfer - SRM problem on site PIC (moved to 06.03.2007)
19310
- FTS transfer - SRM problem on site RAL-LCG2
SOLVED
vo:atlas, lhcb
FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Server.userException - java.rmi.RemoteException: SRM Authorization failed; nested exception is:
org.dcache.srm.SRMAuthorizationException: diskCacheV111.services.authorization.AuthorizationServiceException: authRequestID 600032704No Route to cell for packet {uoid=<1173067075710:4469098>;path=[>gPlazma@local];msg=Tunnel cell >gPlazma@local< not found at >dCacheDomain<}
Failed on SRM get: SRM getRequestStatus timed out on get
FINAL:TRANSFER: Transfer failed. ERROR globus_ftp_control_connect: globus_libc_gethostbyname_r failed
Reason: the first one is due to the upgrade of the dCache SEs last Thursday; the new gPlazma service is currently running on a system which is also running a gridftp door. This gridftp door ran out of memory and was shut down automatically by a monitoring script - unfortunately this also shut down the gPlazma service, leading to the errors. The services have been restarted and they will be moving the gPlazma service to another system.
The second error appears to be a problem at CERN.
The third error is due to all their gridftp doors failing over the weekend due to running out of memory.
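The globus_libc_gethostbyname_r failure in the second/third class of errors is a plain name-resolution problem, so the first check is whether the storage hostnames resolve from the FTS host. A small sketch; the hostnames are taken from elsewhere in this log and are used only as examples.

import socket

# Illustrative sketch: verify that storage endpoints resolve in DNS.
# The hostnames are examples from this log, not a definitive list.
HOSTS = ["ralsrma.rl.ac.uk", "srm-durable-atlas.cern.ch"]

for host in HOSTS:
    try:
        print("%s -> %s" % (host, socket.gethostbyname(host)))
    except socket.gaierror as exc:
        print("%s: resolution failed (%s)" % (host, exc))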
19304
- FTS transfer - SRM problem on site IN2P3-CC
SOLVED
vo:atlas, lhcb
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 530 530 User Authorization record failed to be retrieved: diskCacheV111.services.authorization.AuthorizationServiceException:
Exception thrown by diskCacheV111.services.authorization.KPWDAuthorizationPlugin: java.io.FileNotFoundException: /opt/d-cache/etc/dcache.kpwd (No such file or directory)
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 530 530 Authorization Service failed: diskCacheV111.services.authorization.AuthorizationServiceException: authRequestID 234405366No Route to cell for packet {uoid=<1173070963721:408604>;path=[>gPlazma@local];msg=Tunnel cell >gPlazma@local< not found at >dCacheDomain<}
2 March 2007
Daily Report:
- Transfer ranging from 70 to 250 Mb/s, averaging around 130MB/s per day.
- The most active sites are BNL, FZK, SARA, CNAF.
- Mostly traffic from Atlas and DTeam.
- Typical errors on sites are
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target
Failed SRM put on [address]; Error is CastorStagerInterface.c:[num] Device or resource busy.
19229
- FTS transfer - SRM problem on site INFN-T1
SOLVED
vo:atlas
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
Failed on SRM get: SRM getRequestStatus timed out on get
Reason: this was due to a problem with the rmmaster daemon of CASTOR2. Unfortunately it will happen again in the future, until a known CASTOR limitation is removed by the developers
19088 - problem on BNL SRM
IN PROGRESS
site:BNL-LCG2
vo:atlas
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP:
Could not open connection !; also failing to do 'advisoryDelete' on target.
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP:
Error reading token data: Success; also failing to do 'advisoryDelete' on target.
FINAL:SRM_DEST: Failed on SRM put: Failed To Put SURL. Error in srm__put: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do |advisoryDelete| on target.
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
Failed SRM put on httpg://dcsrm.usatlas.bnl.gov:8443/srm/managerv1 ; id=-2147108379 call. Error is
RequestFileStatus#-2147108378 failed with error:[ at Wed Feb 28 20:43:26 EST 2007 state Failed : GetStorageInfoFailed : file exists, cannot write]
FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection
FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer
Reason: BNL has been in downtime since 23rd February; BNL-LCG2 will be in downtime until 16th March 2007
19009 - problem on IN2P3 SRM
IN PROGRESS
site:IN2P3-CC
vo:atlas
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on httpg://ccsrm.in2p3.fr:8443/srm/managerv1 ; id=-2146541886 call. Error is
RequestFileStatus#-2146541885 failed with error:[ at Thu Mar 01 02:46:12 CET 2007 state Failed : GetStorageInfoFailed : file exists, cannot write]
Reason: they experience a bottleneck on the PNFS server and, for the moment, cannot upgrade the machine
19144 - problem on GRIDKA SRM
IN PROGRESS
site:FZK-LCG2
vo:atlas, cms, lhcb
FINAL:SRM_SOURCE: Failed on SRM get: Failed SRM get on [addres]. Error is RequestFileStatus#-[id] failed with error:[ at [date] state Failed : No Route to cell for packet {uoid=<[id]>;path=[>PinManager@local];msg=Tunnel cell >PinManager@local< not found at >dCacheDomain<}
FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection
FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Server.userException - java.rmi.RemoteException: SRM Authorization failed; nested exception is: org.dcache.srm.SRMAuthorizationException: diskCacheV111.services.authorization.AuthorizationServiceException: authRequestID [id] Message to gPlazma timed out for authentification of [info] and role null
FINAL:SRM_SOURCE: Failed on SRM get: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Connection reset by peer
FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success
FINAL:SRM_SOURCE: Failed on SRM get: SRM getRequestStatus timed out on get
FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection
Reason: The utility domain crashed because of an "Out of Memory" error
19157 - problems with PIC dCache
IN PROGRESS
site:PIC
vo:atlas
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Cannot open port: java.lang.Exception: Pool manager error: Best pool <dc002_2> too high : 2.0E8
Reason: Currently ATLAS is running out of disk space. They have just received new disks and hope these will be added to the pools soon.
1 March 2007
19088 - problem on BNL SRM (moved to 2.03.2007)
19089 - CERN-INFN SRM's problems:
site:INFN-T1
vo:atlas, cms
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.%
FINAL:ABORTED: Operation was aborted (the gridFTP transfer timed out)
19009 - problem on IN2P3 SRM (moved to 2.03.2007)
18955 - problems with CERN SRM:
site: *
vo: *
Failed on SRM get: SRM getRequestStatus timed out on get
FINAL:SRM_SOURCE: Failed on SRM get: Failed To Get SURL. Error in srm__get: service timeout.
Failed on SRM get: Failed SRM get on [address] no TURL retrieved for [address]
19086 - dCache problem on CERN-TRIUMF channel
SOLVED
site:TRIUMF-LCG2
vo:atlas
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] Error is RequestFileStatus#-[id] failed with error:[ at [date] state Failed : user has no permission to write into path [addres] ] ; also failing to do |advisoryDelete| on target.%
19144 - problem on GRIDKA SRM (moved to 2.03.2007)
19157 - problems with PIC dCache (moved to 2.03.2007)
28 February 2007
18955 - problems with CERN SRM: (moved to 01.03.2007)
19092 - problem on RAL SRM:
site:RAL-LCG2
vo:atlas
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP:
Could not open connection !; also failing to do 'advisoryDelete' on target
19089 - CERN-INFN SRM's problems (moved to 1.03.2007)
19088 - problem on BNL SRM (moved to 1.03.2007)
19009 - problem on IN2P3 SRM (moved to 01.03.2007)
19086 - dCache problem on CERN-TRIUMF channel (moved to 1.03.2007)
19093 - problems on PIC SRM:
SOLVED
site:PIC
vo:lhcb
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.
Reason: One of the LHCb disk pools in srm-disk was down last night because of a HW problem. They are solving this problem
19021 - grid-ftp problem on CERN-GRIDKA channel
SOLVED
site:CERN-PROD, FZK-LCG2
vo:atlas, lhcb
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 426 426 Data connection. data_write() failed: Handle not in the proper state
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 421 421 Timeout (900 seconds): closing control connection.
Reason: gridftp doors are unreliable. There is a new dCache patch out which they will install shortly
19014 - problem on ASCC SRM
SOLVED
site:TAIWAN-LCG2
vo:atlas
Failed SRM put on [address]; Error is CastorStagerInterface.c:2578 Device or resource busy
Reason: There have been problems with the SRM being sluggish, and note that dCache is down for upgrades today
27 February 2007
19019 - problem on TRIUMF SRM
site:TRIUMF-LCG2
vo:atlas
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] Error is RequestFileStatus#-[id] failed with error:[ at Mon Feb 26 09:53:29 EST 2007 state Failed : user has no permission to write into path [addres] ] ; also failing to do |advisoryDelete| on target
19014 - problem on ASCC SRM (moved to 28.02.2007)
19004 - problem on CERN SRM and INFN SRM
SOLVED
site:INFN-T1, CERN-PROD
vo:cms
FINAL:TRANSFER: Transfer failed. ERROR the server sent an error response: 425 425 Can't open data connection. timed out() failed
Reason: (Responsible Unit: ROC_Italy) it was caused by a failure of the garbage collector on the dteam disk servers. It is now fixed.
19009 - problem on IN2P3 SRM
SOLVED
site:IN2P3-CC
vo:atlas
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
Reason: Yes, we have some difficulties maintaining the throughput because PUT requests stay in the queue for a long time. We are investigating. An upgrade of dCache is scheduled on March 20th which will fix the bottleneck on our PNFS server. I am closing the ticket, but feel free to reopen it after March 20th if you still have the problem
19008 - problem on CERN SRM
SOLVED
site:CERN-PROD
vo:atlas
1. Failed on SRM get: Failed SRM get on [address] no TURL retrieved for [address]
2. Failed on SRM get: SRM getRequestStatus timed out on get (26.02.2007 from 4 p.m until 5.30 p.m)
19021 - grid-ftp problem on CERN-GRIDKA channel
SOLVED (moved to 28.02.2007)
FINAL:ABORTED: Operation was aborted (the gridFTP transfer timed out)
Reason: problems with gridftp-doors
26 February 2007
18963 - CERN-PIC SRM's problems (moved to 28.02.07)
18958 - STAR-CERN Castor's problems:
sites: *
vo: cms
FINAL:TRANSFER: Getting filesize failed. an end-of-file was reached
18955 - CERN-BNL Castor's problems: (moved to 28.02.2007)
18963 - CERN-INFN SRM's problems:
SOLVED
site:INFN-T1
vo:atlas, cms, lhcb
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
Reason: it was a problem on the dteam disk servers. Now fixed.
18962 - CERN-RAL SRM's problems:
SOLVED
site:RAL-LCG2
vo:atlas, cms, lhcb
FINAL:SRM_DEST: Failed on SRM put: Failed SRM put on [addres] call. Error is RequestFileStatus#-[id] failed with error:[ can not prepare to put : org.dcache.srm.scheduler.FatalJobFailure: transfer protocols not supported] ; also failing to do 'advisoryDelete' on target.
Failed SRM put on [address]; Error is CastorStagerInterface.c:2457 Device or resource busy
FINAL:SRM_DONE_DEST: failing to do 'setDone' on target SRM
Reason: The first error was the result of the gridftp doors on the disk dCache SE being overloaded by transfers piling up due to a VO filling up its space. The doors have been restarted and more space has been allocated to the VO.
The second error is from Castor and has been passed on to the Castor team at RAL.
There was a problem on ralsrma yesterday due to a logfile overfilling. It was restarted around 4 pm.
23 February 2007
18901 - CERN-RAL Castor's problems:
UNSOLVED
FTS Channel CERN-RAL Failed SRM put on [address]; Error is CastorStagerInterface.c:2457 Device or resource busy
FINAL:TRANSFER: Destination and source file sizes don't match!
Reason: A potential problem has been identified and the CASTOR and FTS developers are aware of it
18904 - Channel CERN-BNL SRM's problems
SOLVED
FINAL:SRM_DEST: Failed on SRM put: SRM getRequestStatus timed out on put
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.
FINAL:SRM_DEST: Failed on SRM put: Cannot Contact SRM Service. Error in srm__ping: SOAP-ENV:Client - CGSI-gSOAP: Could not open connection !; also failing to do 'advisoryDelete' on target.
FINAL:SRM_DEST: Failed on SRM put: Failed To Put SURL. Error in srm__put: SOAP-ENV:Client - CGSI-gSOAP: Error reading token data: Success; also failing to do 'advisoryDelete' on target.
Reason: This site is currently down for scheduled maintenance: Worker node upgrade SL3 -> SL4, subsequent testing.
(20th February 2007 - 18:00 -> 23rd February 2007 - 17:00)
Maintainer:
GavinMcCance