Difference: XrootdProgress20110427 (1 vs. 2)

Revision 2 2011-04-27 - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="CmsXrootdArchitecture"

Progress Report for 27 April 2011

Line: 18 to 18
 
  • No progress on Nagios alerts or RSV probes.

AOB

Added:
>
>
  • RSV probe: This is a simple task for the new version of RSV. FKW is going to get MT in contact with Terrence and see if there's a match. BB suggested that another route is to have an undergrad take a crack at this; it's a simple task, but not necessarily urgent.
  • BB followed up with Oli to verify AOD files are pinned at FNAL.
  • Action item for BB: Verify that fallback is enabled at all T2 sites.

Revision 1 2011-04-27 - BrianBockelman

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="CmsXrootdArchitecture"

Progress Report for 27 April 2011

Infrastructure Status

  • Most sites stable. Was happy to see that Xrootd Nagios monitoring did 'see' the UCSD network outage last week.
  • Some Wisconsin servers have been going on the blink; I think this is because of the switch to HDFS. Things look fine from the redirector though. Working with Will.
  • Florida has joined ITB. They aren't yet passing all the Nagios tests.
  • No contact from MIT yet. They appear to be waiting for Wei to return from vacation?

Action Item Status

  • Monitoring: Basic monitoring has looked good. I can definitely see the UCSD group's usage. There's one report where only UCSD shows up: http://xrootd.t2.ucsd.edu/display?page=xrd_report/link_connections_by_site . How do we get the other sites there?
  • Abuse detection: I want a plan B. Matevz, if you had access to the redirector logfiles, could you parse this out and build something useful? Basically, I want to look for users/directories that are "too hot".
  • Random file access: Rolled out the Nagios probe to most sites. Everyone is looking good except FNAL. Not sure if FNAL should look good: have they pinned all AOD files yet?
  • No progress on HC.
  • Working on some code improvements requested by Igor to cache GUMS responses. Addressing another bug in HDFS for "dirlist".
  • No updates lately from gWMS point of view. Should we ask them to attend this meeting regularly?
  • No progress on Nagios alerts or RSV probes.
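
For the abuse-detection "plan B" above, a minimal sketch of the counting step might look like the following. It assumes the redirector logfiles have already been parsed into (user, path) pairs; the actual log format is not described here, so the parsing is left out, and the function name, the `depth` cutoff, and the `threshold` value are all hypothetical choices rather than anything agreed on.

```python
from collections import Counter
from pathlib import PurePosixPath


def hot_entries(records, depth=3, threshold=100):
    """Count accesses per user and per directory prefix.

    records   -- iterable of (user, path) pairs; assumed to be already
                 extracted from the redirector logs (format not specified)
    depth     -- number of leading path components that define a "directory"
    threshold -- hypothetical cutoff for calling something "too hot"
    """
    by_user = Counter()
    by_dir = Counter()
    for user, path in records:
        by_user[user] += 1
        # Use the file's parent directory, truncated to `depth` components.
        # parts[0] is the leading "/", hence depth + 1.
        parts = PurePosixPath(path).parent.parts[:depth + 1]
        by_dir[str(PurePosixPath(*parts))] += 1
    hot_users = {u: n for u, n in by_user.items() if n >= threshold}
    hot_dirs = {d: n for d, n in by_dir.items() if n >= threshold}
    return hot_users, hot_dirs
```

A fixed threshold is the simplest thing that could work; a rate per time window would probably be more useful once real log timestamps are available.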

AOB

 