Preamble
NOTE: This page, while containing some useful information, is very out of date. Please see
https://opensciencegrid.org/networking/
for more current information about deploying, installing and maintaining perfSONAR for WLCG/LHCOPN/LHCONE.
During the combined LHCOPN/LHCONE meeting held in Washington on 13 & 14 June 2011, questions were raised about why there were two, albeit supposedly compatible, versions of perfSONAR (perfSONAR-MDM and perfSONAR-PS). US-ATLAS gave very positive feedback on their experience with perfSONAR-PS, and Jason Zurawski (then of Internet2, now of ESnet) gave a presentation which included the US-ATLAS use-case (link).
The distributed nature of the perfSONAR-PS solution was deemed attractive, as was the ease of installation (ISO-CD) and the work that US-ATLAS has done on visualisation (including alarm generation using Nagios). Following a strong request from ATLAS, it was decided that perfSONAR-PS would be installed in all the LHCOPN Tier0 and Tier1 sites. The Tier1s present were all in favour of the idea, and John Shade agreed to coordinate and drive the roll-out, with Jason Zurawski providing the technical expertise!
Installed Servers
| Tier | Type | Hostname | IP address | Comments |
| RAL | Latency: | lcgps01.gridpp.rl.ac.uk | 130.246.176.109 | |
| | Bandwidth: | lcgps02.gridpp.rl.ac.uk | 130.246.176.110 | |
| CC-IN2P3 | Latency: | ccperfsonar2.in2p3.fr | 193.48.99.76 | |
| | Bandwidth: | ccperfsonar1.in2p3.fr | 193.48.99.77 | |
| CERN | Latency: | perfsonar-lt.cern.ch | 128.142.223.247 | |
| | Bandwidth: | perfsonar-bw.cern.ch | 128.142.223.246 | |
| TRIUMF | Latency: | ps-latency.lhcmon.triumf.ca | 206.12.9.2 | |
| | Bandwidth: | ps-bandwidth.lhcmon.triumf.ca | 206.12.9.1 | |
| SARA | Latency: | perfsonar-latency.grid.surfsara.nl | 145.100.32.31 | |
| | Bandwidth: | perfsonar-bandwidth.grid.surfsara.nl | 145.100.32.32 | |
| ASGC | Latency: | lhc-latency.twgrid.org | 117.103.105.191 | |
| | Bandwidth: | lhc-bandwidth.twgrid.org | 117.103.105.187 | |
| BNL | Latency: | lhcperfmon.bnl.gov | 192.12.15.26 | |
| | Bandwidth: | lhcmon.bnl.gov | 192.12.15.23 | |
| CNAF | Latency: | perfsonar-ow.cnaf.infn.it | 131.154.254.12 | |
| | Bandwidth: | perfsonar-ps.cnaf.infn.it | 131.154.254.11 | |
| NDGF | Latency: | perfsonar-ps.ndgf.org | 109.105.124.86 | |
| | Bandwidth: | perfsonar-ps2.ndgf.org | 109.105.124.88 | |
| PIC | Latency: | picperfsonar-latency.pic.es | 193.109.172.242 | |
| | Bandwidth: | picperfsonar-bandwidth.pic.es | 193.109.172.250 | |
| FNAL | Latency: | psonar2.fnal.gov | 131.225.205.141 | |
| | Bandwidth: | psonar1.fnal.gov | 131.225.205.139 | |
| KIT | Latency: | perfsonar2-de-kit.gridka.de | 192.108.47.12 | |
| | Bandwidth: | perfsonar-de-kit.gridka.de | 192.108.47.6 | |
| KISTI-GSDC | Latency: | ps-gsdc01.sdfarm.kr | 134.75.125.241 | |
| | Bandwidth: | ps-gsdc02.sdfarm.kr | 134.75.125.242 | |
Dashboard
Tom Wlodek's original Experimental Independent perfSONAR Dashboard has been replaced by MaDDash, developed by ESnet. It includes a number of meshes in addition to the LHCOPN mesh; these can be accessed by clicking on "Dashboard" near the top of the page. More information on MaDDash is available here.
As before, each site should ensure that its cells are coloured green!
Meshes
After a fairly long period of manual installations and configurations, WLCG put considerable effort into developing automated packages for perfSONAR installations, and a number of other Twiki pages have sprung up. The most pertinent page is without doubt the WLCG perfSONAR Deployment Twiki created by Shawn McKee.
Up-to-date instructions for installing the latest version of perfSONAR and enabling mesh configurations for testing between WLCG sites can be found on this Twiki page.
Toolkit Page
The perfSONAR Performance Toolkit (pS-Performance Toolkit) is a customized version of a Knoppix Live-CD bootable disk. The software consists of an ISO disk image which provides a Linux distribution based on CentOS 5.5, and the perfSONAR software. The toolkit can be installed to disk (netinstall) or burned to CD (LiveCD). Both versions are identical in terms of functionality, but netinstall has the added benefit of picking up new packages with 'yum update' as they are released. The main toolkit page is here:
http://psps.perfsonar.net/toolkit/
The latest version of the toolkit is 3.3. Netinstall users can do a 'yum update' and then reboot.
LiveCD users will need to download a new ISO image from here.
The RPMs are located here.
There are instructions on the main page for enabling the yum repo (automated package management):
http://software.internet2.edu
Instruction Manual
This contains EVERYTHING and may be a little dry, but is the perfSONAR-PS reference. A very useful FAQ is available here.
Firewall ports
In order for perfSONAR to be really useful for ad-hoc network troubleshooting, a number of firewall ports should be opened. The detailed list is given here.
Beware, however, that a yum update of the toolkit (perl-perfSONAR_PS-Toolkit-SystemEnvironment.noarch) will silently overwrite your node's firewall configuration in /etc/sysconfig/iptables! The toolkit developers have promised to try and improve matters in the future, but be sure to always read the release notes!
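One way to guard against this is to snapshot the rules file before updating and compare it afterwards. The sketch below is only an illustration in Python and is not part of the toolkit: the helper names and the backup location are assumptions, and only the path /etc/sysconfig/iptables comes from the text above.

```python
import difflib
import shutil
from pathlib import Path
from typing import List

# Path named in the text above; everything else here is a hypothetical helper.
IPTABLES_RULES = Path("/etc/sysconfig/iptables")


def backup_rules(rules: Path, backup: Path) -> Path:
    """Copy the current rules file aside before running 'yum update'."""
    shutil.copy2(rules, backup)
    return backup


def rules_diff(backup: Path, rules: Path) -> List[str]:
    """Return a unified diff between the saved copy and the current rules.

    An empty list means the update left the file untouched.
    """
    old = backup.read_text().splitlines()
    new = rules.read_text().splitlines()
    return list(
        difflib.unified_diff(old, new, str(backup), str(rules), lineterm="")
    )
```

A possible workflow would be to call backup_rules(IPTABLES_RULES, Path("/root/iptables.pre-update")) before the update (the backup path is made up) and print the result of rules_diff(...) afterwards. Any output means the site's rules need to be restored or merged by hand.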
Top-Tips
It is recommended to run throughput (BWCTL) and latency (OWAMP) tests on separate servers. This is because the throughput tests (iperf) use a lot of CPU, memory, and all of the available resources on a network card, thus negatively impacting the smaller packet streams of the latency tests. Choosing which component to install (Latency or Bandwidth) is a simple 'selectable' option as you configure the host. For servers running the bandwidth tests, we recommend a 10 Gbps NIC. Note that it is not recommended to run 10G bandwidth tests through a firewall.
NTP Servers
Which NTP server to use is site-specific. There are public NTP servers available, but it's better to synchronize with servers in close proximity that have the lowest possible stratum number (stratum 0 = the reference clock itself, e.g. a GPS receiver; stratum 1 = an NTP server attached directly to a stratum 0 source; stratum 2 = a server synchronized to a stratum 1 server, etc.).
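The selection rule above can be sketched as a simple ranking. This is a minimal illustration rather than a recommended tool: the host names and measurements are invented, and preferring stratum over delay is an assumption, since the text does not say how to weigh the two.

```python
from typing import List, NamedTuple


class NtpCandidate(NamedTuple):
    host: str        # server name (hypothetical examples below)
    stratum: int     # lower means closer to the reference clock
    delay_ms: float  # measured round-trip delay, used as a proxy for proximity


def rank_candidates(candidates: List[NtpCandidate]) -> List[NtpCandidate]:
    """Order servers by lowest stratum first, then by lowest delay."""
    return sorted(candidates, key=lambda c: (c.stratum, c.delay_ms))


# Made-up measurements, for illustration only.
example = [
    NtpCandidate("ntp-far.example.org", 1, 85.0),
    NtpCandidate("ntp-site.example.org", 2, 0.4),
    NtpCandidate("ntp-regional.example.org", 1, 6.2),
]
best = rank_candidates(example)[0]
```

A site that values proximity more than stratum could swap the two elements of the sort key. In practice the stratum and delay values would come from whatever NTP client the site already runs.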
Supported Platforms
Hardware
Simple hardware works best! For example, Internet2 uses 1U Dell servers that are about 5 years old for their testing. The recommendations are:
- Compatible with 32-bit or 64-bit Intel x86 architecture
- CPU Speed Requirement of 2.6 GHz for single core machines
- CPU Speed Requirement of 1.8 GHz for multiple core machines
- Main Memory Requirement of 2 Gigabytes
- Storage Requirement of at least 250 Gigabytes
- RAID level 0+1, or no RAID if possible
- Network Card Speed Requirement of 1Gbps, as a "daughter" card (avoid on-motherboard NIC for performance testing)
- Network cards to support multi-homing (as many as required)
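As a rough aid, the numeric recommendations in the list above can be turned into a checklist. The sketch below is illustrative only: the HostSpec fields and the function name are assumptions, the thresholds are the ones listed above, and the RAID and multi-homing items are left out because they are not simple numeric checks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostSpec:
    cores: int
    cpu_ghz: float
    ram_gb: float
    disk_gb: float
    nic_gbps: float
    onboard_nic: bool  # True if the test interface is on the motherboard


def unmet_recommendations(spec: HostSpec) -> List[str]:
    """Return the recommendations from the list above that the host misses."""
    problems = []
    # 2.6 GHz for single-core machines, 1.8 GHz for multi-core ones.
    min_ghz = 2.6 if spec.cores == 1 else 1.8
    if spec.cpu_ghz < min_ghz:
        problems.append(f"CPU below {min_ghz} GHz")
    if spec.ram_gb < 2:
        problems.append("less than 2 GB of main memory")
    if spec.disk_gb < 250:
        problems.append("less than 250 GB of storage")
    if spec.nic_gbps < 1:
        problems.append("NIC slower than 1 Gbps")
    if spec.onboard_nic:
        problems.append("on-motherboard NIC used for testing")
    return problems
```

For example, unmet_recommendations(HostSpec(4, 2.0, 4, 500, 10, False)) returns an empty list, meaning none of the checked recommendations is missed.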
Software and security
The perfSONAR-PS project currently only supports the CentOS 5 platform for x86 architecture. However, it may be possible to install and use these RPMs on other RHEL binary compatible systems like SL. If you feel like trying, please let us know how you get on!
For instructions on how to enable the yum repo:
http://software.internet2.edu/
In order to configure your servers such that only users in possession of a valid grid certificate can access the web pages, please see Virginie Longo's security guidelines (document attached to this page).
Communities
Although the importance of community strings is yet to be determined, we recommend configuring the following for starters:
LHCOPN,
LHCTier1,
ATLAS,
CMS,
ALICE,
LHCb,
LHCONE.
Links to things of interest
Site Contacts
CERN e-group mailing list:
LHCOPN-perfSONAR-PS-Site-Contacts@cernNOSPAMPLEASE.ch (includes the above names, plus Jason Zurawski and John Shade).
--
JohnShade - 20-Jun-2011